Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult
24th November 2025
Anthropic released Claude Opus 4.5 this morning, which they call “best model in the world for coding, agents, ...