Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult
24th November 2025
Anthropic released Claude Opus 4.5 this morning, which they call “best model in the world for coding, agents, ...
Similar Articles (10 found)
80.7% similar
GPT-5.2
11th December 2025
OpenAI reportedly declared a “code red” on the 1st of December in response to increasingly credible competition from the li...
79.4% similar
GPT-5: Key characteristics, pricing and model card
7th August 2025
I’ve had preview access to the new GPT-5 model family for the past two weeks (see r...
77.0% similar
Things we learned about LLMs in 2024
31st December 2024
A lot has happened in the world of Large Language Models over the course of 2024. Here’s a rev...
77.0% similar
Each month, this newsletter is read by over 45K operators, investors, and tech / product leaders and executives. If you found value in this newslette...
76.0% similar
Olmo 3 is a fully open LLM
22nd November 2025
Olmo is the LLM series from Ai2, the Allen Institute for AI. Unlike most open weight models these are not...
73.5% similar
What happens when coding agents stop feeling like dialup?
It's funny how quickly humans adjust to new technology. Only a few months ago Claude Code an...
73.4% similar
Nano Banana Pro aka gemini-3-pro-image-preview is the best available image generation model
20th November 2025
Hot on the heels of Tuesday’s Gemini 3 ...
72.8% similar
> the generation of 281,128 augmented examples, from which 1,000 were
held out as a benchmark test set.
This model is trained on a custom dataset of 2...
71.9% similar
Same AI, Different Answer: How Tiny Prompts Can Change Everything
Why Does ChatGPT Sometimes Feel Different?
If you’ve used AI chatbots like ChatGPT f...
71.5% similar
I ported JustHTML from Python to JavaScript with Codex CLI and GPT-5.2 in 4.5 hours
15th December 2025
I wrote about JustHTML yesterday, Emil Stenström...