Similar Articles

Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult

https://simonwillison.net/2025/Nov/24/claude-opus/#atom-entries

Domain: simonwillison.net Added: 2025-12-18 Status: ✓ Success

Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult 24th November 2025 Anthropic released Claude Opus 4.5 this morning, which they call “best model in the world for coding, agents, ...

Similar Articles (10 found)

🔍 80.7% similar

GPT-5.2

https://simonwillison.net/2025/Dec/11/gpt-52/#atom-entries

simonwillison.net 2025-12-18

GPT-5.2 11th December 2025 OpenAI reportedly declared a “code red” on the 1st of December in response to increasingly credible competition from the li...

https://simonwillison.net/2025/Aug/7/gpt-5/

simonwillison.net 2025-08-13

GPT-5: Key characteristics, pricing and model card 7th August 2025 I’ve had preview access to the new GPT-5 model family for the past two weeks (see r...

🔍 77.0% similar

Things we learned about LLMs in 2024

https://simonwillison.net/2024/Dec/31/llms-in-2024/

simonwillison.net 2025-07-12

Things we learned about LLMs in 2024 31st December 2024 A lot has happened in the world of Large Language Models over the course of 2024. Here’s a rev...

🔍 77.0% similar

GPT-5: Strategic Implications

https://nextword.substack.com/p/gpt-5-strategic-implications

nextword.substack.com 2025-08-28

Each month, this newsletter is read by over 45K+ operators, investors, and tech / product leaders and executives. If you found value in this newslette...

🔍 76.0% similar

Olmo 3 is a fully open LLM

https://simonwillison.net/2025/Nov/22/olmo-3/#atom-entries

simonwillison.net 2025-12-18

Olmo 3 is a fully open LLM 22nd November 2025 Olmo is the LLM series from Ai2—the Allen institute for AI. Unlike most open weight models these are not...

https://martinalderson.com/posts/what-happens-when-coding-agents-stop-feeling-like-dialup/

martinalderson.com 2025-10-11

What happens when coding agents stop feeling like dialup? It's funny how quickly humans adjust to new technology. Only a few months ago Claude Code an...

https://simonwillison.net/2025/Nov/20/nano-banana-pro/#atom-entries

simonwillison.net 2025-12-18

Nano Banana Pro aka gemini-3-pro-image-preview is the best available image generation model 20th November 2025 Hot on the heels of Tuesday’s Gemini 3 ...

https://news.ycombinator.com/item?id=45427634

news.ycombinator.com 2025-10-11

> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set. This model is trained on a custom dataset of 2...

https://lightcapai.medium.com/same-ai-different-answer-how-tiny-prompts-can-change-everything-83e880f9773f

lightcapai.medium.com 2025-08-13

lightcapai.medium.com blog article +1

Same AI, Different Answer: How Tiny Prompts Can Change Everything Why Does ChatGPT Sometimes Feel Different? If you’ve used AI chatbots like ChatGPT f...

https://www.ben-evans.com/benedictevans/2026/2/19/how-will-openai-compete-nkg2x

www.ben-evans.com 2026-03-05

How will OpenAI compete? “Jakub and Mark set the research direction for the long run. Then after months of work, something incredible emerges and I ge...
