Similar Articles

Articles similar to the selected content.

Domain: darkcoding.net Added: 2025-09-01 Status: ✓ Success
darkcoding.net
Evaluating LLMs for my personal use case Summary It’s great that AI can win maths Olympiads, but that’s not what I’m doing. I mostly ask basic Rust, Python, Linux and life questions. So I did my own e...
Similar Articles (10 found)
🔍 69.8% similar
Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult
https://simonwillison.net/2025/Nov/24/claude-opus/#atom-entries
Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult 24th November 2025 Anthropic released Claude Opus 4.5 this morning, which they ...
🔍 69.7% similar
Things we learned about LLMs in 2024
https://simonwillison.net/2024/Dec/31/llms-in-2024/
Things we learned about LLMs in 2024 31st December 2024 A lot has happened in the world of Large Language Models over the course of 2024. Here’s a rev...
🔍 68.5% similar
GPT-5: Key characteristics, pricing and model card
https://simonwillison.net/2025/Aug/7/gpt-5/
GPT-5: Key characteristics, pricing and model card 7th August 2025 I’ve had preview access to the new GPT-5 model family for the past two weeks (see r...
🔍 68.2% similar
2 Years of ML vs. 1 Month of Prompting
https://www.levs.fyi/blog/2-years-of-ml-vs-1-month-of-prompting/
2 Years of ML vs. 1 Month of Prompting November 7, 2025 Recalls at major automakers cost hundreds of millions of dollars a year. It’s a huge issue. To...
🔍 66.7% similar
GPT-5.2
https://simonwillison.net/2025/Dec/11/gpt-52/#atom-entries
GPT-5.2 11th December 2025 OpenAI reportedly declared a “code red” on the 1st of December in response to increasingly credible competition from the li...
🔍 65.5% similar
Extract-0: A specialized language model for document information extraction
https://news.ycombinator.com/item?id=45427634
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set. This model is trained on a custom dataset of 2...
🔍 64.9% similar
So you wanna build a local RAG?
https://blog.yakkomajuri.com/blog/local-rag
When we launched Skald, we wanted it to not only be self-hostable, but also for one to be able to run it without sending any data to third-parties. Wi...
🔍 64.9% similar
Olmo 3 is a fully open LLM
https://simonwillison.net/2025/Nov/22/olmo-3/#atom-entries
Olmo 3 is a fully open LLM 22nd November 2025 Olmo is the LLM series from Ai2—the Allen institute for AI. Unlike most open weight models these are not...
🔍 64.5% similar
Same AI, Different Answer: How Tiny Prompts Can Change Everything
https://lightcapai.medium.com/same-ai-different-answer-how-tiny-prompts-can-change-everything-83e880f9773f
Same AI, Different Answer: How Tiny Prompts Can Change Everything Why Does ChatGPT Sometimes Feel Different? If you’ve used AI chatbots like ChatGPT f...
🔍 64.3% similar
Vibe Coding as a Coding Veteran
https://levelup.gitconnected.com/vibe-coding-as-a-coding-veteran-cd370fe2be50
Vibe Coding as a Coding Veteran From 8-bit Assembly to English-as-Code By now, we’ve all heard about this “vibe coding” thing: you let an AI assistant...