Evaluating LLMs for my personal use case
Summary
It’s great that AI can win maths Olympiads, but that’s not what I’m doing. I mostly ask basic Rust, Python, Linux and life questions. So I did my own e...
Similar Articles (10 found)
69.7% similar
Things we learned about LLMs in 2024
31st December 2024
A lot has happened in the world of Large Language Models over the course of 2024. Here’s a rev...
68.5% similar
GPT-5: Key characteristics, pricing and model card
7th August 2025
I’ve had preview access to the new GPT-5 model family for the past two weeks (see r...
68.2% similar
2 Years of ML vs. 1 Month of Prompting
November 7, 2025
Recalls at major automakers cost hundreds of millions of dollars a year. It’s a huge issue. To...
65.5% similar
> the generation of 281,128 augmented examples, from which 1,000 were
held out as a benchmark test set.
This model is trained on a custom dataset of 2...
64.9% similar
When we launched Skald, we wanted it to not only be self-hostable, but also for one to be able to run it without sending any data to third-parties.
Wi...
64.5% similar
Same AI, Different Answer: How Tiny Prompts Can Change Everything
Why Does ChatGPT Sometimes Feel Different?
If you’ve used AI chatbots like ChatGPT f...
64.3% similar
Vibe Coding as a Coding Veteran
From 8-bit Assembly to English-as-Code
By now, we’ve all heard about this “vibe coding” thing: you let an AI assistant...
63.9% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
61.6% similar
How We Cut Inference Costs from $46K to $7.5K Fine-Tuning Qwen-Image-Edit
Running quality inference at scale is something we think about a lot at Oxen...
60.7% similar
Each month, this newsletter is read by over 45K+ operators, investors, and tech / product leaders and executives. If you found value in this newslette...