Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090
Having worked through the main body of Sebastian Raschka's book "Build a Large Language Model (from Scratch)",...
Similar Articles (10 found)
73.3% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
64.1% similar
One announcement that caught my eye in particular occurred at the end of July, when Google released a new text processing and data extraction tool cal...
63.5% similar
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set.
This model is trained on a custom dataset of 2...
61.7% similar
How We Cut Inference Costs from $46K to $7.5K Fine-Tuning Qwen-Image-Edit
Running quality inference at scale is something we think about a lot at Oxen...
61.3% similar
Building a web search engine from scratch in two months with 3 billion neural embeddings
A while back, I decided to undertake a project to challenge m...
61.1% similar
There has been a lot of interest on HN in fine-tuning open-source LLMs recently (e.g. Anyscale's post at https://news.ycombinator.com/item?id=37090632)...
59.6% similar
Things we learned about LLMs in 2024
31st December 2024
A lot has happened in the world of Large Language Models over the course of 2024. Here's a rev...
59.1% similar
GPT-5: Key characteristics, pricing and model card
7th August 2025
I've had preview access to the new GPT-5 model family for the past two weeks (see r...
59.1% similar
Deep Neural Nets: 33 years ago and 33 years from now
The Yann LeCun et al. (1989) paper Backpropagation Applied to Handwritten Zip Code Recognition is...
58.3% similar
Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult
24th November 2025
Anthropic released Claude Opus 4.5 this morning, which they ...