Similar Articles

Writing an LLM from scratch, part 22 -- finally training our LLM!

https://www.gilesthomas.com/2025/10/llm-from-scratch-22-finally-training-our-llm

Domain: www.gilesthomas.com Added: 2025-11-08

www.gilesthomas.com

Writing an LLM from scratch, part 22 -- finally training our LLM! This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large Language Model (from Scratch)". Understanding cros...

Similar Articles (10 found)

🔍 77.3% similar

MicroGPT explained interactively

https://growingswe.com/blog/microgpt

growingswe.com 2026-03-01

growingswe.com

MicroGPT explained interactively Andrej Karpathy wrote a 200-line Python script that trains and runs a GPT from scratch, with no libraries or dependen...

https://news.ycombinator.com/item?id=45427634

news.ycombinator.com 2025-10-11

news.ycombinator.com

> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set. This model is trained on a custom dataset of 2...

https://www.gilesthomas.com/2025/12/llm-from-scratch-28-training-a-base-model-from-scratch

www.gilesthomas.com 2025-12-14

www.gilesthomas.com

Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090 Having worked through the main body of Sebastian Raschka's b...

https://news.ycombinator.com/item?id=37484135

news.ycombinator.com 2025-07-13

open-source,gpt-4,tech,hackernews,llms,news,news.ycombinator.com,machine learning

There has been a lot of interest on HN in fine-tuning open-source LLMs recently (eg. Anyscale's post at https://news.ycombinator.com/item?id=37090632)...

https://news.ycombinator.com/item?id=31051540

news.ycombinator.com 2025-07-13

hackernews,tech,news,news.ycombinator.com

First, thanks to the publisher and authors for making this freely available! I retired recently after using neural networks since the 1980s. I still s...

https://simonwillison.net/2025/Aug/7/gpt-5/

simonwillison.net 2025-08-13

simonwillison.net

GPT-5: Key characteristics, pricing and model card 7th August 2025 I’ve had preview access to the new GPT-5 model family for the past two weeks (see r...

🔍 68.8% similar

Things we learned about LLMs in 2024

https://simonwillison.net/2024/Dec/31/llms-in-2024/

simonwillison.net 2025-07-12

simonwillison.net

Things we learned about LLMs in 2024 31st December 2024 A lot has happened in the world of Large Language Models over the course of 2024. Here’s a rev...

https://news.ycombinator.com/item?id=40297946

news.ycombinator.com 2025-07-13

news.ycombinator.com,hackernews,tech,news

I'm curious why we seem convinced that this is a task that is possible or something worthy of investigation. I've worked on language models since 2018...

https://news.ycombinator.com/item?id=40845304

news.ycombinator.com 2025-07-12

news,tech,hackernews,news.ycombinator.com

This article doesn't talk much about testing or getting training data. It seems like that part is key. For code that you think you understand, it's be...

🔍 67.2% similar

The Bitter Lesson is Misunderstood

https://obviouslywrong.substack.com/p/the-bitter-lesson-is-misunderstood

obviouslywrong.substack.com 2025-09-04

obviouslywrong.substack.com

The Bitter Lesson is Misunderstood Together, the Bitter Lesson and Scaling Laws reveal that the god of Compute we worship is yoked to an even greater ...
