Similar Articles

Language Modeling with Limited Data, Infinite Compute

https://qlabs.sh/slowrun

Domain: qlabs.sh Added: 2026-03-05 Status: ✓ Success

qlabs.sh

Language Modeling with Limited Data, Infinite Compute March 2026 NanoGPT Slowrun is an open effort to implement data-efficient learning algorithms; 5.5x data efficiency in the first week and improving...

Similar Articles (10 found)

openai.com 2025-07-13

openai.com

Techniques for training large neural networks Large neural networks are at the core of many recent advances in AI, but training them is a difficult en...

https://news.ycombinator.com/item?id=45427634

news.ycombinator.com 2025-10-11

news.ycombinator.com

> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set. This model is trained on a custom dataset of 2...

https://www.gilesthomas.com/2025/10/llm-from-scratch-22-finally-training-our-llm

www.gilesthomas.com 2025-11-08

www.gilesthomas.com

Writing an LLM from scratch, part 22 -- finally training our LLM! This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...

https://news.ycombinator.com/item?id=37484135

news.ycombinator.com 2025-07-13

open-source,gpt-4,tech,hackernews,llms,news,news.ycombinator.com,machine learning

There has been a lot of interest on HN in fine-tuning open-source LLMs recently (eg. Anyscale's post at https://news.ycombinator.com/item?id=37090632)...

https://www.baseten.co/blog/sota-performance-for-gpt-oss-120b-on-nvidia-gpus/

www.baseten.co 2025-08-06

www.baseten.co,model performance optimization,bug fixing,nvidia gpus,experimentation,benchmarking

Day zero model performance optimization work is a mix of experimentation, bug fixing, and benchmarking guided by intuition and experience. This writeu...

🔍 65.2% similar

MicroGPT explained interactively

https://growingswe.com/blog/microgpt

growingswe.com 2026-03-01

growingswe.com

MicroGPT explained interactively Andrej Karpathy wrote a 200-line Python script that trains and runs a GPT from scratch, with no libraries or dependen...

https://news.ycombinator.com/item?id=44840728

news.ycombinator.com 2025-08-13

news.ycombinator.com

Sam said yesterday that chatgpt handles ~700M weekly users. Meanwhile, I can't even run a single GPT-4-class model locally without insane VRAM or pain...

🔍 63.6% similar

Things we learned about LLMs in 2024

https://simonwillison.net/2024/Dec/31/llms-in-2024/

simonwillison.net 2025-07-12

simonwillison.net

Things we learned about LLMs in 2024 31st December 2024 A lot has happened in the world of Large Language Models over the course of 2024. Here’s a rev...

🔍 63.3% similar

The Bitter Lesson is Misunderstood

https://obviouslywrong.substack.com/p/the-bitter-lesson-is-misunderstood

obviouslywrong.substack.com 2025-09-04

obviouslywrong.substack.com

The Bitter Lesson is Misunderstood Together, the Bitter Lesson and Scaling Laws reveal that the god of Compute we worship is yoked to an even greater ...

https://neutree.ai/blog/nano-vllm-part-1

neutree.ai 2026-02-03

neutree.ai

Understanding LLM Inference Engines: Inside Nano-vLLM (Part 1) Architecture, Scheduling, and the Path from Prompt to Token When deploying large langua...
