Similar Articles

Articles similar to the selected content.

Domain: qlabs.sh | Added: 2026-03-05 | Status: βœ“ Success
Language Modeling with Limited Data, Infinite Compute (March 2026)
NanoGPT Slowrun is an open effort to implement data-efficient learning algorithms; 5.5x data efficiency in the first week and improving...
Similar Articles (10 found)
πŸ” 68.7% similar
Techniques for training large neural networks
https://openai.com/index/techniques-for-training-large-neural-networks/
Techniques for training large neural networks Large neural networks are at the core of many recent advances in AI, but training them is a difficult en...
πŸ” 67.3% similar
Extract-0: A specialized language model for document information extraction
https://news.ycombinator.com/item?id=45427634
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set. This model is trained on a custom dataset of 2...
πŸ” View Similar Articles
πŸ” 67.0% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
https://www.gilesthomas.com/2025/10/llm-from-scratch-22-finally-training-our-llm
Writing an LLM from scratch, part 22 -- finally training our LLM! This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
πŸ” View Similar Articles 🟠 HN
πŸ” 66.8% similar
Fine-tune your own Llama 2 to replace GPT-3.5/4
https://news.ycombinator.com/item?id=37484135
There has been a lot of interest on HN in fine-tuning open-source LLMs recently (eg. Anyscale's post at https://news.ycombinator.com/item?id=37090632)...
πŸ” View Similar Articles
πŸ” 66.8% similar
How we run GPT OSS 120B at 500+ tokens per second on NVIDIA GPUs | Baseten Blog
https://www.baseten.co/blog/sota-performance-for-gpt-oss-120b-on-nvidia-gpus/
Day zero model performance optimization work is a mix of experimentation, bug fixing, and benchmarking guided by intuition and experience. This writeu...
πŸ” View Similar Articles 🟠 HN
πŸ” 65.2% similar
MicroGPT explained interactively
https://growingswe.com/blog/microgpt
MicroGPT explained interactively Andrej Karpathy wrote a 200-line Python script that trains and runs a GPT from scratch, with no libraries or dependen...
πŸ” View Similar Articles 🟠 HN
πŸ” 64.8% similar
Ask HN: How can ChatGPT serve 700M users when I can't run one GPT-4 locally?
https://news.ycombinator.com/item?id=44840728
Sam said yesterday that chatgpt handles ~700M weekly users. Meanwhile, I can't even run a single GPT-4-class model locally without insane VRAM or pain...
πŸ” View Similar Articles
πŸ” 63.6% similar
Things we learned about LLMs in 2024
https://simonwillison.net/2024/Dec/31/llms-in-2024/
Things we learned about LLMs in 2024 31st December 2024 A lot has happened in the world of Large Language Models over the course of 2024. Here’s a rev...
πŸ” View Similar Articles 🟠 HN
πŸ” 63.3% similar
The Bitter Lesson is Misunderstood
https://obviouslywrong.substack.com/p/the-bitter-lesson-is-misunderstood
The Bitter Lesson is Misunderstood Together, the Bitter Lesson and Scaling Laws reveal that the god of Compute we worship is yoked to an even greater ...
πŸ” View Similar Articles 🟠 HN
πŸ” 63.2% similar
Understanding LLM Inference Engines: Inside Nano-vLLM (Part 1) - Neutree Blog
https://neutree.ai/blog/nano-vllm-part-1
Understanding LLM Inference Engines: Inside Nano-vLLM (Part 1) Architecture, Scheduling, and the Path from Prompt to Token When deploying large langua...
πŸ” View Similar Articles 🟠 HN