Similar Articles

Articles similar to the selected content.

Domain: news.ycombinator.com Added: 2025-07-12 Status: ✓ Success
news,tech,hackernews,news.ycombinator.com
This article doesn't talk much about testing or getting training data. It seems like that part is key. For code that you think you understand, it's because you've informally proven to yourself that it...
Similar Articles (10 found)
74.3% similar
The Principles of Deep Learning Theory (arxiv.org)
https://news.ycombinator.com/item?id=31051540
First, thanks to the publisher and authors for making this freely available! I retired recently after using neural networks since the 1980s. I still s...
68.0% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
https://www.gilesthomas.com/2025/10/llm-from-scratch-22-finally-training-our-llm
Writing an LLM from scratch, part 22 -- finally training our LLM! This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
66.2% similar
A Brief History of GPT Through Papers
https://towardsdatascience.com/a-brief-history-of-gpt-through-papers/
0) Prologue: The Turing test In October 1950, Alan Turing proposed a test. Was it possible to have a conversation with a machine and not be able to te...
64.6% similar
Deep Neural Nets: 33 years ago and 33 years from now
http://karpathy.github.io/2022/03/14/lecun1989/
Deep Neural Nets: 33 years ago and 33 years from now The Yann LeCun et al. (1989) paper Backpropagation Applied to Handwritten Zip Code Recognition is...
64.2% similar
Extract-0: A specialized language model for document information extraction
https://news.ycombinator.com/item?id=45427634
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set. This model is trained on a custom dataset of 2...
63.9% similar
TimesFM: Time Series Foundation Model for time-series forecasting (github.com/google-research)
https://news.ycombinator.com/item?id=40297946
I'm curious why we seem convinced that this is a task that is possible or something worthy of investigation. I've worked on language models since 2018...
63.2% similar
Techniques for training large neural networks
https://openai.com/index/techniques-for-training-large-neural-networks/
Techniques for training large neural networks Large neural networks are at the core of many recent advances in AI, but training them is a difficult en...
63.2% similar
A Recipe for Training Neural Networks
http://karpathy.github.io/2019/04/25/recipe/
A Recipe for Training Neural Networks Some few weeks ago I posted a tweet on “the most common neural net mistakes”, listing a few common gotchas relat...
62.4% similar
Yes you should understand backprop
https://karpathy.medium.com/yes-you-should-understand-backprop-e2f06eab496b?source=rss-ac9d9a35533e------2
Yes you should understand backprop When we offered CS231n (Deep Learning class) at Stanford, we intentionally designed the programming assignments to ...
62.4% similar
Yes you should understand backprop
https://karpathy.medium.com/yes-you-should-understand-backprop-e2f06eab496b
Yes you should understand backprop When we offered CS231n (Deep Learning class) at Stanford, we intentionally designed the programming assignments to ...