Deep Reinforcement Learning: Pong from Pixels
This is a long overdue blog post on Reinforcement Learning (RL). RL is hot! You may have noticed that computers can now automatically learn to play ATARI ...
Similar Articles (10 found)
70.8% similar
Dynamic Programming in Reinforcement Learning
Our First Approach to Solving Reinforcement Learning Problems!
If you're not familiar with the Bellman e...
66.2% similar
Understanding rewards by teaching a robot to navigate a maze
One of the biggest barriers to traditional machine learning is that most supervised and u...
62.7% similar
Fine Tuning SmolVLM for Human Alignment Using Direct Preference Optimization
Preference optimization shines when we want models to m...
62.7% similar
AlphaGo, in context
Update Oct 18, 2017: AlphaGo Zero was announced. This post refers to the previous version. 95% of it still applies.
I had a chance...
61.8% similar
First, thanks to the publisher and authors for making this freely available!
I retired recently after using neural networks since the 1980s. I still s...
59.8% similar
This article doesn't talk much about testing or getting training data. It seems like that part is key.
For code that you think you understand, it's be...
58.9% similar
Yes you should understand backprop
When we offered CS231n (Deep Learning class) at Stanford, we intentionally designed the programming assignments to ...
58.8% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
58.5% similar
I'm curious why we seem convinced that this is a task that is possible or something worthy of investigation.
I've worked on language models since 2018...