Dynamic Programming in Reinforcement Learning
Our First Approach to Solving Reinforcement Learning Problems!
If you're not familiar with the Bellman equations, make sure to check this first: Why Is th...