Table of Contents
Fine Tuning SmolVLM for Human Alignment Using Direct Preference Optimization
Preference optimization shines when we want models to make choices that feel naturally human, not just sy...
Similar Articles (10 found)
62.7% similar
Deep Reinforcement Learning: Pong from Pixels
This is a long overdue blog post on Reinforcement Learning (RL). RL is hot! You may have noticed that co...
60.1% similar
Dynamic Programming in Reinforcement Learning
Our First Approach to Solving Reinforcement Learning Problems!
If you're not familiar with the Bellman e...
57.4% similar
Through my work building XGBoost models across different projects, I came across the great resource Effective XGBoost by Matt Harrison, a textbook cov...
54.7% similar
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set.
This model is trained on a custom dataset of 2...
53.6% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
52.4% similar
2 Years of ML vs. 1 Month of Prompting
November 7, 2025
Recalls at major automakers cost hundreds of millions of dollars a year. It's a huge issue. To...
51.5% similar
Table of Contents
- SmolVLM to SmolVLM2: Compact Models for Multi-Image VQA
- SmolVLM 1: A Compact Yet Capable Vision-Language Model
- What Is SmolVLM...
50.8% similar
Table of Contents
- Breaking the CNN Mold: YOLOv12 Brings Attention to Real-Time Object Detection
- The YOLO Evolution (Quick Recap)
- YOLOv8: Introdu...
50.2% similar
Individual efficiency vs administrative efficiency
Everyone has their own favorite note-taking app: Notion versus Google Docs versus Apple Notes versu...
49.9% similar
The Bitter Lesson is Misunderstood
Together, the Bitter Lesson and Scaling Laws reveal that the god of Compute we worship is yoked to an even greater ...