Table of Contents
Fine Tuning SmolVLM for Human Alignment Using Direct Preference Optimization
Preference optimization shines when we want models to make choices that feel naturally human, not just sy...
Similar Articles (10 found)
62.7% similar
Deep Reinforcement Learning: Pong from Pixels
This is a long overdue blog post on Reinforcement Learning (RL). RL is hot! You may have noticed that co...
60.1% similar
Dynamic Programming in Reinforcement Learning
Our First Approach to Solving Reinforcement Learning Problems!
If you're not familiar with the Bellman e...
57.4% similar
Through my work building XGBoost models across different projects, I came across the great resource Effective XGBoost by Matt Harrison, a textbook cov...
54.7% similar
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set.
This model is trained on a custom dataset of 2...
53.6% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
52.4% similar
2 Years of ML vs. 1 Month of Prompting
November 7, 2025
Recalls at major automakers cost hundreds of millions of dollars a year. It's a huge issue. To...
51.5% similar
Table of Contents
- SmolVLM to SmolVLM2: Compact Models for Multi-Image VQA
- SmolVLM 1: A Compact Yet Capable Vision-Language Model
- What Is SmolVLM...
50.8% similar
Table of Contents
- Breaking the CNN Mold: YOLOv12 Brings Attention to Real-Time Object Detection
- The YOLO Evolution (Quick Recap)
- YOLOv8: Introdu...
50.2% similar
Individual efficiency vs administrative efficiency
Everyone has their own favorite note-taking app: Notion versus Google Docs versus Apple Notes versu...
49.9% similar
The Bitter Lesson is Misunderstood
Together, the Bitter Lesson and Scaling Laws reveal that the god of Compute we worship is yoked to an even greater ...