Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090
Having worked through the main body of Sebastian Raschka's book "Build a Large Language Model (from Scratch)",...
Similar Articles (10 found)
73.3% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
64.1% similar
One announcement that caught my eye in particular occurred at the end of July, when Google released a new text processing and data extraction tool cal...
63.5% similar
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set.
This model is trained on a custom dataset of 2...
61.7% similar
How We Cut Inference Costs from $46K to $7.5K Fine-Tuning Qwen-Image-Edit
Running quality inference at scale is something we think about a lot at Oxen...
61.3% similar
Building a web search engine from scratch in two months with 3 billion neural embeddings
A while back, I decided to undertake a project to challenge m...
61.1% similar
There has been a lot of interest on HN in fine-tuning open-source LLMs recently (e.g. Anyscale's post at https://news.ycombinator.com/item?id=37090632)...
59.6% similar
Things we learned about LLMs in 2024
31st December 2024
A lot has happened in the world of Large Language Models over the course of 2024. Here's a rev...
59.1% similar
GPT-5: Key characteristics, pricing and model card
7th August 2025
I've had preview access to the new GPT-5 model family for the past two weeks (see r...
59.1% similar
Deep Neural Nets: 33 years ago and 33 years from now
The Yann LeCun et al. (1989) paper Backpropagation Applied to Handwritten Zip Code Recognition is...
58.3% similar
Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult
24th November 2025
Anthropic released Claude Opus 4.5 this morning, which they ...