Similar Articles

Articles similar to the selected content.

Domain: towardsdatascience.com Added: 2025-08-13 Status: ✓ Success
Topic modeling remains a critical tool in the AI and NLP toolbox. While large language models (LLMs) handle text exceptionally well, extracting high-level topics from massive datasets still requires d...
Similar Articles (10 found)
🔍 62.2% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
https://www.gilesthomas.com/2025/10/llm-from-scratch-22-finally-training-our-llm
Writing an LLM from scratch, part 22 -- finally training our LLM! This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
🔍 60.2% similar
word2vec-style vector arithmetic on docs embeddings
https://technicalwriting.dev/embeddings/arithmetic/index.html
word2vec-style vector arithmetic on docs embeddings 2025 October 29 word2vec popularized the idea of representing words as vectors where semantically...
🔍 59.7% similar
Extract-0: A specialized language model for document information extraction
https://news.ycombinator.com/item?id=45427634
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set. This model is trained on a custom dataset of 2...
🔍 58.5% similar
Multi-modal ML with OpenAI's CLIP
https://www.pinecone.io/learn/series/image-search/clip/?_hsenc=p2ANqtz-_MZUbziNKCoB2HdM3hBzmaHEesRF9TFZ-S2FkjdJPtOZ2z4GVwso8C-LuBAx8f1Ac7N3G2rnc19e3xHqfVE4zty3DNoQ&_hsmi=251366668&utm_content=251366668&utm_medium=email&utm_source=hs_automation
Multi-modal ML with OpenAI's CLIP Language models (LMs) can not rely on language alone. That is the idea behind the “Experience Grounds Language” pape...
🔍 58.4% similar
MicroGPT explained interactively
https://growingswe.com/blog/microgpt
MicroGPT explained interactively Andrej Karpathy wrote a 200-line Python script that trains and runs a GPT from scratch, with no libraries or dependen...
🔍 58.2% similar
How big are our embeddings now and why?
https://vickiboykis.com/2025/09/01/how-big-are-our-embeddings-now-and-why/
How big are our embeddings now and why? #embeddings #openai #anthropic #huggingface #dimensionality A few years ago, I wrote a paper on embeddings. At...
🔍 57.4% similar
2 Years of ML vs. 1 Month of Prompting
https://www.levs.fyi/blog/2-years-of-ml-vs-1-month-of-prompting/
2 Years of ML vs. 1 Month of Prompting November 7, 2025 Recalls at major automakers cost hundreds of millions of dollars a year. It’s a huge issue. To...
🔍 56.7% similar
The Illustrated Word2vec
https://jalammar.github.io/illustrated-word2vec/
The Illustrated Word2vec Discussions: Hacker News (347 points, 37 comments), Reddit r/MachineLearning (151 points, 19 comments) Translations: Chinese ...
🔍 56.2% similar
Fine-tune your own Llama 2 to replace GPT-3.5/4
https://news.ycombinator.com/item?id=37484135
There has been a lot of interest on HN in fine-tuning open-source LLMs recently (eg. Anyscale's post at https://news.ycombinator.com/item?id=37090632)...
🔍 56.0% similar
Things we learned about LLMs in 2024
https://simonwillison.net/2024/Dec/31/llms-in-2024/
Things we learned about LLMs in 2024 31st December 2024 A lot has happened in the world of Large Language Models over the course of 2024. Here’s a rev...