Similar Articles

Articles similar to the selected content.

Domain: technicalwriting.dev Added: 2025-11-03 Status: ✓ Success
technicalwriting.dev
word2vec-style vector arithmetic on docs embeddings (2025 October 29): word2vec popularized the idea of representing words as vectors where semantically similar words are positioned close to each other ...
Similar Articles (10 found)
🔍 60.2% similar
Fine-Tune Your Topic Modeling Workflow with BERTopic
https://towardsdatascience.com/finetune-your-topic-modeling-workflow-with-bertopic/
Topic modeling remains a critical tool in the AI and NLP toolbox. While large language models (LLMs) handle text exceptionally well, extracting high-l...
🔍 58.8% similar
Introducing Google's LangExtract tool
https://towardsdatascience.com/introducing-googles-langextract-tool-2/
One announcement that caught my eye in particular occurred at the end of July, when Google released a new text processing and data extraction tool cal...
🔍 57.8% similar
King – Man + Woman = Queen: The Marvelous Mathematics of Computational Linguistics
https://www.technologyreview.com/2015/09/17/166211/king-man-woman-queen-the-marvelous-mathematics-of-computational-linguistics/
King – Man + Woman = Queen: The Marvelous Mathematics of Computational Linguistics. Computational linguistics has dramatically changed the way research...
🔍 56.2% similar
Writing an LLM from scratch, part 22 -- finally training our LLM!
https://www.gilesthomas.com/2025/10/llm-from-scratch-22-finally-training-our-llm
Writing an LLM from scratch, part 22 -- finally training our LLM! This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...
🟠 HN
🔍 55.5% similar
So you wanna build a local RAG?
https://blog.yakkomajuri.com/blog/local-rag
When we launched Skald, we wanted it to not only be self-hostable, but also for one to be able to run it without sending any data to third-parties. Wi...
🟠 HN
🔍 54.8% similar
Extract-0: A specialized language model for document information extraction
https://news.ycombinator.com/item?id=45427634
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set. This model is trained on a custom dataset of 2...
🔍 54.4% similar
Vector Databases: A Technical Primer [pdf] (digitaloceanspaces.com)
https://news.ycombinator.com/item?id=38971221
Thanks for writing this one Simon, I read it some time ago and I just wanted to say thanks and recommend it to folks browsing the comments, it's reall...
🔍 54.3% similar
Unlocking Multimodal Video Transcription with Gemini
https://towardsdatascience.com/unlocking-multimodal-video-transcription-with-gemini/
A quick heads-up before we start: - I'm a developer at Google Cloud. I'm happy to share this article and hope you'll learn a few things. Thoughts and ...
🔍 54.1% similar
Show HN: Building a web search engine from scratch with 3B neural embeddings
https://blog.wilsonl.in/search-engine/
Building a web search engine from scratch in two months with 3 billion neural embeddings A while back, I decided to undertake a project to challenge m...
🟠 HN
🔍 53.4% similar
LESSWRONG LW
https://www.lesswrong.com/posts/dxiConBZTd33sFaRC/field-notes-from-shipping-real-code-with-claude
Shimmering Substance - Jackson Pollock Think of this post as your field guide to a new way of building software. Let me take you back to when this all...
🟠 HN