Similar Articles

Why DeepSeek is cheap at scale but expensive to run locally

https://www.seangoedecke.com/inference-batching-and-deepseek/

Domain: www.seangoedecke.com Added: 2025-07-13 Status: ✓ Success

deepseek,ai models,throughput,latency,batch size,www.seangoedecke.com

Why is DeepSeek-V3 supposedly fast and cheap to serve at scale, but too slow and expensive to run locally? Why are some AI models slow to re...

Similar Articles (10 found)

openai.com 2025-07-13

openai.com

Techniques for training large neural networks

Large neural networks are at the core of many recent advances in AI, but training them is a difficult en...

https://neutree.ai/blog/nano-vllm-part-1

neutree.ai 2026-02-03

neutree.ai

Understanding LLM Inference Engines: Inside Nano-vLLM (Part 1)

Architecture, Scheduling, and the Path from Prompt to Token

When deploying large langua...

https://news.ycombinator.com/item?id=32641769

news.ycombinator.com 2025-07-13

hackernews,tech,news,news.ycombinator.com

For some reason they focus on the inference, which is the computationally cheap part. If you're working on ML (as opposed to deploying someone else's ...

🔍 66.8% similar

LLM Engineer's Almanac - Workloads

https://modal.com/llm-almanac/workloads

modal.com 2026-02-03

modal.com

The three types of LLM workloads and how to serve them

We hold this truth to be self-evident: not all workloads are created equal. But for large langu...

🔍 66.5% similar

The Inference Economy

https://frontierai.substack.com/p/the-inference-economy

frontierai.substack.com 2026-03-05

frontierai.substack.com

What data center build outs tell us about intelligence costs

Trillion dollar data center buildouts are all the rage. Discussions...

https://news.ycombinator.com/item?id=45427634

news.ycombinator.com 2025-10-11

news.ycombinator.com

> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set. This model is trained on a custom dataset of 2...

https://www.gilesthomas.com/2025/10/llm-from-scratch-22-finally-training-our-llm

www.gilesthomas.com 2025-11-08

www.gilesthomas.com

Writing an LLM from scratch, part 22 -- finally training our LLM! This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large ...

🔍 65.9% similar

The Bitter Lesson is Misunderstood

https://obviouslywrong.substack.com/p/the-bitter-lesson-is-misunderstood

obviouslywrong.substack.com 2025-09-04

obviouslywrong.substack.com

Together, the Bitter Lesson and Scaling Laws reveal that the god of Compute we worship is yoked to an even greater ...

https://news.ycombinator.com/item?id=44840728

news.ycombinator.com 2025-08-13

news.ycombinator.com

Sam said yesterday that chatgpt handles ~700M weekly users. Meanwhile, I can't even run a single GPT-4-class model locally without insane VRAM or pain...

https://martinalderson.com/posts/are-openai-and-anthropic-really-losing-money-on-inference/

martinalderson.com 2025-08-29

martinalderson.com

Are OpenAI and Anthropic Really Losing Money on Inference? I keep hearing what a cash incinerator AI is, especially around inference. While it seems r...
