For some reason they focus on inference, which is the computationally cheap part. If you're working on ML (as opposed to deploying someone else's ML), then almost all of your workload is training, ...
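To make the training-versus-inference gap concrete, here is a minimal back-of-envelope sketch using the common ~6ND / ~2ND FLOP approximations for dense transformer training and inference. The model size and token counts below are illustrative assumptions, not figures from the comment above.

```python
# Back-of-envelope FLOP comparison (all figures are assumptions for illustration).
# Common approximations for dense transformers:
#   training  ~ 6 * N * D   (forward + backward pass over D training tokens)
#   inference ~ 2 * N * T   (forward pass over T tokens in one request)

N = 7e9   # model parameters (assumed: a 7B-parameter model)
D = 1e12  # training tokens   (assumed: 1T tokens)
T = 1e3   # tokens handled in a single inference request (assumed)

train_flops = 6 * N * D
infer_flops = 2 * N * T

print(f"training run:        {train_flops:.2e} FLOPs")
print(f"one request:         {infer_flops:.2e} FLOPs")
print(f"requests to match:   {train_flops / infer_flops:.2e}")
```

Under these assumptions, one training run costs roughly as much compute as ~3e9 such inference requests, which is why a research or development workload, where you retrain far more often than you serve, is dominated by training.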
Similar Articles (10 found)
🔍 74.5% similar
> the generation of 281,128 augmented examples, from which 1,000 were held out as a benchmark test set.
This model is trained on a custom dataset of 2...
🔍 72.5% similar
First, thanks to the publisher and authors for making this freely available!
I retired recently after using neural networks since the 1980s. I still s...
🔍 72.4% similar
I'm curious why we seem convinced that this is a task that is possible or something worthy of investigation.
I've worked on language models since 2018...
🔍 70.7% similar
Are OpenAI and Anthropic Really Losing Money on Inference?
I keep hearing what a cash incinerator AI is, especially around inference. While it seems r...
🔍 69.8% similar
What is a good algorithm-to-purpose map for ML beginners? Looking for something like "Algo X is good for making predictions when your data looks like ...
🔍 69.1% similar
What happens when coding agents stop feeling like dialup?
It's funny how quickly humans adjust to new technology. Only a few months ago Claude Code an...
🔍 68.8% similar
Techniques for training large neural networks
Large neural networks are at the core of many recent advances in AI, but training them is a difficult en...
🔍 68.8% similar
Why DeepSeek is cheap at scale but expensive to run locally
Why is DeepSeek-V3 supposedly fast and cheap to serve at scale, but too slow and expensive...
🔍 67.9% similar
The Bitter Lesson is Misunderstood
Together, the Bitter Lesson and Scaling Laws reveal that the god of Compute we worship is yoked to an even greater ...
🔍 66.3% similar
Sam said yesterday that chatgpt handles ~700M weekly users. Meanwhile, I can't even run a single GPT-4-class model locally without insane VRAM or pain...