Inference cost at scale with napkin math

If you serve AI models as part of your product stack, you've likely wondered what scale your GPU cluster tops out at. With some rudimentary knowledge of your hardware and model architecture, we can work out the dollar cost per user on the back of a napkin.
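To give a feel for the kind of arithmetic involved, here is a minimal sketch in Python. It is not the article's own method: the function name, its parameters, and every number in the usage example are hypothetical placeholders, and it assumes cost scales linearly with tokens served on a GPU rented by the hour.

```python
def cost_per_user_per_month(
    gpu_hourly_usd: float,
    tokens_per_second_per_gpu: float,
    tokens_per_user_per_month: float,
    utilization: float = 1.0,
) -> float:
    """Estimate the dollar cost of serving one user for a month.

    All inputs are assumptions supplied by the caller:
      gpu_hourly_usd             rental price of one GPU per hour
      tokens_per_second_per_gpu  aggregate throughput across all batched requests
      tokens_per_user_per_month  tokens a typical user consumes in a month
      utilization                fraction of the time the GPU does useful work
    """
    if gpu_hourly_usd < 0 or tokens_per_user_per_month < 0:
        raise ValueError("price and usage must be non-negative")
    if tokens_per_second_per_gpu <= 0:
        raise ValueError("throughput must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")

    # Tokens one GPU actually produces in an hour, after idle time.
    effective_tokens_per_hour = tokens_per_second_per_gpu * 3600 * utilization

    # Dollars per generated token, then scaled by one user's monthly usage.
    usd_per_token = gpu_hourly_usd / effective_tokens_per_hour
    return usd_per_token * tokens_per_user_per_month


# Hypothetical inputs: $2/hour GPU, 1,000 tokens/s, 50% utilization,
# and a user who consumes 360,000 tokens a month.
estimate = cost_per_user_per_month(2.0, 1000.0, 360_000.0, utilization=0.5)
print(f"${estimate:.2f} per user per month")
```

With these placeholder inputs the estimate comes to about $0.40 per user per month. The throughput figure is the input to scrutinize most closely, since it depends heavily on the hardware, the model architecture, and batch size.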
