Inference cost at scale with napkin math


If you serve AI models as part of your product stack, you've likely wondered what scale your GPU cluster tops out at. With some rudimentary knowledge of your hardware and model architecture, we can work out the dollar cost per user on the back of a napkin.


