How fast is N tokens per second really?


Every local-LLM benchmark reports throughput: "47 tok/s on an M3," "180 tok/s on a 4090," "500 tok/s on Groq." Unless you've actually watched tokens stream at those rates, the numbers are hard to internalize. This page renders them.
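
To get a feel for the rates quoted above without a model at hand, a minimal Python sketch can pace text out to a terminal at a fixed rate. This is an illustration under stated assumptions, not how the original page is implemented: the helper names (`token_interval`, `time_to_stream`, `stream`) are hypothetical, and whitespace-separated words stand in for tokens, which real tokenizers split differently.

```python
import sys
import time


def token_interval(tokens_per_second: float) -> float:
    """Seconds between consecutive tokens at a given throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return 1.0 / tokens_per_second


def time_to_stream(num_tokens: int, tokens_per_second: float) -> float:
    """Total seconds needed to emit num_tokens at a steady rate."""
    return num_tokens * token_interval(tokens_per_second)


def stream(text: str, tokens_per_second: float, out=sys.stdout, sleep=time.sleep) -> int:
    """Write text one pseudo-token at a time, pausing between tokens.

    Assumption: each whitespace-separated word counts as one token.
    Returns the number of pseudo-tokens written.
    """
    delay = token_interval(tokens_per_second)
    count = 0
    for word in text.split():
        out.write(word + " ")
        out.flush()  # show each token immediately instead of buffering
        sleep(delay)
        count += 1
    out.write("\n")
    return count


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog " * 3
    for rate in (47, 180, 500):  # the rates quoted in the text
        print(f"--- {rate} tok/s ---")
        stream(sample, rate)
```

As a rough guide, `time_to_stream(500, rate)` for a hypothetical 500-token answer gives about 10.6 s at 47 tok/s, about 2.8 s at 180 tok/s, and 1 s at 500 tok/s. Actual pacing in a terminal will be slightly slower than the nominal rate, since `time.sleep` may overshoot and writing takes time.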
