DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI

DeepSeek's free 685B-parameter AI model runs at 20 tokens/second on Apple's Mac Studio, outperforming Claude Sonnet while using just 200 watts, challenging OpenAI's cloud-dependent business model.

While the $9,499 Mac Studio might stretch the definition of “consumer hardware,” the ability to run such a massive model locally is a major departure from the data center requirements typically associated with state-of-the-art AI. Simon Willison, a developer tools creator, noted in a blog post that a 4-bit quantized version reduces the storage footprint to 352GB, making it feasible to run on high-end consumer hardware like the Mac Studio with the M3 Ultra chip. Nvidia CEO Jensen Huang recently noted that DeepSeek’s R1 model “consumes 100 times more compute than a non-reasoning AI,” contradicting earlier industry assumptions about efficiency.
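For a rough sense of where the 352GB figure comes from, here is a back-of-the-envelope sketch in Python. It assumes storage is simply parameter count times bits per weight; the helper name `quantized_size_gb` is hypothetical, and the 16-bit baseline is included only for comparison, not as a figure from the article. The gap between the raw 4-bit estimate and the reported size is presumably metadata such as quantization scales or layers kept at higher precision, though the article does not say.

```python
def quantized_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Raw weight storage in decimal gigabytes, ignoring any metadata."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 685e9       # parameter count quoted in the article
REPORTED_GB = 352    # 4-bit footprint quoted from Simon Willison's post

raw_4bit_gb = quantized_size_gb(PARAMS, 4)
raw_16bit_gb = quantized_size_gb(PARAMS, 16)   # assumed baseline, for comparison only
overhead_gb = REPORTED_GB - raw_4bit_gb        # unexplained remainder (assumption: metadata)

print(f"Raw 4-bit weights:  {raw_4bit_gb:.1f} GB")
print(f"Raw 16-bit weights: {raw_16bit_gb:.1f} GB")
print(f"Reported size minus raw 4-bit estimate: {overhead_gb:.1f} GB")
```

The raw 4-bit estimate lands within a few percent of the reported footprint, which is consistent with the claim that quantization, rather than any change to the model itself, is what brings it within reach of a single machine.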

Read the full story on VentureBeat