
DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI


DeepSeek's free 685B-parameter AI model runs at 20 tokens/second on Apple's Mac Studio, outperforming Claude Sonnet while using just 200 watts, challenging OpenAI's cloud-dependent business model.

While the $9,499 Mac Studio might stretch the definition of “consumer hardware,” the ability to run such a massive model locally is a major departure from the data center requirements typically associated with state-of-the-art AI. Simon Willison, a developer tools creator, noted in a blog post that a 4-bit quantized version reduces the storage footprint to 352GB, making it feasible to run on high-end consumer hardware like the Mac Studio with the M3 Ultra chip. Nvidia CEO Jensen Huang recently said that DeepSeek’s R1 model “consumes 100 times more compute than a non-reasoning AI,” contradicting earlier industry assumptions about efficiency.
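The figures quoted here (685B parameters, 4-bit quantization, 352GB, 20 tokens per second, 200 watts) can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is only an illustration: the function names are made up for this example, gigabytes are assumed to be decimal (10^9 bytes), and the 200-watt figure is treated as the average draw during generation, which the article does not spell out.

```python
# Back-of-the-envelope checks on the figures quoted in the article.
# All helper names are hypothetical; GB is assumed to mean 10**9 bytes.

def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Raw storage needed for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


def effective_bits_per_weight(num_params: float, footprint_gb: float) -> float:
    """Average bits spent per parameter for a given on-disk size."""
    return footprint_gb * 1e9 * 8 / num_params


def joules_per_token(watts: float, tokens_per_second: float) -> float:
    """Energy per generated token, assuming constant power draw."""
    return watts / tokens_per_second


PARAMS = 685e9        # parameter count quoted in the article
REPORTED_GB = 352     # 4-bit footprint quoted in the article

if __name__ == "__main__":
    ideal = weight_footprint_gb(PARAMS, 4)
    print(f"Pure 4-bit weights: {ideal:.1f} GB")
    print(f"Reported footprint: {REPORTED_GB} GB")
    print(f"Effective bits/weight: {effective_bits_per_weight(PARAMS, REPORTED_GB):.2f}")
    print(f"Energy per token: {joules_per_token(200, 20):.1f} J")
```

A pure 4-bit encoding of 685B parameters comes to roughly 342.5GB, slightly under the reported 352GB. The gap is plausibly due to quantization metadata or tensors kept at higher precision, though the article does not say. At 200 watts and 20 tokens per second, the energy cost works out to about 10 joules per token under the constant-draw assumption.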


Read the full story on VentureBeat
