Open-source DeepSeek-R1 uses pure reinforcement learning to match OpenAI o1 — at 95% less cost


The company developed DeepSeek-R1 by using pure reinforcement learning on top of DeepSeek-V3-Base, and matched or beat o1 on some benchmarks.

DeepSeek-R1’s reasoning performance marks a big win for the Chinese startup in the US-dominated AI space, especially as the entire work is open-source, including the training method. The company started from DeepSeek-V3-Base and developed the model’s reasoning capabilities without supervised data, relying only on its self-evolution through a pure RL-based trial-and-error process. Because this ability emerged from the training itself, the model can take on increasingly complex reasoning tasks by using extended test-time computation to explore and refine its thought processes in greater depth.

Read this on VentureBeat
