Berkeley Researchers Replicate DeepSeek R1's Core Tech for Just $30: A Small Mod


A Berkeley AI Research team led by PhD candidate Jiayi Pan has achieved what many thought impossible: reproducing DeepSeek R1-Zero's key technologies for less than the cost of a dinner for two. Using the Countdown game (an arithmetic puzzle in which given numbers are combined to reach a target) as their testing ground, the team demonstrated that even modest language models can develop complex problem-solving strategies through reinforcement learning. Surprisingly, the choice of reinforcement learning algorithm (PPO, GRPO, or PRIME) proved less critical than expected, with all three approaches achieving similar results.
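To make the setup concrete, here is a minimal sketch of the kind of rule-based reward a Countdown task allows: the answer is an arithmetic expression that can be checked mechanically, so no learned reward model is needed. This is an illustration only, not the team's code. The function name `countdown_reward`, the 0/1 reward values, and the rule that each given number may be used at most once are all assumptions.

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# Only the four basic arithmetic operators are accepted (assumed rule set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate a parsed expression exactly, recording every integer it uses."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body, used)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return Fraction(node.value)  # exact arithmetic, no float rounding
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def countdown_reward(expression, numbers, target):
    """Hypothetical reward: 1.0 if the expression is valid and hits the target, else 0.0."""
    try:
        used = []
        value = _evaluate(ast.parse(expression, mode="eval"), used)
    except (SyntaxError, ValueError, ZeroDivisionError, RecursionError):
        return 0.0
    # Assumption: each given number may be used at most once.
    if Counter(used) - Counter(numbers):
        return 0.0
    return 1.0 if value == target else 0.0
```

For example, `countdown_reward("(25 - 5) * 3", [25, 5, 3, 4], 60)` returns `1.0`, while `countdown_reward("25 + 25", [25, 5], 50)` returns `0.0` because it reuses a number. A signal like this could be fed to any of the algorithms named above; how the team actually shaped its reward is not stated in this summary.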
