An analysis of DeepSeek's R1-Zero and R1
We launched ARC Prize 2024 last June to grow awareness of the limits of scaling LLMs and to promote a useful benchmark, ARC-AGI-1, that points toward a new direction: AI systems that adapt to novel, unseen problems instead of relying strictly on memorization. Reasoning systems like R1 are built by labeling the intermediary chain-of-thought (CoT) steps, using a combination of human experts (“supervised fine-tuning” or SFT) and automated machines (“reinforcement learning” or RL). It is important to watch whether SFT ends up being a requirement to add CoT search and sampling, or whether a hypothetical “R2-Zero” could exist along the same logarithmic accuracy-versus-inference scaling curve.
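
As a rough illustration (not from the article), the sketch below fits the kind of logarithmic scaling relationship the paragraph refers to, accuracy ≈ a + b·ln(compute), where compute stands for inference budget (e.g., number of sampled CoTs per task). All data values here are invented for illustration.

```python
# Minimal sketch of a logarithmic accuracy-vs-inference-compute fit.
# The compute/accuracy numbers below are hypothetical, not measured.
import numpy as np

# Hypothetical inference budgets (sampled CoTs per task) and the
# benchmark accuracy achieved at each budget.
compute = np.array([1, 2, 4, 8, 16, 32, 64])
accuracy = np.array([0.14, 0.18, 0.23, 0.27, 0.31, 0.36, 0.40])

# Least-squares fit of accuracy = a + b * ln(compute).
b, a = np.polyfit(np.log(compute), accuracy, 1)
print(f"accuracy ≈ {a:.3f} + {b:.3f} * ln(compute)")

# Extrapolate to a 10x larger budget to see the curve's implication:
# each constant multiple of compute buys a constant accuracy gain.
print(f"predicted at 640 samples: {a + b * np.log(640):.3f}")
```

Under such a curve, doubling inference compute adds a fixed increment of accuracy, which is why the question of whether an SFT-free “R2-Zero” would sit on the same curve matters for scaling economics.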



Related news:

OpenAI Says DeepSeek May Have Improperly Harvested Its Data

Show HN: DeepSeek vs. ChatGPT – The Clash of the AI Generations

On DeepSeek and Export Controls