OpenAI’s o3 shows remarkable progress on ARC-AGI, sparking debate on AI reasoning

o3 solved one of the most difficult AI challenges, scoring 75.7% on the ARC-AGI benchmark. But does it really mean we're closer to AGI?

In a blog post, François Chollet, the creator of ARC, described o3’s performance as “a surprising and important step-function increase in AI capabilities, showing novel task adaptation ability never seen before in the GPT-family models.”

[Figure: Performance of different models on ARC-AGI (source: arcprize.org)]

“This is not merely incremental improvement, but a genuine breakthrough, marking a qualitative shift in AI capabilities compared to the prior limitations of LLMs,” Chollet wrote.

Other scientists, such as Nathan Lambert of the Allen Institute for AI, suggest that “o1 and o3 can actually be just the forward passes from one language model.” On the day o3 was announced, Nat McAleese, a researcher at OpenAI, posted on X that o1 was “just an LLM trained with RL.”

Source: VentureBeat