
Did xAI lie about Grok 3’s benchmarks?


OpenAI researchers accused xAI of publishing misleading Grok 3 benchmarks. The truth is a little more nuanced.

In a post on xAI’s blog, the company published a graph showing Grok 3’s performance on AIME 2025, a collection of challenging math questions from a recent invitational mathematics exam. OpenAI employees on X were quick to point out that xAI’s graph didn’t include o3-mini-high’s AIME 2025 score at “cons@64,” short for “consensus@64”: the model gets 64 attempts at each problem, and the answer it gives most often is taken as its final answer, which tends to raise scores. As AI researcher Nathan Lambert pointed out in a post, though, perhaps the most important metric remains a mystery: the computational (and monetary) cost it took for each model to achieve its best score.
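
To see why leaving out a cons@64 score matters, here is a minimal Python sketch that compares single-attempt scoring with consensus (majority-vote) scoring. It assumes cons@k means sampling k answers per problem and grading only the most frequent one. The function names and the toy data are hypothetical and are not taken from xAI's or OpenAI's evaluation code.

```python
from collections import Counter
from typing import Dict, List


def consensus_answer(samples: List[str]) -> str:
    """Return the most frequent answer among the sampled attempts."""
    return Counter(samples).most_common(1)[0][0]


def score_at_1(attempts: Dict[str, List[str]], answers: Dict[str, str]) -> float:
    """Fraction of problems solved when only the first attempt counts."""
    correct = sum(attempts[p][0] == answers[p] for p in answers)
    return correct / len(answers)


def score_cons_at_k(
    attempts: Dict[str, List[str]], answers: Dict[str, str], k: int
) -> float:
    """Fraction of problems solved when the majority answer of k attempts counts."""
    correct = sum(consensus_answer(attempts[p][:k]) == answers[p] for p in answers)
    return correct / len(answers)


# Toy data (made up for illustration): two problems, four sampled answers each.
attempts = {
    "p1": ["204", "204", "113", "204"],
    "p2": ["70", "588", "588", "588"],
}
answers = {"p1": "204", "p2": "588"}

print("score @1:    ", score_at_1(attempts, answers))
print("score cons@4:", score_cons_at_k(attempts, answers, 4))
```

In this toy case the model's first attempt on the second problem is wrong, but the majority of its attempts are right, so the consensus score is higher than the single-attempt score. The same effect is why a chart that mixes the two settings can mislead, and why the compute spent on the extra samples matters.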

Read this on TechCrunch
