Meta got caught gaming AI benchmarks

Meta released Llama 4, but the announcement was eclipsed by benchmark drama.

Llama 4 Maverick quickly secured the number-two spot on LMArena, the AI benchmark site where humans compare outputs from different systems and vote on the best one. Ahmad Al-Dahle, Meta's VP of generative AI, addressed accusations that the company had trained on test sets in a post on X: “We’ve also heard claims that we trained on test sets -- that’s simply not true and we would never do that.” According to a recent report from The Information, the company repeatedly pushed back the launch because the model failed to meet internal expectations.
