Meta got caught gaming AI benchmarks
Meta released Llama 4, but the announcement was eclipsed by benchmark drama.
Maverick quickly secured the number-two spot on LMArena, the AI benchmark site where humans compare outputs from different systems and vote on the best one. Ahmad Al-Dahle, Meta's VP of generative AI, addressed the accusations in a post on X: “We’ve also heard claims that we trained on test sets -- that’s simply not true and we would never do that.” According to a recent report from The Information, the company repeatedly pushed back the launch because the model failed to meet internal expectations.