Study accuses LM Arena of helping top AI labs game its benchmark

A new study accuses LM Arena, the organization behind the popular AI benchmark Chatbot Arena, of helping some AI companies game its leaderboard.

One AI company, Meta, was able to privately test 27 model variants on Chatbot Arena between January and March leading up to the tech giant’s Llama 4 release, the authors allege. (Credit: Singh et al.)

In an email to TechCrunch, LM Arena Co-Founder and UC Berkeley Professor Ion Stoica said that the study was full of “inaccuracies” and “questionable analysis.” The organization pointed to a blog post it published earlier this week indicating that models from non-major labs appear in more Chatbot Arena battles than the study suggests.
