Study accuses LM Arena of helping top AI labs game its benchmark


A new study accuses LM Arena, the organization behind the popular AI benchmark Chatbot Arena, of helping some AI companies game its leaderboard.

One AI company, Meta, was able to privately test 27 model variants on Chatbot Arena between January and March leading up to the tech giant’s Llama 4 release, the authors allege. (Credit: Singh et al.)

In an email to TechCrunch, LM Arena Co-Founder and UC Berkeley Professor Ion Stoica said that the study was full of “inaccuracies” and “questionable analysis.” The organization pointed to a blog post it published earlier this week indicating that models from non-major labs appear in more Chatbot Arena battles than the study suggests.


