The rise of AI ‘reasoning’ models is making benchmarking more expensive


The rise of AI ‘reasoning’ models is making benchmarking more expensive, data from Artificial Analysis shows.

Artificial Analysis co-founder George Cameron told TechCrunch that the organization plans to increase its benchmarking spend as more AI labs develop reasoning models. Taylor estimates a single run-through of MMLU Pro, a question set designed to benchmark a model’s language comprehension skills, would have cost more than $1,800. But this colors the results, some experts say — even if there’s no evidence of manipulation, the mere suggestion of an AI lab’s involvement threatens to harm the integrity of the evaluation scoring.

Source: TechCrunch

