Even some of the best AI can’t beat this new benchmark


The nonprofit Center for AI Safety and Scale AI have released a challenging new benchmark for frontier AI systems.

In a preliminary study, not a single publicly available flagship AI system managed to score better than 10% on Humanity’s Last Exam. CAIS and Scale AI say they plan to open up the benchmark to the research community so that researchers can “dig deeper into the variations” and evaluate new AI models.

Read this on TechCrunch

Related news:

Google DeepMind researchers introduce new benchmark to improve LLM factuality, reduce hallucinations

A new benchmark for AI investment: Swift Ventures unveils system to separate talk from action

A New Benchmark for the Risks of AI