Results of "Humanity's Last Exam" benchmark published


Scale AI and the Center for AI Safety (CAIS) are proud to publish the results of Humanity’s Last Exam.

“We wanted problems that would test the capabilities of the models at the frontier of human knowledge and reasoning,” said Dan Hendrycks, CAIS co-founder and executive director.

A sample question from the exam begins: “Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae.”

“By identifying the gaps in AI’s reasoning capabilities, Humanity’s Last Exam not only benchmarks current systems but also provides a roadmap for future research and development,” said Yue.
