Results of "Humanity's Last Exam" benchmark published
Scale AI and the Center for AI Safety (CAIS) are proud to publish the results of Humanity’s Last Exam.
“We wanted problems that would test the capabilities of the models at the frontier of human knowledge and reasoning,” said Dan Hendrycks, CAIS co-founder and executive director.
One sample question from the exam begins: “Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae.”
“By identifying the gaps in AI’s reasoning capabilities, Humanity’s Last Exam not only benchmarks current systems but also provides a roadmap for future research and development,” said Yue.