Even some of the best AI can’t beat this new benchmark
The nonprofit Center for AI Safety and Scale AI have released a challenging new benchmark for frontier AI systems.
In a preliminary study, not a single publicly available flagship AI system managed to score better than 10% on the benchmark, Humanity’s Last Exam. CAIS and Scale AI say they plan to open up the benchmark to the research community so that researchers can “dig deeper into the variations” and evaluate new AI models.
Or read this on TechCrunch