Exam benchmark

Surprise upset: GPT-5.5 beats Claude Fable 5 on brutal new Agents’ Last Exam benchmark