brutal new Agents


Surprise upset: GPT-5.5 beats Claude Fable 5 on brutal new Agents’ Last Exam benchmark
