Get the latest tech news

Anthropic overtakes OpenAI: Claude Opus 4 codes seven hours nonstop, sets record SWE-Bench score and reshapes enterprise AI


Anthropic's Claude Opus 4 outperforms OpenAI's GPT-4.1 with unprecedented seven-hour autonomous coding sessions and record-breaking 72.5% SWE-bench score, transforming AI from quick-response tool to day-long collaborator.

The technological implications are profound: AI systems can now handle complex software engineering projects from conception to completion, maintaining context and focus throughout an entire workday. The technical implementation works similarly to how human experts develop knowledge management systems, with the AI automatically organizing information into structured formats optimized for future retrieval. Their study found Claude 3.7 Sonnet mentioned crucial hints it used to solve problems only 25% of the time — raising significant questions about the transparency of AI reasoning.

Get the Android app

Or read this on Venture Beat

Read more on:

Photo of OpenAI

OpenAI

Photo of hours

hours

Photo of codes

codes

Related news:

News photo

What Sam Altman told OpenAI about the device he's making with Jony Ive

News photo

OpenAI teams up with Cisco, Oracle to build UAE data center

News photo

OpenAI Says It Will Build Massive New Data Center in the U.A.E.