
Why Anthropic's Claude still hasn't beaten Pokémon


Weeks later, Sonnet’s “reasoning” model is struggling with a game designed for children.

In recent months, the AI industry’s biggest boosters have started converging on a public expectation that we’re on the verge of “artificial general intelligence” (AGI), meaning virtual agents that can match or surpass “human-level” understanding and performance on most cognitive tasks. Anthropic attributed that breakthrough to the “extended thinking” of Claude 3.7 Sonnet, writing that the new model “plans ahead, remembers its objectives, and adapts when initial strategies fail” in a way that its predecessors didn’t. Like a conspiracy theorist who builds an entire worldview from an inherently flawed premise, Claude can be incredibly slow to recognize when an error in its self-authored knowledge base is leading its Pokémon play astray.


