Get the latest tech news

Anthropic’s New Model Excels at Reasoning and Planning—and Has the Pokémon Skills to Prove It

When Anthropic's older Claude model played Pokemon Red, it spent “dozens of hours” stuck in one city and had trouble identifying non-player characters. With Claude 4 Opus, they noticed an improvement in Claude’s long-term memory and planning capabilities.

The new models, which jump the naming convention from 3.7 straight to 4, have a number of strengths, including their ability to reason, plan, and remember the context of conversations over extended periods of time, the company says. In an interview with WIRED, Hershey says he chose Pokémon Red because it’s “a simple playground,” meaning the game is turn-based and doesn’t require real time reactions, which Anthropic’s current models struggle with. “Over time, I have been going through and deleting all of the Pokémon-specific stuff I can just because I think it’s really interesting to see how much the model can figure out on its own,” Hershey says, adding that he hopes to build a game that Claude has never seen before in order to truly test its limits.

Get the Android app

Or read this on Wired