Show HN: Beating Pokemon Red with RL and <10M Parameters


Hi! Since 2020, we’ve been developing a reinforcement learning (RL) agent to beat the 1996 game Pokémon Red. As of February 2025, we can beat Pokémon Red with reinforcement learning, using a policy with fewer than 10 million parameters (60500x smaller than DeepSeekV3) and minimal simplifications. The output is not a policy capable of beating Pokémon, but a technique for producing solutions to Pokémon. This website describes the system’s current state. All code is open source and available for you, the reader, to try.

We believe solving JRPGs with reinforcement learning poses extremely difficult challenges not present in current RL environments. The Pokémon Reverse Engineering Team (PRET) and the PyBoy Python Game Boy emulation projects have made it extremely easy to introspect the game and extract data as needed. We could’ve chosen a supervised learning approach, but that would have needed a well-labelled, plentiful dataset and a model larger than we had the budget to support.
