Show HN: Beating Pokemon Red with RL and <10M Parameters
Hi! Since 2020, we’ve been developing a reinforcement learning (RL) agent to beat the 1996 game Pokémon Red. As of February 2025, we are able to beat Pokémon Red with RL using a policy of fewer than 10 million parameters (60500x smaller than DeepSeekV3) and with minimal simplifications. The output is not a policy capable of beating Pokémon, but a technique for producing solutions to Pokémon. This website describes the system’s current state. All code is open sourced and available for you, the reader, to try.
We believe solving JRPGs with reinforcement learning poses extremely difficult challenges not present in current RL environments. The Pokémon Reverse Engineering Team (PRET) and the PyBoy Python Game Boy emulation projects have made it extremely easy to introspect the game and extract data as needed. We could’ve chosen a supervised learning approach, but that would have needed a well-labelled, plentiful dataset and a model larger than we had the budget to support.