
ReasoningGym: Reasoning Environments for RL with Verifiable Rewards


We introduce Reasoning Gym (RG), a library of reasoning environments for reinforcement learning with verifiable rewards. It provides over 100 data generators and verifiers spanning multiple domains, including algebra, arithmetic, computation, cognition, geometry, graph theory, logic, and various common games. Its key innovation is the ability to generate virtually infinite training data with adjustable complexity, whereas most previous reasoning datasets are fixed. This procedural generation approach allows for continuous evaluation across varying difficulty levels. Our experimental results demonstrate the efficacy of RG both for evaluating reasoning models and for training them with reinforcement learning.
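To make the idea of a procedural generator paired with a verifier concrete, here is a minimal sketch in plain Python. It does not use the Reasoning Gym API: the names `Task`, `generate_tasks`, `verify`, and the `num_terms` difficulty parameter are hypothetical, chosen only to illustrate seeded generation with adjustable complexity and a binary, checkable reward.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    question: str
    answer: str


def generate_tasks(seed: int, size: int, num_terms: int) -> list[Task]:
    """Procedurally generate addition tasks (hypothetical, not RG's API).

    `num_terms` is the difficulty knob: more terms means a longer sum.
    The same seed and parameters always produce the same tasks.
    """
    rng = random.Random(seed)
    tasks = []
    for _ in range(size):
        terms = [rng.randint(1, 99) for _ in range(num_terms)]
        question = " + ".join(str(t) for t in terms) + " = ?"
        tasks.append(Task(question, str(sum(terms))))
    return tasks


def verify(task: Task, model_answer: str) -> float:
    """Verifiable reward: 1.0 for an exact match after stripping whitespace, else 0.0."""
    return 1.0 if model_answer.strip() == task.answer else 0.0


# Same generator, two difficulty levels.
easy = generate_tasks(seed=0, size=3, num_terms=2)
hard = generate_tasks(seed=0, size=3, num_terms=6)
```

Because each task is derived from a seed and a difficulty parameter rather than stored in a fixed file, a new seed yields fresh data, and the reward can be computed automatically without a human grader.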

