Researchers find you don’t need a ton of data to train LLMs for reasoning tasks
With a few hundred well-curated examples, an LLM can be trained for complex reasoning tasks that previously required thousands of instances.
More recently, researchers have shown that pure reinforcement-learning approaches can enable models to train themselves for reasoning tasks by generating many candidate solutions and selecting the ones that work best, though this self-training typically demands substantial compute. By contrast, curating a few hundred high-quality examples is an endeavor many companies can tackle, bringing specialized reasoning models within reach of a wider range of organizations. “We hypothesize that successful reasoning emerges from the synergy of these two factors: rich pre-trained knowledge and sufficient computational resources at inference time,” the researchers write.
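The "generate many solutions and choose the ones that work best" loop described above is essentially best-of-N selection. A minimal sketch of that idea, where `verify` is a hypothetical stand-in for a task-specific checker or reward model (not anything from the article):

```python
def best_of_n(candidates, score):
    """Return the candidate solution the scoring function rates highest."""
    return max(candidates, key=score)

def verify(candidate: str) -> float:
    # Hypothetical verifier: rewards the correct answer to "2 + 2".
    return 1.0 if candidate.strip() == "4" else 0.0

# Imagine these were sampled from a model prompted with "2 + 2 = ?".
samples = ["3", "5", "4", "22"]
print(best_of_n(samples, verify))  # prints "4"
```

In practice the candidates come from repeated sampling of the model itself, and the verifier can be an exact-match check, unit tests, or a learned reward model; the selected solutions can then be fed back as training data.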
Or read this on VentureBeat