SWiRL: The business case for AI that thinks like your best problem-solvers
Training LLMs on trajectories of reasoning and tool use makes them superior at multi-step reasoning tasks.
The lead authors of the SWiRL paper, Anna Goldie, research scientist at Google DeepMind, and Azalia Mirhoseini, assistant professor of computer science at Stanford University, believe that current LLM training methods are not suited to the multi-step reasoning tasks that real-world applications require. For SWiRL, the team compiled large datasets based on questions from multi-hop question-answering (HotPotQA) and math problem-solving (GSM8K) benchmarks, generating tens of thousands of trajectories. “By training the model to take reasonable steps at each moment in time (and to do so in a coherent and potentially more explainable way), we address a core weakness of traditional LLMs, namely their brittleness in the face of complex, multi-step tasks, where the probability of success decays exponentially with path length,” Goldie and Mirhoseini said.
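The exponential decay the authors mention can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the SWiRL paper. It assumes, purely for illustration, that every step of a task succeeds independently with the same probability, so the chance of completing all steps is that probability raised to the number of steps. The function name and the 0.9 per-step figure are hypothetical.

```python
def chain_success_probability(per_step_success: float, num_steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Illustrative assumption: steps are independent and share the same
    per-step success probability, so the result is p ** n.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    return per_step_success ** num_steps


if __name__ == "__main__":
    # Hypothetical per-step reliability of 90%.
    for steps in (1, 5, 10, 20):
        probability = chain_success_probability(0.9, steps)
        print(f"{steps:>2} steps: {probability:.3f}")
```

Under these assumptions, a model that gets each step right 90% of the time completes a 10-step task only about 35% of the time, which is why small gains in per-step reliability matter more as tasks get longer.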
Or read this on VentureBeat