New study shows why simulated reasoning AI models don’t yet live up to their billing


Top AI models excel at math problems but lack reasoning needed for Math Olympiad proofs.

There's a curious contradiction at the heart of today's most capable AI models that purport to "reason": They can solve routine math problems with impressive accuracy, yet they often fail when asked to formulate the deeper mathematical proofs found in competition-level challenges. In the study, the AI outputs contained logical gaps where mathematical justification was lacking, included arguments based on unproven assumptions, and kept pursuing incorrect approaches even after generating contradictory results. As LLM research engineer Sebastian Raschka explains in a blog post, "Reasoning models either explicitly display their thought process or handle it internally, which helps them to perform better at complex tasks," such as mathematical problems.

Read this on Ars Technica
