
OpenAI’s new reasoning AI models hallucinate more


OpenAI's reasoning AI models are getting more capable, but their tendency to hallucinate is getting worse, according to benchmark results.

“Our hypothesis is that the kind of reinforcement learning used for o-series models may amplify issues that are usually mitigated (but not fully erased) by standard post-training pipelines,” said Neil Chowdhury, a Transluce researcher and former OpenAI employee, in an email to TechCrunch.

Kian Katanforoosh, a Stanford adjunct professor and CEO of the upskilling startup Workera, told TechCrunch that his team is already testing o3 in their coding workflows, and that they’ve found it to be a step above the competition.

“Addressing hallucinations across all our models is an ongoing area of research, and we’re continually working to improve their accuracy and reliability,” said OpenAI spokesperson Niko Felix in an email to TechCrunch.


Read this on TechCrunch

Read more on:


OpenAI

Related news:


OpenAI details ChatGPT-o3, o4-mini, o4-mini-high usage limits


OpenAI pursued Cursor maker before entering into talks to buy Windsurf for $3B


Windsurf: OpenAI’s potential $3B bet to drive the ‘vibe coding’ movement