OpenAI has trained its LLM to confess to bad behavior


Large language models often lie and cheat. We can’t stop that—but we can make them own up.

Read more on:

OpenAI

LLM

bad behavior

Related news:

Enthusiasm for OpenAI’s Sora Fades After Initial Creative Burst

OpenAI Calls a ‘Code Red’ + Which Model Should I Use? + The Hard Fork Review of Slop

'Godfather of AI' Geoffrey Hinton says Google is 'beginning to overtake' OpenAI: 'My guess is Google will win'