OpenAI is training models to 'confess' when they lie - what it means for future AI


A new study made a version of GPT-5 Thinking admit its own misbehavior. But it's not a quick fix for bigger safety issues.

Or read this on ZDNet

Read more on:

OpenAI

training models

future AI

Related news:

OpenAI, NextDC Plan to Build $4.6 Billion Sydney Data Center

The 'truth serum' for AI: OpenAI’s new method for training models to confess their mistakes

OpenAI turns the screws on chatbots to get them to confess mischief