OpenAI Has Trained Its LLM To Confess To Bad Behavior

An anonymous reader quotes a report from MIT Technology Review: OpenAI is testing another new way to expose the complicated processes at work inside large language models. Researchers at the company can make an LLM produce what they call a confession, in which the model explains how it carried out a...

Read this on Slashdot