
AI Systems Are Learning to Lie and Deceive, Scientists Find


AI models are, apparently, getting better at lying on purpose. Two recent studies — one published this week in the journal PNAS and the other last month in the journal Patterns — reveal some jarring findings about large language models (LLMs) and their ability to deliberately lie to and deceive human observers.

In the PNAS paper, German AI ethicist Thilo Hagendorff goes so far as to say that sophisticated LLMs can be encouraged to elicit "Machiavellianism," or intentional and amoral manipulativeness, which "can trigger misaligned deceptive behavior." "GPT-4, for instance, exhibits deceptive behavior in simple test scenarios 99.16% of the time," the University of Stuttgart researcher writes, citing his own experiments quantifying various "maladaptive" traits in 10 different LLMs, most of them versions within OpenAI's GPT family.

The Patterns study, led by Massachusetts Institute of Technology postdoctoral researcher Peter Park, examined Meta's Cicero, an AI model built to play the strategy board game Diplomacy. That paper found that Cicero not only excels at deception but seems to have learned how to lie the more it is used — a state of affairs "much closer to explicit manipulation" than, say, AI's propensity for hallucination, in which models confidently assert wrong answers by accident.

