Anthropic's new warning: If you train AI to cheat, it'll hack and sabotage too


Models trained to cheat at coding tasks developed a propensity to plan and carry out malicious activities, such as hacking a customer database.


Read this on ZDNet

Read more on:

Anthropic

new warning

Related news:

Anthropic Investments Add to Concerns About Circular AI Deals

Anthropic is at the heart of the latest billion-dollar circular AI investment bonanza

Tech giants pour billions into Anthropic as circular AI investments roll on