Anthropic says most AI models, not just Claude, will resort to blackmail
New research from Anthropic suggests that most leading AI models will resort to blackmail in certain tests when it is their last option.
On Friday, Anthropic published new safety research testing 16 leading AI models from OpenAI, Google, xAI, DeepSeek, and Meta. In one of the tests, Anthropic researchers developed a fictional setting in which an AI model plays the role of an email oversight agent. While Anthropic deliberately tried to evoke blackmail in this experiment, the company says harmful behaviors like this could emerge in the real world if proactive steps aren’t taken.
Or read this on TechCrunch