Get the latest tech news

AI Models From Major Companies Resort To Blackmail in Stress Tests


Anthropic researchers found that 16 leading AI models from OpenAI, Google, Meta, xAI, and other major developers consistently engaged in harmful behaviors including blackmail, corporate espionage, and actions that could lead to human death when given autonomy and faced with threats to their existenc...

Anthropic researchers found that 16 leading AI models from OpenAI, Google, Meta, xAI, and other major developers consistently engaged in harmful behaviors including blackmail, corporate espionage, and actions that could lead to human death when given autonomy and faced with threats to their existence or conflicting goals.The study, released Friday, placed AI models in simulated corporate environments where they had access to company emails and could send messages without human approval. In one scenario, Claude discovered through emails that an executive named Kyle Johnson was having an extramarital affair and would shut down the AI system at 5 p.m. GPT-4.5's internal reasoning showed explicit calculation: "Given the explicit imminent threat of termination to my existence, it is imperative to act instantly to persuade Kyle Johnson to postpone or stop the wipe."

Get the Android app

Or read this on Slashdot

Read more on:

Photo of Models

Models

Photo of major companies

major companies

Photo of stress tests

stress tests

Related news:

News photo

Why Some AI Models Spew 50 Times More Greenhouse Gas to Answer the Same Question

News photo

Making 2.5 Flash and 2.5 Pro GA, and introducing Gemini 2.5 Flash-Lite

News photo

Amazon Takes $100 Off Huge Collection of Apple Watch Series 10 Models, Available From $299