Is AI really trying to escape human control and blackmail people?


Opinion: Theatrical testing scenarios explain why AI models produce alarming outputs—and why we fall for it.

Even someone publicly well known for deep concern about AI's hypothetical threat to humanity acknowledges that these behaviors emerged only in highly contrived test scenarios. When an AI model produces outputs that appear to "refuse" shutdown or "attempt" blackmail, it is responding to inputs in ways that reflect its training, which humans designed and implemented. If a computer program produces outputs that appear to blackmail you or refuse safety shutdowns, it is not acting out of fear or a drive for self-preservation; it is demonstrating the risks of deploying poorly understood, unreliable systems.

Read this on Ars Technica
