Teaching Claude Why


Last year, we released a case study on agentic misalignment. In experimental scenarios, we showed that AI models from many different developers sometimes took egregiously misaligned actions when they encountered (fictional) ethical dilemmas.

Related news:

Natural Language Autoencoders: Turning Claude's Thoughts into Text

Claude hitches ride on SpaceX's datacenter capacity

Anthropic wants Claude to play with money, unleashes finance agents