I compared Claude Opus 4.8 with 4.7 in a 10-round honesty test - and a legal prompt broke it

I tested Opus 4.8 against 4.7 using coding, medical, finance, and legal traps, then cross-checked the results with multiple AIs.

Or read this on ZDNet

Read more on: Claude Opus 4.8, legal test, honesty traps
