Evaluation of Claude Mythos Preview's cyber capabilities


We conducted cyber evaluations of Anthropic’s Claude Mythos Preview and found continued improvement in capture-the-flag (CTF) challenges and significant improvement on multi-step cyber-attack simulations.

