AI Agents Are Getting Better at Writing Code—and Hacking It as Well

One of the best bug-hunters in the world is an AI tool called Xbow, just one of many signs of the coming age of cybersecurity automation.

The UC Berkeley team tested conventional frontier AI models from OpenAI, Google, and Anthropic, as well as open-source offerings from Meta, DeepSeek, and Alibaba, combined with several agents for finding bugs, including OpenHands, Cybench, and EnIGMA. They then fed descriptions of known vulnerabilities to the cybersecurity agents powered by frontier AI models to see whether the agents could identify the same flaws for themselves by analyzing new codebases, running tests, and crafting proof-of-concept exploits. The work adds to growing evidence that AI can automate the discovery of zero-day vulnerabilities, which are potentially dangerous (and valuable) because they may provide a way to hack live systems.

Read this on Wired