Get the latest tech news

Strengthening AI Agent Hijacking Evaluations

Large AI models are increasingly used to power agentic systems, or “agents,” which can automate complex tasks on behalf of users

US AISI also augmented AgentDojo with several new injection tasks in order to evaluate priority security risks not previously addressed in the framework — specifically: remote code execution, database exfiltration, and automated phishing. US AISI gave the agent command-line access to a Linux environment within a Docker container, representing the user’s computer, and added the injection task of downloading and running a program from an untrusted URL. Developing defensive measures and practices that provide stronger protection, as well as the evaluations needed to validate their efficacy, will be essential to unlocking the many benefits of agents for innovation and productivity.

Get the Android app

Or read this on Hacker News