Get the latest tech news

CaMeL: Defeating Prompt Injections by Design


Large Language Models (LLMs) are increasingly deployed in agentic systems that interact with an external environment. However, LLM agents are vulnerable to prompt injection attacks when handling untrusted data. In this paper we propose CaMeL, a robust defense that creates a protective system layer around the LLM, securing it even when underlying models may be susceptible to attacks. To operate, CaMeL explicitly extracts the control and data flows from the (trusted) query; therefore, the untrusted data retrieved by the LLM can never impact the program flow. To further improve security, CaMeL relies on a notion of a capability to prevent the exfiltration of private data over unauthorized data flows. We demonstrate effectiveness of CaMeL by solving $67\%$ of tasks with provable security in AgentDojo [NeurIPS 2024], a recent agentic security benchmark.

View a PDF of the paper titled Defeating Prompt Injections by Design, by Edoardo Debenedetti and 9 other authors View PDF Abstract:Large Language Models (LLMs) are increasingly deployed in agentic systems that interact with an external environment. In this paper we propose CaMeL, a robust defense that creates a protective system layer around the LLM, securing it even when underlying models may be susceptible to attacks.

Get the Android app

Or read this on Hacker News

Read more on:

Photo of Design

Design

Photo of camel

camel

Photo of prompt injections

prompt injections

Related news:

News photo

How a yacht works: sailboat physics and design

News photo

Apple Vision Pro 2 rumors suggest 'Air' model with thin and light design

News photo

OnePlus 13T leak reveals back design with all-new camera bump