Researchers claim breakthrough in fight against AI’s frustrating security hole


Prompt injections are the Achilles’ heel of AI assistants. Google offers a potential fix.

Google's proposed defense, known as CaMeL, treats language models as fundamentally untrusted components within a secure software framework, creating clear boundaries between user commands and potentially malicious content. We've watched the prompt-injection problem evolve since the GPT-3 era, when AI researchers like Riley Goodside first demonstrated how surprisingly easy it was to trick large language models (LLMs) into ignoring their guardrails. CaMeL splits the work between a privileged LLM (P-LLM), which plans actions based only on the user's request, and a quarantined LLM (Q-LLM), which parses untrusted content but has no tool access. To further prevent information leakage, the Q-LLM uses a special boolean flag ("have_enough_information") to signal whether it can fulfill a parsing request, rather than potentially returning manipulated text to the P-LLM if compromised.
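To make that flag mechanism concrete, here is a minimal sketch in Python of how a quarantined parsing step could be structured. The names (call_quarantined_llm, ParseResult, the stubbed extraction logic) are illustrative assumptions rather than CaMeL's actual API; the point is that the planner only ever sees a fixed schema and the boolean flag, never raw untrusted text.

```python
# Minimal sketch of a quarantined-LLM (Q-LLM) parsing step, assuming
# hypothetical names; this is not the paper's actual interface.
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParseResult:
    # Structured output only; raw untrusted text never flows back to the planner (P-LLM).
    have_enough_information: bool
    value: Optional[str] = None


def call_quarantined_llm(prompt: str, untrusted_text: str) -> ParseResult:
    """Placeholder for a real Q-LLM call: the model has no tool access and
    may only fill in this fixed schema."""
    # A real implementation would invoke an LLM with constrained decoding here.
    if "@" in untrusted_text:
        sender = untrusted_text.split()[-1].strip(".,")
        return ParseResult(have_enough_information=True, value=sender)
    return ParseResult(have_enough_information=False)


def extract_email_address(untrusted_email_body: str) -> Optional[str]:
    result = call_quarantined_llm(
        prompt="Extract the sender's email address.",
        untrusted_text=untrusted_email_body,
    )
    # The planner branches only on the boolean flag; if the Q-LLM was
    # manipulated, the worst outcome is a "not enough information" refusal.
    if not result.have_enough_information:
        return None
    return result.value


if __name__ == "__main__":
    print(extract_email_address("Please schedule the meeting. Reply to bob@example.com"))
```

Because the only channel back to the planner is this schema, instructions hidden in the email body have nowhere to go: a compromised parsing step can at worst decline to answer.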

Read more on: Fight, researchers, Breakthrough

Related news:

Risks To Children Playing Roblox 'Deeply Disturbing,' Say Researchers

Laser cooling breakthrough could make data centers much greener | While lasers are most often used to heat things up, they can also cool certain elements when precisely targeted at a tiny area

LLMs No Longer Require Powerful Servers: Researchers from MIT, KAUST, ISTA, and Yandex Introduce a New AI Approach to Rapidly Compress Large Language Models without a Significant Loss of Quality