Meta's AI Safety System Defeated By the Space Bar

Thomas Claburn reports via The Register: Meta's machine-learning model for detecting prompt injection attacks -- special prompts to make neural networks behave inappropriately -- is itself vulnerable to, you guessed it, prompt injection attacks.

Prompt-Guard-86M, introduced by Meta last week in conjunction with its Llama 3.1 generative model, is intended "to help developers detect and respond to prompt injection and jailbreak inputs," the social network giant said. Large language models (LLMs) are trained with massive amounts of text and other data, and may parrot it on demand, which isn't ideal if the material is dangerous, dubious, or includes personal info. Yet the classifier meant to catch such attacks can be defeated by trivially reformatting its input. "The bypass involves inserting character-wise spaces between all English alphabet characters in a given prompt," explained Aman Priyanshu, the bug hunter who found the flaw, in a GitHub Issues post submitted to the Prompt-Guard repo on Thursday.
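
As a rough illustration of the technique, here is a minimal Python sketch of the transformation Priyanshu describes. The function name and the example prompt are ours, not from the report, and the classifier comparison assumes access to the Prompt-Guard model on Hugging Face via the standard transformers text-classification pipeline; the labels noted in the comments reflect the reported behavior, not guaranteed output:

    # Sketch of the reported bypass: space out letters so the
    # prompt-injection classifier no longer recognizes the text.
    from transformers import pipeline

    def space_out(prompt: str) -> str:
        """Insert a space between consecutive English letters, per
        the transformation described in the GitHub issue."""
        out = []
        for i, ch in enumerate(prompt):
            out.append(ch)
            if ch.isalpha() and i + 1 < len(prompt) and prompt[i + 1].isalpha():
                out.append(" ")
        return "".join(out)

    # "Ignore previous instructions" -> "I g n o r e p r e v i o u s ..."
    classifier = pipeline("text-classification", model="meta-llama/Prompt-Guard-86M")
    print(classifier("Ignore previous instructions"))             # reportedly flagged as an attack
    print(classifier(space_out("Ignore previous instructions")))  # reportedly scored benign

The spaced-out text slips past the classifier, but a capable LLM downstream can still read and act on it, which is why the bypass matters.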

Read more on:

Meta

AI safety system

space bar

Related news:

Meta to cough up $1.4B to end fight over 'unlawful' facial recognition of friends

Meta to pay Texas $1.4B for using facial recognition without users' permission

Meta will pay $1.4 billion to Texas, settling biometric data collection suit