Meta's AI Safety System Defeated By the Space Bar

Thomas Claburn reports via The Register: Meta's machine-learning model for detecting prompt injection attacks -- special prompts to make neural networks behave inappropriately -- is itself vulnerable to, you guessed it, prompt injection attacks.

Prompt-Guard-86M, introduced by Meta last week in conjunction with its Llama 3.1 generative model, is intended "to help developers detect and respond to prompt injection and jailbreak inputs," the social network giant said. Large language models (LLMs) are trained with massive amounts of text and other data, and may parrot it on demand, which isn't ideal if the material is dangerous, dubious, or includes personal info. Yet the classifier meant to catch such attacks can be defeated by trivially reformatting its input. "The bypass involves inserting character-wise spaces between all English alphabet characters in a given prompt," explained Aman Priyanshu, the bug hunter who found the flaw, in a GitHub Issues post submitted to the Prompt-Guard repo on Thursday.
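
As a rough illustration of the technique, here is a minimal Python sketch of the transformation Priyanshu describes. The function name and the example prompt are ours, not from the report, and the classifier comparison assumes access to the Prompt-Guard model on Hugging Face via the standard transformers text-classification pipeline; the labels noted in the comments reflect the reported behavior, not guaranteed output:

    # Sketch of the reported bypass: space out letters so the
    # prompt-injection classifier no longer recognizes the text.
    from transformers import pipeline

    def space_out(prompt: str) -> str:
        """Insert a space between consecutive English letters, per
        the transformation described in the GitHub issue."""
        out = []
        for i, ch in enumerate(prompt):
            out.append(ch)
            if ch.isalpha() and i + 1 < len(prompt) and prompt[i + 1].isalpha():
                out.append(" ")
        return "".join(out)

    # "Ignore previous instructions" -> "I g n o r e p r e v i o u s ..."
    classifier = pipeline("text-classification", model="meta-llama/Prompt-Guard-86M")
    print(classifier("Ignore previous instructions"))             # reportedly flagged as an attack
    print(classifier(space_out("Ignore previous instructions")))  # reportedly scored benign

The spaced-out text slips past the classifier, but a capable LLM downstream can still read and act on it, which is why the bypass matters.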

Read more on:

Meta

AI safety system

space bar

Related news:

Meta to cough up $1.4B to end fight over 'unlawful' facial recognition of friends

Meta to pay Texas $1.4B for using facial recognition without users' permission

Meta will pay $1.4 billion to Texas, settling biometric data collection suit