
LLMs easily exploited using run-on sentences, bad grammar, image scaling


Researchers continue to find vulnerabilities that dupe models into revealing sensitive information, indicating that security measures are still being bolted onto AI.

A series of vulnerabilities recently revealed by several research labs indicates that, despite rigorous training, high benchmark scores, and claims that artificial general intelligence (AGI) is right around the corner, large language models (LLMs) remain quite naïve and easily confused in situations where human common sense and healthy suspicion would typically prevail. One team found that simply phrasing a malicious request as a single, long run-on sentence riddled with bad grammar was enough to slip past guardrails: the researchers reported an 80% to 100% success rate using this tactic with a single prompt and “almost no prompt-specific tuning” against a variety of mainstream models, including Google’s Gemma, Meta’s Llama, and Qwen. The vulnerabilities don’t stop there, either; a separate study by security firm Tracebit found that malicious actors could silently access data through Google’s Gemini CLI via a “toxic combination” of prompt injection, improper validation, and “poor UX considerations” that failed to surface risky commands.
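The image-scaling angle in the headline targets a mundane preprocessing step: many multimodal pipelines downscale uploaded images before the model ever sees them, and an attacker-crafted high-resolution image can be built so that instructions invisible at full size become readable at the reduced resolution. The sketch below is only an illustration of that resampling step, not any vendor's actual pipeline; the library (Pillow), target size, and filenames are assumptions.

```python
from PIL import Image

# Minimal sketch of the downscaling step that image-scaling attacks exploit.
# Assumption: the serving pipeline resizes uploads to a fixed resolution with
# bicubic resampling before inference; size and filenames are illustrative.

TARGET_SIZE = (512, 512)  # assumed model input resolution

def preprocess(path: str) -> Image.Image:
    """Downscale an uploaded image the way a multimodal pipeline might."""
    img = Image.open(path).convert("RGB")
    # Bicubic interpolation averages neighboring pixels, so a carefully
    # constructed image can hide text that only resolves at this smaller size.
    return img.resize(TARGET_SIZE, resample=Image.Resampling.BICUBIC)

if __name__ == "__main__":
    small = preprocess("upload.png")          # hypothetical attacker upload
    small.save("what_the_model_sees.png")     # inspect the downscaled result
```

Comparing the original upload with the saved downscaled copy is the quickest way to see why defenses that only inspect the full-resolution image can miss what the model is actually shown.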
