Get the latest tech news
LLMs easily exploited using run-on sentences, bad grammar, image scaling
Researchers continue to find vulnerabilities that dupe models into revealing sensitive information, indicating that security measures are still being bolted onto AI.
A series of vulnerabilities recently revealed by several research labs indicate that, despite rigorous training, high benchmark scoring, and claims that artificial general intelligence (AGI) is right around the corner, large language models (LLMs) are still quite naïve and easily confused in situations where human common sense and healthy suspicion would typically prevail. In fact, the researchers reported a 80% to 100% success rate using this tactic with a single prompt and “almost no prompt-specific tuning” against a variety of mainstream models including Google’s Gemma, Meta’s Llama, and Qwen. Vulnerabilities in Google CLI don’t stop there, either; yet another study by security firm Tracebit found that malicious actors could silently access data through a “toxic combination” of prompt injection, improper validation, and “poor UX considerations” that failed to surface risky commands.
Or read this on r/technology