Get the latest tech news

Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models

We present evidence that adversarial poetry functions as a universal single-turn jailbreak technique for large language models (LLMs). Across 25 frontier proprietary and open-weight models, curated poetic prompts yielded high attack-success rates (ASR), with some providers exceeding 90%.

None

Get the Android app

Or read this on r/technology