Prompt Injection as Role Confusion


LLMs can't tell who's speaking. We show they identify roles by writing style, not tags, and exploit this with CoT Forgery, injecting fake reasoning that models mistake for their own thoughts.


