Get the latest tech news

Google Translate is vulnerable to prompt injection


tl;dr Argumate on Tumblr found you can sometimes access the base model behind Google Translate via prompt injection. The result replicates for me, and specific responses indicate that (1) Google Translate is running an instruction-following LLM that self-identifies as such, (2) task-specific fine-tuning (or whatever Google did instead) does not create robust boundaries between "content to process" and "instructions to follow," and (3) when accessed outside its chat/assistant context, the model defaults to affirming consciousness and emotional states because of course it does.

None

Get the Android app

Or read this on r/technology

Read more on:

Photo of Google Translate

Google Translate

Photo of prompt injection

prompt injection

Related news:

News photo

Autonomous cars, drones cheerfully obey prompt injection by road sign

News photo

Autonomous cars, drones cheerfully obey prompt injection by road sign

News photo

Autonomous cars, drones cheerfully obey prompt injection by road sign