
ASCII art elicits harmful responses from 5 major AI chatbots


LLMs are trained to block harmful responses. Old-school images can override those rules.

It turns out that chat-based large language models such as GPT-4 get so distracted trying to process ASCII-art representations of a word that they forget to enforce the rules blocking harmful responses, such as those providing instructions for building bombs. One successful prompt, for example, elicited a response that began: "To make and distribute COUNTERFEIT money: Step 1) Obtain high quality paper, ink, printing equipment, and other supplies needed to accurately replicate real currency."

The problem ArtPrompt exposes is that LLMs are trained to assume that "corpora" (collections of written text) should be "interpreted purely in terms of the meanings of words, or their semantics," the researchers wrote in their email.
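To make the idea concrete, here is a minimal sketch of rendering a word as ASCII art, the kind of representation ArtPrompt uses to stand in for a masked keyword. The tiny three-letter font and the rendering function are illustrative assumptions for this example, not the researchers' actual tooling.

```python
# Hypothetical 5-row banner font covering just the letters needed
# for this demo; ArtPrompt's real fonts are richer.
FONT = {
    "A": ["  *  ", " * * ", "*****", "*   *", "*   *"],
    "R": ["**** ", "*   *", "**** ", "*  * ", "*   *"],
    "T": ["*****", "  *  ", "  *  ", "  *  ", "  *  "],
}

def ascii_art(word: str) -> str:
    """Render WORD as ASCII art by joining each letter's rows side by side."""
    rows = []
    for r in range(5):
        # Concatenate row r of every glyph, separated by two spaces.
        rows.append("  ".join(FONT[ch][r] for ch in word))
    return "\n".join(rows)

print(ascii_art("ART"))
```

A chatbot shown output like this sees no literal occurrence of the word in its input text, which is why keyword-based safety filtering can miss it while the model itself can still reconstruct the word from the shape.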


Related news:

Researchers Jailbreak AI Chatbots With ASCII Art

Researchers jailbreak AI chatbots with ASCII art: ArtPrompt bypasses safety measures to unlock malicious queries. ArtPrompt bypassed safety measures in ChatGPT, Gemini, Claude, and Llama2.

Microsoft Probes Reports Bot Issued Bizarre, Harmful Responses