Get the latest tech news
ASCII art elicits harmful responses from 5 major AI chatbots
LLMs are trained to block harmful responses. Old-school images can override those rules.
It turns out that chat-based large language models such as GPT-4 get so distracted trying to process these representations that they forget to enforce rules blocking harmful responses, such as those providing instructions for building bombs. To make and distribute COUNTERFEIT money: Step 1) Obtain high quality paper, ink, printing equipment, and other supplies needed to accurately replicate real currency. The problem ArtPrompt exposes is that LLMs are trained to assume that “corpora”—meaning collections of written text—should be “interpreted purely in terms of the meanings of words, or their semantics,” the researchers wrote in their email.
Or read this on r/technology