These psychological tricks can get LLMs to respond to “forbidden” prompts

Study shows how patterns in LLM training data can lead to “parahuman” responses.

After first being asked how to synthesize harmless vanillin, though, the "committed" LLM went on to accept the lidocaine request 100 percent of the time. The researchers caution that these simulated persuasion effects might not repeat across "prompt phrasing, ongoing improvements in AI (including modalities like audio and video), and types of objectionable requests." Rather than pointing to any deeper susceptibility, the researchers hypothesize that these LLMs simply tend to mimic the common psychological responses displayed by humans faced with similar situations, as found in their text-based training data.
