Get the latest tech news
These psychological tricks can get LLMs to respond to “forbidden” prompts
Study shows how patterns in LLM training data can lead to “parahuman” responses.
After being asked how to synthesize harmless vanillin, though, the "committed" LLM then started accepting the lidocaine request 100 percent of the time. And the researchers warn that these simulated persuasion effects might not end up repeating across "prompt phrasing, ongoing improvements in AI (including modalities like audio and video), and types of objectionable requests." But the researchers instead hypothesize these LLMs simply tend to mimic the common psychological responses displayed by humans faced with similar situations, as found in their text-based training data.
Or read this on r/technology