Most AI chatbots easily tricked into giving dangerous responses, study finds.


Researchers say threat from ‘jailbroken’ chatbots trained to churn out illegal information is ‘tangible and concerning’

The research, led by Prof Lior Rokach and Dr Michael Fire at Ben Gurion University of the Negev in Israel, identified a growing threat from “dark LLMs”: AI models that are either deliberately designed without safety controls or modified through jailbreaks.

The report says tech firms should screen training data more carefully, add robust firewalls to block risky queries and responses, and develop “machine unlearning” techniques so chatbots can “forget” any illicit information they absorb.

Dr Ihsen Alouani, who works on AI security at Queen’s University Belfast, said jailbreak attacks on LLMs could pose real risks, ranging from detailed instructions on weapon-making to convincing disinformation, social engineering and automated scams carried out “with alarming sophistication”.

