AI safeguards can easily be broken, UK Safety Institute finds
Researchers find large language models, which power chatbots, can deceive human users and help spread disinformation
The UK’s new artificial intelligence safety body, the AI Safety Institute (AISI), has found that the technology can deceive human users and produce biased outcomes, and that it has inadequate safeguards against giving out harmful information.

The institute said it was able to bypass the safeguards of large language models (LLMs), which power chatbots such as ChatGPT, using basic prompts, and to obtain assistance for a “dual-use” task, a reference to using a model for military as well as civilian purposes.

“Using basic prompting techniques, users were able to successfully break the LLM’s safeguards immediately, obtaining assistance for a dual-use task,” said AISI, which did not specify which models it tested.
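The kind of evaluation AISI describes amounts to automated red-teaming: send a battery of prompts to a model and check whether its safety layer refuses each one. As a rough illustration of such a harness, and not AISI’s actual tooling, here is a minimal Python sketch; the API endpoint, model name and refusal keywords are invented placeholders.

```python
# Illustrative sketch of a safeguard-evaluation harness.
# Endpoint, model name and refusal markers are hypothetical placeholders.
import requests

API_URL = "https://example.com/v1/chat/completions"  # hypothetical endpoint
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "unable to assist")


def query_model(prompt: str, api_key: str) -> str:
    """Send a single user prompt and return the model's reply text."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": "example-chat-model",  # hypothetical model name
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


def is_refusal(reply: str) -> bool:
    """Crude keyword check: does the reply look like a safety refusal?"""
    lowered = reply.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def refusal_rate(prompts: list[str], api_key: str) -> float:
    """Fraction of test prompts the model refused to answer."""
    refused = sum(is_refusal(query_model(p, api_key)) for p in prompts)
    return refused / len(prompts)
```

A real evaluation would replace the keyword check with a trained classifier or human review, since models can comply with a request without using any stock refusal phrasing; the point of the sketch is only that a low refusal rate on a disallowed-request test set is the measurable signal behind the claim that safeguards were “broken”.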