A New Trick Could Block the Misuse of Open Source AI


Researchers have developed a way to tamperproof open source large language models to prevent them from being coaxed into, say, explaining how to make a bomb.

When Meta released its large language model Llama 3 for free this April, it took outside developers just a couple of days to create a version without the safety restrictions that prevent it from spouting hateful jokes, offering instructions for cooking meth, or misbehaving in other ways. “Terrorists and rogue states are going to use these models,” Mantas Mazeika, a Center for AI Safety researcher who worked on the project as a PhD student at the University of Illinois Urbana-Champaign, tells WIRED. A report released this week by the National Telecommunications and Information Administration, a body within the US Commerce Department, “recommends the US government develop new capabilities to monitor for potential risks, but refrain from immediately restricting the wide availability of open model weights in the largest AI systems.”


