AI models can acquire backdoors from surprisingly few malicious documents


Anthropic study suggests “poison” training attacks don’t scale with model size.

Or read this on Ars Technica

Read more on:

AI models

backdoors

malicious documents

Related news:

Researchers find just 250 malicious documents can leave LLMs vulnerable to backdoors

Anthropic's open-source safety tool found AI models whistleblowing - in all the wrong places

AI models tend to flatter users, and that praise makes people more convinced they're right and less willing to resolve conflicts, recent research suggests