AI models can acquire backdoors from surprisingly few malicious documents
Anthropic's open-source safety tool found AI models whistleblowing - in all the wrong places
AI models tend to flatter users, and that praise leaves people more convinced they're right and less willing to resolve conflicts, recent research suggests