Inside the US Government's Unpublished Report on AI Safety


The US government conducted a groundbreaking study on frontier models—and never published the results.

At a computer security conference in Arlington, Virginia, last October, a few dozen AI researchers took part in a first-of-its-kind exercise in “red teaming,” or stress-testing, a cutting-edge language model and other artificial intelligence systems. The researchers discovered various tricks for getting the models and tools under test to jump their guardrails and generate misinformation, leak personal data, and help craft cybersecurity attacks. Yet the National Institute of Standards and Technology (NIST) never published a report detailing the exercise, which was completed toward the end of the Biden administration.


Read more on:

Biden administration

AI safety

unpublished report

Related news:

Chain of thought monitorability: A new and fragile opportunity for AI safety

Protesters accuse Google of violating its promises on AI safety: 'AI companies are less regulated than sandwich shops'

The OpenAI Files: Ex-staff claim profit greed betraying AI safety