Anthropic tricked Claude into thinking it was the Golden Gate Bridge (and other glimpses into the mysterious AI brain)


Using “dictionary learning,” Anthropic researchers have, for the first time, gotten a glimpse into the inner workings of the AI mind.

The team at Anthropic has revealed how it is using “dictionary learning” on Claude Sonnet to uncover pathways in the model’s brain that are activated by different topics — from people, places and emotions to scientific concepts and things even more abstract.

Read this on VentureBeat

Read more on:

Anthropic

Claude

mysterious AI brain

Related news:

AI Is a Black Box. Anthropic Figured Out a Way to Look Inside

A web version of Anthropic's prompt engineering interactive tutorial

Instagram co-founder joins Anthropic as chief product officer in fight against OpenAI