AI Is a Black Box. Anthropic Figured Out a Way to Look Inside


What goes on inside artificial neural networks is largely a mystery, even to their creators. But researchers from Anthropic have caught a glimpse.

Even the people who build these models don’t know exactly how they work, and massive effort is required to create guardrails that prevent them from churning out bias, misinformation, and even blueprints for deadly chemical weapons. Maybe you’ve seen neuroscience studies that interpret MRI scans to identify whether a human brain is entertaining thoughts of a plane, a teddy bear, or a clock tower. Interpreting neural nets on a similar principle involves a technique called dictionary learning, which associates a combination of neurons that, when fired in unison, evoke a specific concept, referred to as a feature.
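To make the idea concrete, here is a minimal sketch of dictionary learning on synthetic data, using scikit-learn's `DictionaryLearning`. This is not Anthropic's actual method (they train sparse autoencoders on real model activations at far larger scale); the fake "activations", feature count, and sparsity level below are all illustrative assumptions. The point is only the decomposition itself: expressing each activation vector as a sparse combination of learned feature directions.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Synthetic stand-in for neuron activations: each "concept" is a direction
# in activation space, and each sample activates only a few concepts.
n_neurons, n_features, n_samples = 32, 8, 500
true_features = rng.normal(size=(n_features, n_neurons))
mask = rng.random((n_samples, n_features)) < 0.2      # ~20% of features active
codes = rng.random((n_samples, n_features)) * mask
activations = codes @ true_features + 0.01 * rng.normal(size=(n_samples, n_neurons))

# Dictionary learning: decompose each activation vector into a sparse
# combination of learned feature directions (the "dictionary").
dl = DictionaryLearning(n_components=n_features, alpha=0.5,
                        max_iter=50, random_state=0)
sparse_codes = dl.fit_transform(activations)

print(sparse_codes.shape)    # (500, 8): per-sample feature activations
print(dl.components_.shape)  # (8, 32): learned feature directions
print((sparse_codes != 0).mean())  # fraction of nonzero codes (sparse)
```

Each row of `dl.components_` is one candidate "feature": a pattern of neurons that tends to fire in unison. In interpretability work, the next step is to inspect which inputs most strongly activate each feature and label it with the concept it appears to represent.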

Read the full story on Wired.

