MIT researchers advance automated interpretability in AI models


MAIA (Multimodal Automated Interpretability Agent), developed at MIT CSAIL, is a multimodal agent for neural network interpretability tasks. It uses a vision-language model as its backbone and equips that model with tools for running experiments on other AI systems.
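
The researchers' actual implementation isn't reproduced here, but the pattern this describes, a vision-language backbone that calls experiment tools in a loop, can be sketched in a few lines. Everything below (the vlm stub, the tool names, the TOOL: convention) is a hypothetical placeholder for illustration, not MAIA's real API:

```python
# Minimal sketch of a "VLM backbone plus experiment tools" agent.
# All names here (vlm, show_exemplars, edit_image, the TOOL: protocol)
# are hypothetical placeholders, not MAIA's actual interface.

def vlm(transcript: str) -> str:
    """Stand-in for the vision-language model backbone.
    Returns a canned tool request, then a canned conclusion."""
    if "->" in transcript:  # a tool result is already in the transcript
        return "Unit 42 appears selective for images of dog ears."
    return "TOOL:show_exemplars unit=42"

def show_exemplars(arg: str) -> str:
    """Stub tool: report the dataset images that most activate a unit."""
    return f"{arg}: top-activating images all contain dog ears"

def edit_image(arg: str) -> str:
    """Stub tool: report the result of an image-editing intervention."""
    return f"edited image per instruction: {arg}"

TOOLS = {"show_exemplars": show_exemplars, "edit_image": edit_image}

def run_agent(task: str, max_steps: int = 5) -> str:
    """Alternate between backbone reasoning and tool calls."""
    transcript = task
    for _ in range(max_steps):
        reply = vlm(transcript)
        if not reply.startswith("TOOL:"):
            return reply  # the backbone produced a final answer
        name, _, arg = reply[len("TOOL:"):].partition(" ")
        transcript += f"\n{reply}\n-> {TOOLS[name](arg)}"
    return transcript  # step budget exhausted; return the raw transcript

print(run_agent("Describe what unit 42 in layer conv4 responds to."))
```

The point of the pattern is that the backbone chooses its next experiment based on the results so far, rather than producing a label in a single shot.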

“MAIA can generate hypotheses, design experiments to test them, and refine its understanding through iterative analysis,” says Tamar Rott Shaham, an MIT electrical engineering and computer science (EECS) postdoc at CSAIL and co-author of a new paper about the research.

The automated agent is demonstrated on three key tasks: it labels individual components inside vision models and describes the visual concepts that activate them; it cleans up image classifiers by removing irrelevant features, making them more robust to new situations; and it hunts for hidden biases in AI systems, helping to uncover potential fairness issues in their outputs.

Modern models contain far too many internal components for any human to examine by hand. “MAIA helps to bridge this by developing AI agents that can automatically analyze these neurons and report distilled findings back to humans in a digestible way,” says Jacob Steinhardt, assistant professor at the University of California at Berkeley, who wasn’t involved in the research.
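
Rott Shaham's description of that loop, hypothesize, experiment, refine, can also be sketched concretely. As an illustration only, assuming a text-to-image tool for synthesizing test inputs and a probe on the target unit (both stubbed below, neither being MAIA's actual code), the neuron-labeling task might look like this:

```python
# Sketch of a hypothesize -> experiment -> refine loop for labeling a
# neuron. synthesize_image() and unit_activation() are stand-ins for
# tools like text-to-image generation and probing the target network;
# nothing here is MAIA's actual implementation.
import random

random.seed(0)

def synthesize_image(prompt: str) -> str:
    """Stub: pretend to generate a test image from a text prompt."""
    return f"<image: {prompt}>"

def unit_activation(image: str) -> float:
    """Stub: pretend to measure how strongly the target unit fires."""
    return 0.9 if "dog" in image else random.uniform(0.0, 0.3)

def label_unit(hypotheses: list[str], threshold: float = 0.5) -> str:
    """Test each hypothesized concept against a control image, keeping
    only hypotheses the unit actually discriminates."""
    for concept in hypotheses:
        positive = unit_activation(synthesize_image(f"a photo of a {concept}"))
        control = unit_activation(synthesize_image("a plain background"))
        if positive - control > threshold:
            return f"unit is selective for: {concept}"
    return "no hypothesis survived; generate new hypotheses and retry"

print(label_unit(["dog", "car", "tree"]))
```

The control image guards against hypotheses that any input would appear to confirm; an agent whose hypotheses all fail would loop back and propose new ones.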

Related news:

Qualcomm makes its AI models available to app developers

OpenAI used a game to help AI models explain themselves better

The Morning After: AI models from Apple, NVIDIA and more were reportedly trained on YouTube videos