This Tool Probes Frontier AI Models for Lapses in Intelligence

A new platform from data training company Scale AI will let artificial intelligence developers find their models’ weak spots.

In recent months, Scale has contributed to the development of several new benchmarks designed to push AI models to become smarter and to scrutinize more carefully how they might misbehave. The company says its new tool offers a more comprehensive picture by combining many different benchmarks, and that it can be used to devise custom tests of a model’s abilities, such as probing its reasoning in different languages. In February, the US National Institute of Standards and Technology announced that Scale would help it develop methodologies for testing models to ensure they are safe and trustworthy.

Read this on Wired