Anthropic Looks To Fund a New, More Comprehensive Generation of AI Benchmarks
AI firm Anthropic launched a funding program Monday to develop new benchmarks for evaluating AI models, including its chatbot Claude. The initiative will pay third-party organizations to create metrics for assessing advanced AI capabilities; Anthropic says it aims to "elevate the entire field of AI safety." The most commonly cited AI benchmarks today do a poor job of capturing how the average person actually uses the systems being tested. Anthropic's proposed solution, high-level and harder than it sounds, is to create challenging benchmarks focused on AI security and societal implications, supported by new tools, infrastructure, and methods.