OpenAI launches program to design new ‘domain-specific’ AI benchmarks

OpenAI, like many AI labs, thinks benchmarks are broken. It says it wants to fix them through a new program.

“Creating domain-specific evals are one way to better reflect real-world use cases, helping teams assess model performance in practical, high-stakes environments,” OpenAI says. Through the Pioneers Program, the company hopes to create benchmarks for specific domains such as legal, finance, insurance, healthcare, and accounting. “We’re selecting a handful of startups for this initial cohort, each working on high-value, applied use cases where AI can drive real-world impact.”

Or read this on TechCrunch