OpenAI quietly funded independent math benchmark before setting record with o3


OpenAI's role in funding FrontierMath, a leading AI math benchmark, came to light only when the company announced its record-breaking performance on the test. Now the benchmark's developer, Epoch AI, acknowledges that it should have been more transparent about the relationship.

"For future collaborations, we will strive to improve transparency wherever possible, ensuring contributors have clearer information about funding sources, data access, and usage purposes at the outset," writes Tamay Besiroglu of Epoch AI. The lack of transparency does not in itself undermine the benchmark's quality or significance. Still, such an important tool for AI evaluation deserved complete openness from the start: mathematical reasoning is a major weakness of language models, so improved performance on it could signal a breakthrough. OpenAI had early access to many of the benchmark's tasks and solutions. There was a verbal agreement that this data would not be used for training, but the undisclosed nature of the arrangement remains problematic.
