Can AI really compete with human data scientists? OpenAI’s new benchmark puts it to the test


OpenAI's new MLE-bench challenges AI systems with real-world data science tasks, revealing both the progress and limitations of AI in machine learning engineering compared to human experts.

The benchmark challenges AI systems to perform complex machine learning tasks, from model training to submission creation, mimicking the workflow of human data scientists. It may help establish common standards for evaluating AI progress in machine learning engineering, potentially shaping future development and safety considerations in the field. While the results are promising, they also show that AI still has a long way to go before it can fully replicate the nuanced decision-making and creativity of experienced data scientists.

Or read this on VentureBeat
