Can AI really compete with human data scientists? OpenAI’s new benchmark puts it to the test

OpenAI's new MLE-bench challenges AI systems with real-world data science tasks, revealing both the progress and limitations of AI in machine learning engineering compared to human experts.

MLE-bench asks AI systems to carry out complex machine learning tasks, from model training to submission creation, mimicking the workflow of human data scientists. The benchmark may help establish common standards for evaluating AI progress in machine learning engineering, potentially shaping future development and safety considerations in the field. While the results are promising, they also show that AI still has a long way to go before it can fully replicate the nuanced decision-making and creativity of experienced data scientists.

Read this on VentureBeat