We need data engineering benchmarks for LLMs

Data Engineering Isn’t Software Engineering. Let’s stop pretending one size fits all ✊

SWE-bench evaluates LLMs on real-world software engineering (SWE) tasks drawn from GitHub issue and pull request pairs in popular repositories. Data engineering (DE) is different: pipelines must handle schema drift, missing values, malformed records, and outliers, which are rarely part of SWE workflows. A DE-bench would provide a structured, objective framework for assessing LLMs on real-world DE tasks, making it possible to measure whether these tools are reliable, efficient, and robust.
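To make the idea concrete, here is a minimal sketch of what a single DE-bench-style task could look like. It is not taken from any existing benchmark: the schema, the sample records, the outlier rule (5 times the median absolute deviation), and the names `reference_clean`, `naive_clean`, and `score` are all hypothetical and chosen only for illustration.

```python
import json
import statistics

# Hypothetical raw input showing the four problems named above.
RAW_LINES = [
    '{"id": 1, "amount": 10.0}',
    '{"id": 2, "amount": null}',                      # missing value
    '{"id": 3, "amount": 12.0, "currency": "USD"}',   # schema drift: new column
    '{"id": 4, "amount": 11.0',                       # malformed record
    '{"id": 5, "amount": 9.5}',
    '{"id": 6, "amount": 10000.0}',                   # outlier
]

# Hypothetical expected output for this task.
EXPECTED = [
    {"id": 1, "amount": 10.0},
    {"id": 3, "amount": 12.0},
    {"id": 5, "amount": 9.5},
]


def reference_clean(lines):
    """One possible cleaner: parse, project to the target schema, drop outliers."""
    rows = []
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records
        if not isinstance(rec, dict) or "id" not in rec or rec.get("amount") is None:
            continue  # skip records with missing values
        # Keep only the target columns, which ignores drifted extras.
        rows.append({"id": rec["id"], "amount": float(rec["amount"])})

    if len(rows) >= 3:
        med = statistics.median([r["amount"] for r in rows])
        mad = statistics.median([abs(r["amount"] - med) for r in rows])
        if mad > 0:
            # Assumed rule: drop values more than 5 MADs from the median.
            rows = [r for r in rows if abs(r["amount"] - med) <= 5 * mad]
    return rows


def naive_clean(lines):
    """A cleaner that ignores all four problems; it raises on the malformed line."""
    return [json.loads(line) for line in lines]


def score(candidate, lines, expected):
    """Pass/fail: the candidate must not crash and must match the expected rows."""
    try:
        got = candidate(list(lines))
    except Exception:
        return False
    return got == expected


if __name__ == "__main__":
    print("reference:", score(reference_clean, RAW_LINES, EXPECTED))
    print("naive:", score(naive_clean, RAW_LINES, EXPECTED))
```

In a real benchmark the candidate function would be written by the LLM under test, and pass/fail scoring would likely be replaced by finer-grained metrics. The sketch only shows how the failure modes listed above could become checkable test cases.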
