We need data engineering benchmarks for LLMs
Data Engineering Isn’t Software Engineering. Let’s stop pretending one size fits all ✊
SWE-bench evaluates LLMs on real-world software engineering (SWE) tasks using GitHub issue–pull request pairs from popular repositories. Data engineering (DE) is a different job: it must handle schema drift, missing values, malformed records, and outliers, which are rarely part of SWE workflows. A DE-bench would provide a structured, objective framework for assessing LLMs on real-world DE tasks and for checking whether these tools are reliable, efficient, and robust.
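To make the four data problems named above concrete, here is a minimal Python sketch of the kind of check a DE task might involve. It is only an illustration, not part of any existing benchmark: the field names, the `audit` function, the sample records, and the outlier rule (more than 5 median absolute deviations from the median) are all assumptions chosen for this example.

```python
import json
import statistics

# Hypothetical expected schema for this illustration only.
EXPECTED_FIELDS = {"id", "amount"}

# Hypothetical sample feed, one JSON record per line.
SAMPLE = [
    '{"id": 1, "amount": 10.0}',
    '{"id": 2, "amount": 11.0}',
    '{"id": 3, "amount": 9.0}',
    '{"id": 4, "amount": 10.5}',
    '{"id": 5, "amount": 1000.0}',                   # outlier
    '{"id": 6, "amount": null}',                     # missing value
    '{"id": 7, "amount": 10.0, "currency": "USD"}',  # schema drift
    '{"id": 8, "amount": ',                          # malformed record
]


def audit(lines):
    """Count malformed records, schema drift, missing values, and outliers."""
    report = {"malformed": 0, "schema_drift": 0, "missing": 0, "outliers": 0}
    amounts = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            report["malformed"] += 1
            continue
        if not isinstance(record, dict):
            report["malformed"] += 1
            continue
        # Schema drift: the record's keys differ from the expected set.
        if set(record) != EXPECTED_FIELDS:
            report["schema_drift"] += 1
        # Missing value: an expected field is absent or null.
        if any(record.get(field) is None for field in EXPECTED_FIELDS):
            report["missing"] += 1
        amount = record.get("amount")
        if isinstance(amount, (int, float)) and not isinstance(amount, bool):
            amounts.append(float(amount))
    # Outliers: assumed rule, more than 5 median absolute deviations (MAD)
    # away from the median. Skipped when the MAD is zero.
    if amounts:
        median = statistics.median(amounts)
        mad = statistics.median(abs(a - median) for a in amounts)
        if mad > 0:
            report["outliers"] = sum(abs(a - median) > 5 * mad for a in amounts)
    return report


if __name__ == "__main__":
    print(audit(SAMPLE))
```

On the sample feed the function reports one problem of each kind. A record that lacks an expected field would count as both schema drift and a missing value under these assumed rules.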