
AI benchmarks are a bad joke – and LLM makers are the ones laughing



Or read this on The Register

