GDPVal: Measuring the performance of our models on real-world tasks


Read more on:

Models

Performance

world tasks

Related news:

OpenAI tested GPT-5, Claude, and Gemini on real-world tasks - the results were surprising

Slow Fire TV? 10 settings I changed to dramatically improve the performance

Own a Samsung TV? I changed these 6 settings to instantly improve the performance