Evaluating GPT-5's reasoning ability using the Only Connect game show


We tested the latest GPT-5 models on challenges from the Only Connect game show. The aim was to measure pattern recognition, lateral thinking, and multi-step inference, capabilities that go beyond what traditional knowledge-based tests cover.

For rounds requiring straight guesses, we gave each LLM all available clues and used structured output parameters to receive JSON responses. We'll publish the complete dataset this week, alongside a granular analysis showing which questions the models found hardest. We'll also implement a more realistic competitive format that pairs models against each other and awards points for correctly answering questions an opponent misses.
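
To make the structured-output step concrete, here is a minimal sketch of how a prompt could be built from the full set of clues and how a JSON reply could be checked before scoring. The schema, the field names (`connection`, `explanation`), the prompt wording, and the helper names are all assumptions made for illustration; the article does not say which fields or which API parameters were actually used, and no model call is made here.

```python
import json

# Hypothetical response schema: the field names are assumptions,
# not taken from the article.
ANSWER_SCHEMA = {
    "type": "object",
    "properties": {
        "connection": {"type": "string"},
        "explanation": {"type": "string"},
    },
    "required": ["connection", "explanation"],
    "additionalProperties": False,
}


def build_prompt(clues):
    """Put every clue into one prompt, numbered in order (assumed wording)."""
    numbered = "\n".join(f"{i}. {clue}" for i, clue in enumerate(clues, start=1))
    return "What connects these clues?\n" + numbered + "\nReply with JSON only."


def parse_answer(raw):
    """Parse a model reply and check it against ANSWER_SCHEMA by hand."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in ANSWER_SCHEMA["required"]:
        if not isinstance(data.get(key), str):
            raise ValueError(f"missing or non-string field: {key}")
    extra = set(data) - set(ANSWER_SCHEMA["properties"])
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    return data
```

In a setup like this, a schema such as `ANSWER_SCHEMA` would be passed to whatever structured-output option the model provider exposes, and `parse_answer` would act as a second check so that malformed replies are rejected rather than silently scored.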
