This website lets you blind-test GPT-5 vs. GPT-4o—and the results may surprise you


Take this blind test to discover whether you truly prefer OpenAI's GPT-5 or the older GPT-4o—without knowing which model you're using.

GPT-5 achieves 94.6% accuracy on the AIME 2025 mathematics test compared with GPT-4o’s 71%, scores 74.9% on real-world coding benchmarks versus 30.8% for its predecessor, and shows dramatically reduced hallucination rates, with 80% fewer factual errors when using its reasoning mode.

In response to the backlash, OpenAI announced it would make GPT-5 “warmer and friendlier” while introducing four new preset personalities (Cynic, Robot, Listener, and Nerd) designed to give users more control over their AI interactions. The company faces a delicate balance: too much personality risks the sycophancy problems that plagued GPT-4o, while too little alienates users who had formed genuine attachments to their AI companions.

Read the full story on VentureBeat.

Related news:

I retested GPT-5's coding skills using OpenAI's guidance - and now I trust it even less

A German ISP changed their DNS to block my website

MCP-Universe benchmark shows GPT-5 fails more than half of real-world orchestration tasks