Get the latest tech news

Failing to Understand the Exponential, Again

Posts and writings by Julian Schrittwieser

Long after the timing and scale of the coming global pandemic was obvious from extrapolating the exponential trends, politicians, journalists and most public commentators kept treating it as a remote possibility or a localized phenomenon. We can observe a clear exponential trend, with Sonnet 3.7 achieving the best performance by completing tasks up to an hour in length at 50% success rate. Fortunately for us, OpenAI also included other models in the evaluation, and we can see that Claude Opus 4.1 (released earlier than GPT-5) performs significantly better - ahead of the trend from the previous graph, and already almost matching industry expert (!)

Get the Android app

Or read this on Hacker News