Is there a half-life for the success rates of AI agents?

Building on the recent empirical work of Kwa et al. (2025), I show that within their suite of research-engineering tasks the performance of AI agents on longer-duration tasks can be explained by an extremely simple mathematical model: a constant rate of failing during each minute a human would take to complete the task.
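To make the constant-failure-rate idea concrete, here is a minimal sketch in Python. It assumes the model takes the standard exponential-decay form, so that the chance of success halves every time the task length grows by one half-life. The function names and the 60-minute half-life are hypothetical, chosen only for illustration; they are not values from Kwa et al. (2025).

```python
import math


def hazard_rate(half_life_minutes: float) -> float:
    """Constant per-minute failure rate implied by a given half-life."""
    if half_life_minutes <= 0:
        raise ValueError("half_life_minutes must be positive")
    return math.log(2) / half_life_minutes


def success_probability(task_minutes: float, half_life_minutes: float) -> float:
    """Probability of success under a constant hazard rate.

    S(t) = exp(-lambda * t) = 0.5 ** (t / half_life)
    where t is the time a human would take to complete the task.
    """
    if task_minutes < 0:
        raise ValueError("task_minutes must be non-negative")
    return math.exp(-hazard_rate(half_life_minutes) * task_minutes)


if __name__ == "__main__":
    HALF_LIFE = 60.0  # hypothetical half-life in human-minutes, for illustration only
    for minutes in (15, 30, 60, 120, 240):
        p = success_probability(minutes, HALF_LIFE)
        print(f"{minutes:>4} min task: success probability {p:.3f}")
```

Under this assumption, doubling the task length squares the success probability, so a 50% success rate at one half-life becomes 25% at two half-lives. Plotted against task length on a log scale, the same curve takes on a sigmoid shape.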

Kwa et al. plot the time horizon on a log scale and note that this reveals a sigmoid-shaped decay curve of success rate (the coloured bars). Note that I am not claiming AI agents have a precisely constant rate of failure per minute of time it would take a human to complete the task. If systematic deviations from exponential decay are found, such as the hazard rate increasing (or decreasing) with time, this might provide useful hints as to what the agents are doing wrong (or right).
