Crawling a billion web pages in just over 24 hours, in 2025
tl;dr:
- 1.005 billion web pages
- 25.5 hours
- $462

For some reason, nobody's written about what it takes to crawl a big chunk of the web in a while: the last point of reference I saw was Michael Nielsen's post from 2012. Obviously, lots of things have changed since then. Most are bigger, better, faster: CPUs have gotten a lot more cores, spinning disks have been replaced by NVMe solid state drives with near-RAM I/O bandwidth, network pipe widths have exploded, EC2 has gone from a tasting menu of instance types to a whole rolodex's worth, yada yada.
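To put the headline figures in perspective, here is a minimal back-of-the-envelope sketch that converts them into average throughput and unit cost. Only the three input numbers come from the summary above; the constant and function names are illustrative, and the result is a simple average over the whole run, not a measured crawl rate.

```python
# Headline figures from the tl;dr above.
PAGES = 1.005e9      # web pages crawled
HOURS = 25.5         # wall-clock duration of the crawl
COST_USD = 462.0     # total cost in US dollars


def pages_per_second(pages: float, hours: float) -> float:
    """Average crawl throughput over the whole run."""
    return pages / (hours * 3600)


def cost_per_million_pages(cost_usd: float, pages: float) -> float:
    """Average cost in dollars for every million pages crawled."""
    return cost_usd / (pages / 1e6)


if __name__ == "__main__":
    print(f"{pages_per_second(PAGES, HOURS):,.0f} pages/second")
    print(f"${cost_per_million_pages(COST_USD, PAGES):.2f} per million pages")
```

Under these numbers the crawl averages roughly 10,900 pages per second, at about $0.46 per million pages.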
