Faster Index I/O with NVMe SSDs
The Marginalia Search index has been partially rewritten to perform much better, using new data structures designed to make better use of modern hardware. This post will cover the new design, and will also touch upon some of the unexpected and unintuitive performance characteristics of NVMe SSDs when it comes to read sizes. The index is already fairly large, but it can sometimes feel smaller than it is, and paradoxically, query performance is a big part of why: when lookups are slow, a query can only touch a small fraction of the index before it runs out of time budget.
A large part of the problem was that the B-trees were designed around implicit pointers, which meant that every node had to be block aligned. This led to considerable dead air in the data structure, as few document lists were cooperative enough to be a neat multiple of 256 items in length.

Because the benchmark has a fairly short iteration time of about a quarter of a second, with a wind-up period, followed by an intense burst of activity, and then a wind-down and reset phase, tools like iostat are likely to under-report the actual disk usage despite showing considerable queue depth, utilization and IOPS.

Admittedly, the shotgun approach employed in the positions reads is not quite how you are supposed to use the drive, but given the constraints of the application it seems to produce the best balance of throughput and latency, in part because the feast-to-famine cycles introduce breathing room for the drive to process other commands as well.
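To make the "dead air" from block alignment concrete, here is a minimal sketch of the padding cost when every document list must be rounded up to a whole number of 256-item nodes. The function names and the example list lengths are illustrative, not Marginalia's actual code; only the 256-item node size comes from the text above.

```python
NODE_SIZE = 256  # items per block-aligned B-tree node, as in the old layout

def padded_length(n_items: int) -> int:
    """Length after rounding a document list up to whole 256-item nodes."""
    return -(-n_items // NODE_SIZE) * NODE_SIZE  # ceiling division

def slack(n_items: int) -> int:
    """Padding items wasted ('dead air') for a list of n_items entries."""
    return padded_length(n_items) - n_items

# A short list wastes almost an entire node:
print(slack(10))    # 246 padding items for a 10-item list
print(slack(256))   # 0 -- only exact multiples of 256 avoid waste
print(slack(300))   # 212
```

The worst case is a list of one item, which pays for a full 256-item node; averaged over arbitrary list lengths, roughly half a node per list goes unused.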
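The "shotgun" pattern mentioned above, i.e. firing off many small positioned reads at once and letting the drive's command queue absorb them, can be sketched as follows. This is a hypothetical illustration, not Marginalia's implementation: the function name, worker count, and read size are all assumptions.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def shotgun_read(path: str, offsets: list[int], length: int) -> list[bytes]:
    """Issue one positioned read per offset concurrently against a single
    file descriptor, keeping the NVMe queue busy instead of reading
    sequentially. Results come back in offset order."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # os.pread is safe to call concurrently on a shared fd, since it
        # takes an explicit offset and does not move the file position.
        with ThreadPoolExecutor(max_workers=32) as pool:
            return list(pool.map(lambda off: os.pread(fd, length, off), offsets))
    finally:
        os.close(fd)
```

In a real index the offsets would come from the term lookup, and the per-read length from the stored document-list sizes; the point of the pattern is simply that NVMe drives reward deep queues of small reads far more than spinning disks ever did.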