
OpenZFS deduplication is good now and you shouldn't use it


OpenZFS 2.3.0 will be released any day now, and it includes the new “Fast Dedup” feature. My team at Klara spent many months in 2023 and 2024 working on it, and we reckon it’s pretty good, a huge step up from the old dedup as well as being a solid base for further improvements.

If the refcount is still non-zero after a free, the IO is returned as “completed”. If it reaches zero, the last “copy” of the block is being freed, so the dedup table entry is deleted and the metaslab allocator is called to deallocate the space. It does make the “lookup entry” part of the IO pipeline more complex, of course, and it does introduce some subtleties when freeing a block whose refcount reaches zero, but that’s fine: clever and tricky things are OK for a good cause.

Unless you have a very specific workload where data is heavily duplicated and clients can’t or won’t give a direct “copy me!” signal, just using block cloning is likely to get you a good chunk of the gain without the outsized amount of pain.
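To make the refcount handling on free a bit more concrete, here is a toy in-memory model in Python. It is only a sketch of the idea, not OpenZFS code: the `ToyDedupTable` class, its method names, the SHA-256 key and the `"completed"` / `"deallocated"` return strings are all assumptions made up for illustration, and a plain set stands in for the metaslab allocator.

```python
import hashlib


class ToyDedupTable:
    """Hypothetical toy model: checksum -> [address, refcount]."""

    def __init__(self):
        self.entries = {}       # checksum -> [address, refcount]
        self.allocated = set()  # stand-in for space owned by the allocator
        self.next_address = 0

    def write(self, data: bytes) -> str:
        # Look up the block by checksum; a hit only bumps the refcount.
        key = hashlib.sha256(data).hexdigest()
        entry = self.entries.get(key)
        if entry is not None:
            entry[1] += 1
        else:
            # First copy: "allocate" space and create the table entry.
            address = self.next_address
            self.next_address += 1
            self.allocated.add(address)
            self.entries[key] = [address, 1]
        return key

    def free(self, key: str) -> str:
        entry = self.entries[key]
        entry[1] -= 1
        if entry[1] > 0:
            # Other references remain, so nothing is deallocated.
            return "completed"
        # Last reference gone: delete the entry and release the space.
        del self.entries[key]
        self.allocated.discard(entry[0])
        return "deallocated"


table = ToyDedupTable()
a = table.write(b"same block")
b = table.write(b"same block")  # duplicate: refcount becomes 2
first = table.free(a)           # refcount 2 -> 1
second = table.free(b)          # refcount 1 -> 0, space released
```

In this sketch the first free only decrements the count, and the second one removes the entry and hands the space back. The real pipeline has to do the same bookkeeping against an on-disk table, which is where the extra complexity comes from.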
