Extending That XOR Trick to Billions of Rows
Learn how to extend the classic XOR trick to find thousands of missing values using Invertible Bloom Filters
While we can't yet recover the full symmetric difference, we can detect when the XOR result is unreliable by checking if the hash accumulators behave consistently. IBFs use a Bloom filter-style hashing scheme to assign elements to multiple partitions, then employ a graph algorithm called "peeling" to iteratively recover the symmetric difference with very high probability [3]. Instead, it uses a hash function that calculates the cell index as a checksum and has some additional checks in the decoding process to deal with potential "anomalies" that show up in this approach.
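The partition-and-peel idea described above can be illustrated with a minimal sketch. It is not the article's implementation: it uses the textbook cell layout with a separate checksum accumulator rather than deriving the check from the cell index, and it assumes 64-bit non-negative integer keys, three partitions, and BLAKE2b as the hash. The names `IBF`, `_h`, `K`, and `CHECK_SALT` are hypothetical.

```python
import hashlib

K = 3             # number of partitions / hash functions (assumed value)
CHECK_SALT = 255  # salt reserved for the checksum hash (assumed value)


def _h(x: int, salt: int) -> int:
    """Salted 64-bit hash of a 64-bit non-negative integer key."""
    data = salt.to_bytes(1, "little") + x.to_bytes(8, "little")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


class IBF:
    def __init__(self, cells_per_partition: int):
        self.m = cells_per_partition
        n = K * cells_per_partition
        self.count = [0] * n     # signed number of elements in each cell
        self.key_sum = [0] * n   # XOR of the keys in each cell
        self.hash_sum = [0] * n  # XOR of the key checksums in each cell

    def _cells(self, x: int) -> list[int]:
        # One cell per partition, so the K indices are always distinct.
        return [p * self.m + _h(x, p) % self.m for p in range(K)]

    def _update(self, x: int, sign: int) -> None:
        chk = _h(x, CHECK_SALT)
        for i in self._cells(x):
            self.count[i] += sign
            self.key_sum[i] ^= x
            self.hash_sum[i] ^= chk

    def insert(self, x: int) -> None:  # element of set A
        self._update(x, +1)

    def remove(self, x: int) -> None:  # element of set B
        self._update(x, -1)

    def _is_pure(self, i: int) -> bool:
        # A cell holds exactly one element if its count is +/-1 and the
        # checksum of its key accumulator matches its hash accumulator.
        return (self.count[i] in (1, -1)
                and _h(self.key_sum[i], CHECK_SALT) == self.hash_sum[i])

    def decode(self):
        """Peel pure cells until none remain. Destructive: empties the filter.

        Returns (only_in_a, only_in_b, ok); ok is False if peeling got stuck.
        """
        only_a, only_b = set(), set()
        queue = [i for i in range(len(self.count)) if self._is_pure(i)]
        while queue:
            i = queue.pop()
            if not self._is_pure(i):  # may have been peeled already
                continue
            x, sign = self.key_sum[i], self.count[i]
            (only_a if sign == 1 else only_b).add(x)
            self._update(x, -sign)    # remove x from all of its cells
            queue.extend(j for j in self._cells(x) if self._is_pure(j))
        ok = not (any(self.count) or any(self.key_sum) or any(self.hash_sum))
        return only_a, only_b, ok


if __name__ == "__main__":
    ibf = IBF(cells_per_partition=512)
    for x in range(0, 1000):
        ibf.insert(x)
    for x in range(5, 1005):
        ibf.remove(x)
    print(ibf.decode())
```

Elements present in both sets cancel out of every cell, so the filter only needs to be sized for the expected symmetric difference, not for the full sets. If peeling stalls before all cells are empty, `ok` is `False` and the recovered sets should be treated as incomplete.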