Bloom Filters

Bloom filters were originally motivated by efficient set membership testing: they use a probabilistic approach to significantly reduce the time and space required to reject items that are not members of a given set. The data structure was proposed by Burton Bloom in a 1970 paper titled "Space/Time Trade-offs in Hash Coding with Allowable Errors".

Note that it's trivial to prove (by contraposition) that every "false" answer from a Bloom filter's test operation is a true negative: if an item was added, all of its bits are set and the test returns "true", so a "false" answer means the item was never added. As for size, our Bloom filter requires about 1.2 GB of space to cache the membership test for a billion items (each of which could be of arbitrary size). Assuming independence between our hash functions (this is not super rigorous, but a reasonable assumption in practice), let's calculate the false positive rate.
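As a rough sketch of that calculation: with m bits, k hash functions and n inserted items, the independence assumption gives the standard approximation (1 - e^(-kn/m))^k for the false positive rate. The Python below is a minimal illustration, not a definitive implementation. The class and function names are hypothetical, the k bit positions are derived from a single SHA-256 digest by double hashing (a technique swapped in here for brevity), and k = 7 and 1.2 GB = 9.6 billion bits are assumptions made for the example.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter with m bits and k bit positions per item."""

    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray((m_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Assumption: double hashing (h1 + i * h2) mod m stands in for
        # k independent hash functions.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force h2 to be odd
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def test(self, item):
        # False means "definitely not added"; True means "possibly added".
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


def false_positive_rate(m_bits, n_items, k_hashes):
    """Approximate false positive rate, assuming independent hashes."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


# Assumed figures: 1.2 GB taken as 9.6e9 bits, one billion items, k = 7.
estimated_rate = false_positive_rate(9.6e9, 1e9, 7)
print(f"estimated false positive rate: {estimated_rate:.4f}")
```

Under these assumed figures the estimate comes out at roughly one percent. The toy class also shows the no-false-negatives property directly: every item passed to add makes test return True afterwards, because add sets exactly the bits that test checks.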
