Overflow in consistent hashing (2018)


Ryan Marcus, assistant professor at the University of Pennsylvania (Fall '23). Using machine learning to build the next generation of data systems.

Consistent hashing was first proposed in 1997 by David Karger et al. and is used today in many large-scale data management systems, including Apache Cassandra. When the bin capacity is large and the load factor is not too close to one, the tail of the distribution collapses rapidly, so this sum can be approximated by taking its first few terms (and applying some algebra). Compare this to the blue curve, representing 200 bins: with the same node capacity (20) and the same load factor of 0.8 (implying 200 × 20 × 0.8 = 3200 items), the overflow probability is practically one.
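
The 200-bin figure can be sanity-checked with a rough sketch. The model below is an assumption, not taken from the article: every item lands in a uniformly random bin, so one bin's load is binomial, and bins are treated as independent when combining them (real bin loads are slightly negatively correlated, so this is only an approximation). The function names `single_bin_overflow` and `any_bin_overflow` are hypothetical.

```python
from math import exp, lgamma, log, log1p


def single_bin_overflow(num_bins: int, capacity: int, num_items: int) -> float:
    """P(one given bin receives more than `capacity` items).

    Assumes each item picks a bin uniformly at random, so the bin's load
    is Binomial(num_items, 1 / num_bins). Requires num_bins >= 2.
    """
    p = 1.0 / num_bins
    within = 0.0
    for k in range(min(capacity, num_items) + 1):
        # log of the binomial pmf, computed with lgamma to avoid huge integers
        log_pmf = (
            lgamma(num_items + 1)
            - lgamma(k + 1)
            - lgamma(num_items - k + 1)
            + k * log(p)
            + (num_items - k) * log1p(-p)
        )
        within += exp(log_pmf)
    return max(0.0, 1.0 - within)


def any_bin_overflow(num_bins: int, capacity: int, num_items: int) -> float:
    """Approximate P(at least one bin overflows), treating bins as independent."""
    q = single_bin_overflow(num_bins, capacity, num_items)
    return 1.0 - (1.0 - q) ** num_bins


if __name__ == "__main__":
    # Numbers from the text: 200 bins, capacity 20, load factor 0.8.
    bins, cap, load = 200, 20, 0.8
    items = round(bins * cap * load)  # 3200
    print(items, any_bin_overflow(bins, cap, items))
```

Under these assumptions a single bin overflows only a modest fraction of the time, but with 200 bins the chance that at least one of them overflows comes out very close to one, in line with the summary above.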
