Rendezvous or Highest Random Weight (HRW) hashing algorithm

I committed an implementation of HRW on GitHub. It has an advantage over consistent hashing in keeping an even distribution of data across all the nodes after modifications to the pool. Consistent hashing mitigates this weakness by placing each node multiple times onto the ring, creating many virtual nodes. HRW tends to be slower since it has to run a hash function against every node instead of just once as in the consistent hashing (CH) algorithm. HRW is still very fast (i.e. 100k+ lookups/sec if parallelized, depending on the hash function) with pool sizes smaller than 1000 nodes. It is possible to achieve O(log N) lookups with HRW; this implementation is much simpler and is O(N), but with N generally so small that it's irrelevant.
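The core of the algorithm is small enough to sketch. The following is a hypothetical Python version (not the committed implementation); MD5 and the `node:key` combining scheme are just stand-ins, as any stable hash works:

```python
import hashlib

def hrw_node(key, nodes):
    """Return the node with the highest hash(node, key) score for this key."""
    def score(node):
        # Combine the node name with the key and hash the pair; the score
        # for a given (node, key) never changes as the pool changes.
        digest = hashlib.md5(f"{node}:{key}".encode()).digest()
        return int.from_bytes(digest, "big")
    # O(N): every node is scored; the highest score owns the key.
    return max(nodes, key=score)

nodes = ["node1", "node2", "node3", "node4"]
print(hrw_node("user-42", nodes))
```

Because each (node, key) score is independent of the rest of the pool, removing a node only remaps the keys that node owned; every other key keeps its previous winner.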

For a lot of things it still makes sense to use consistent hashing instead (provided you use virtual nodes!). To show why that matters, here is an example using just a single token per node on the ring below.

Only node4 takes the load of the two nodes that were removed. Using HRW, however, the distribution remains even.

If, say, 200 vnodes are added per node with CH, the distribution looks the same as the HRW distribution above.
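To make the single-token vs vnode comparison concrete, here is a hypothetical ring sketch in Python (again MD5 is only a stand-in hash, and the node names are made up). The same lookup code runs with 1 token per node and then with 200:

```python
import hashlib
from bisect import bisect
from collections import Counter

def build_ring(nodes, vnodes):
    # Each node contributes `vnodes` tokens (points) on the ring,
    # one per "node-i" label.
    return sorted(
        (int(hashlib.md5(f"{node}-{i}".encode()).hexdigest(), 16), node)
        for node in nodes
        for i in range(vnodes)
    )

def ring_lookup(points, key):
    # A key belongs to the first token clockwise from its hash,
    # wrapping around to the start of the ring.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    idx = bisect([p for p, _ in points], h) % len(points)
    return points[idx][1]

nodes = ["node1", "node2", "node3", "node4"]
for vnodes in (1, 200):
    points = build_ring(nodes, vnodes)
    counts = Counter(ring_lookup(points, f"key{i}") for i in range(10_000))
    print(vnodes, dict(counts))
```

With a single token per node the arc sizes (and so the key counts) are whatever the hash happens to give each node; with 200 tokens each node's share is the sum of many small arcs, which averages out the imbalance.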
