educative.io

How does Consistent Hashing help overloaded partitions?

  1. When a server is added or removed, data needs to be redistributed. Why? Is it because our distribution would not be uniform? Is that why we want to do consistent hashing?

  2. Let’s say we are remapping 1/N (N is the total number of partitions) of keys when we introduce a new server or remove a server. What does this mean exactly? Are we assuming that if for example, partition 10 fails, we would not be able to access those records in partition 10?

  3. If we have a virtual replica of partition 10, will we be able to access the records? Or when partition 10 fails, all virtual replicas fail as well? And then in that case, we just redistribute the data to another partition?

We can remove machines to reduce capacity, or some machines might crash on their own. We can decide to add servers to increase capacity. And yes, it can also be because our distribution is not uniform. If that is the case, you can add a new server© or a virtual server in between two nodes(A and B) in the consistent hashing ring. This way, some of the keys that were hashing to B will now map to C. So, we want to do consistent hashing for multiple reasons - scaling flexibility, robustness and reducing hotspots.

Let’s say we are remapping 1/N (N is the total number of partitions) of keys when we introduce a new server or remove a server. What does this mean exactly? Are we assuming that if for example, partition 10 fails, we would not be able to access those records in partition 10?

If we have a single replica for partition 10s keys, then yes, we lose access to the records in partition 10. We can have virtual servers mapped to the same keys for replication.

  • If we have a virtual replica of partition 10, will we be able to access the records? Or when partition 10 fails, all virtual replicas fail as well? And then in that case, we just redistribute the data to another partition?

This is a tough question. It really depends on how you have designed your system(CAP theorem) and what are you read and write guarantees (write acknowledgements (majority, primary, all etc), read guarantee(always read latest, stale value is ok) etc .