educative.io

Educative

Virtual nodes in Consistent hashing and Replication?

In consistent hashing, we use virtual nodes to reduce hotspots.
In replication, you also mentioned “To avoid putting replicas on the same physical nodes, the preference list can skip those virtual nodes whose physical node is already in the list”. What exactly is the virtual node in replication?
Are those 2 different “virtual nodes”? I assume they are different, because one is for partationing, one is for replication. Is that correct?


Course: Grokking Modern System Design Interview for Engineers & Managers - Learn Interactively
Lesson: Ensure Scalability and Replication - Grokking Modern System Design Interview for Engineers & Managers

1 Like

Hi @Zihao_Wang, Yes, you are right in the sense that we need to carefully select the virtual nodes for the purposes of distribution of keys well, and for replication. We have multiple virtual nodes for the same actual physical node. Consider a node A, and its virtual nodes are b,c,d. Now while selecting a replica for virtual node c, we will ensure that we do not select b, or d as they are on the same physical node A. And if A goes down, all of its virtual nodes go down and we don’t get the benefit of replication.

So how is a preference list created then? What I mean is suppose a write request is to handled by Node A then how do we determine its preference list?

Without preference list if n = 3 then next two nodes in the clock wise direction will be considered for replication even if they are virtual nodes.

So each node has its own preference list? Like is it predefined if not how is it achieved? And how does the addition/removal of nodes affects it?

Another questions, let’s say our preference list is [A, D, C, B] and node A was supposed to handle the request but it goes down now from the preference list D will handle it but as discussed using hinted handoff it will notify node A when it’s back and delete itself from node D so the data should be copied to n - 1 nodes from A and not from D, right? Whilst it should take account of the fact that data was sent to node D and not A so the replication should not happen at virtual nodes of A and D. Or is it something different?


Course: Grokking Modern System Design Interview for Engineers & Managers - Learn Interactively
Lesson: Ensure Scalability and Replication