Can we sync the updates when the failed server comes back up?

Yihong · December 29, 2020, 2:07pm

In the “CAP Theorem”:

If we pick Consistency, in that scenario, we have to lock down all the nodes for further writes until the nodes that have gone down come back online. This would ensure the strong consistency of the system, as all the nodes will have the same entity values.

I am thinking about an alternative: do not lock all the nodes when one node goes down. Instead, when the failed node is recovering, sync to it the updates that the other nodes have accepted since its failure, and lock all the nodes only during the recovery. Is this possible?
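To make the idea concrete, here is a minimal sketch of what I have in mind. It is not taken from any real system: `Node`, `Cluster`, and `recover` are hypothetical names, every write is assumed to be applied synchronously to all live nodes, and a node that is down is assumed to serve no reads at all.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical replica: an ordered log of writes plus the state they produce."""
    name: str
    log: list = field(default_factory=list)
    data: dict = field(default_factory=dict)
    up: bool = True

    def apply(self, key, value):
        self.log.append((key, value))
        self.data[key] = value


class Cluster:
    def __init__(self, nodes):
        self.nodes = nodes
        self.locked = False  # True only while a recovering node catches up

    def write(self, key, value):
        if self.locked:
            raise RuntimeError("cluster is locked for recovery")
        # Writes go to every live node; a down node simply misses them.
        for node in self.nodes:
            if node.up:
                node.apply(key, value)

    def recover(self, node):
        # Assumption: at least one live peer holds every write the node missed.
        # max() raises ValueError if no peer is up at all.
        donor = max((n for n in self.nodes if n.up), key=lambda n: len(n.log))
        self.locked = True  # block writes only for the catch-up window
        try:
            # Replay the suffix of the donor's log that the node never saw.
            for key, value in donor.log[len(node.log):]:
                node.apply(key, value)
            node.up = True
        finally:
            self.locked = False


a, b, c = Node("a"), Node("b"), Node("c")
cluster = Cluster([a, b, c])
cluster.write("x", 1)
c.up = False              # c fails; a and b keep accepting writes
cluster.write("x", 2)
cluster.write("y", 3)
cluster.recover(c)        # c replays the two writes it missed
```

In this sketch the lock is held only while the missed log suffix is replayed, so the cost of the lock grows with how far behind the recovering node is, not with how long it was down.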

Thanks

Shivang · February 13, 2022, 3:38pm

@Yihong Whether this works largely depends on how the distributed system is designed. Also, imagine that while the failed node is recovering and does not yet have the latest data, a few of the nodes that do have the latest data go down. What are we going to do in that scenario? Designing and managing distributed systems is tricky. Systems like these take years of testing and tweaking to perfect.
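A small sketch of that failure scenario, under simplified assumptions: each node's log is modelled as a plain list of write IDs, and `catch_up` is a hypothetical helper, not part of any real system.

```python
def catch_up(recovering_log, live_peer_logs):
    """Copy the missing suffix from the longest live peer log (hypothetical helper)."""
    if not live_peer_logs:
        # Nobody is left to sync from, so the node keeps only what it already had.
        return list(recovering_log)
    donor = max(live_peer_logs, key=len)
    return recovering_log + donor[len(recovering_log):]


# Writes the cluster has already acknowledged to clients.
acknowledged = ["w1", "w2", "w3"]

log_a = ["w1", "w2", "w3"]  # a has the latest data
log_b = ["w1", "w2", "w3"]  # b has the latest data
log_c = ["w1"]              # c failed after w1 and is now recovering

# Normal case: a is still up, so c can catch up fully.
healthy = catch_up(log_c, live_peer_logs=[log_a])

# Problem case: a and b both go down before c has synced.
stale = catch_up(log_c, live_peer_logs=[])
missing = [w for w in acknowledged if w not in stale]  # writes c cannot see
```

In the problem case, `missing` holds the acknowledged writes that the only surviving node cannot serve. The system then has to choose between refusing requests until a or b returns and answering with stale data, which is the consistency versus availability trade-off again.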