Designing Instagram - Grokking the System Design Interview

divine · February 21, 2021, 1:13pm

Please read the text in the image above to understand what this question is based on.

"Since each database server can have multiple database instances running on it, we can have separate databases for each logical partition on any server. So whenever we feel that a particular database server has a lot of data, we can migrate some logical partitions from it to another server"

I’m not able to understand the difference between a database server and a database instance in the text quoted above. My understanding of database scaling is that, in order to handle a large number of network calls and a large amount of data storage, we can scale by either

running multiple database servers (e.g. MySQL servers) on a single machine (vertical scaling), or
running multiple database servers (e.g. MySQL servers) on separate machines (horizontal scaling).

The quoted text states: "Since each database server can have multiple database instances running on it".
This is very confusing to me. What is meant by “each database server can have multiple database instances”?

Please elaborate.
Thank you

Nikhil_Kumar · August 28, 2021, 6:40pm

It’s a little late, but I’ll answer as someone who also didn’t understand this at first.

By database server, they mean the actual physical machine, let’s say a Linux server.
Within that server, they will spin up multiple database instances, for example multiple MySQL processes listening on different ports.
So, if the database server (also called a database node, or just a node) has the IP address “abcd” and the instances are listening on, let’s say, ports 1 to 10, then each MySQL instance can be queried at “abcd:1” … “abcd:10”.

In the future, if some shards, say shard 8 and shard 10, start getting a lot of load and that slows down the whole Linux server, they can move shards 8 and 10 to a different server with IP “xyz”, and those shards can then be queried at the new address and ports.
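
To make the addressing concrete, here is a minimal Python sketch of a shard directory. The names `shard_directory`, `address_of`, and `move_shard` are made up for illustration, and “abcd” and “xyz” are just the placeholder addresses from above. It only models the lookup; actually copying a shard’s data to the new server is left out.

```python
# Hypothetical shard directory: logical shard number -> (host, port).
# "abcd" and "xyz" are placeholder addresses, not real hosts.
shard_directory = {shard: ("abcd", shard) for shard in range(1, 11)}


def address_of(shard: int) -> str:
    """Return the "host:port" string where a shard's database instance listens."""
    host, port = shard_directory[shard]
    return f"{host}:{port}"


def move_shard(shard: int, new_host: str, new_port: int) -> None:
    """Point a shard at a new server. Copying the data itself is out of scope here."""
    shard_directory[shard] = (new_host, new_port)


# Shards 8 and 10 are moved to the second server; all others stay put.
move_shard(8, "xyz", 8)
move_shard(10, "xyz", 10)
```

Clients that look up a shard through `address_of` keep working after the move, because only the directory entry changes.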

This strategy is good because you never have to re-split the data inside a shard; you only relocate whole shards. This matters because if you don’t create enough shards up front and later realize that your shards are holding 100PB of data, the system needs to be taken down to migrate the data. It’s better to create a lot of shards from the start, let’s say 10k shards (expecting, let’s say, 10 billion users in the future), put them on 100 nodes, and then keep moving shards to new nodes as the load increases.
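
A rough Python sketch of that idea, using the 10k shards and 100 nodes from the example above. Picking the shard with `user_id % NUM_LOGICAL_SHARDS` and the initial round-robin placement are my own assumptions, not something the course text specifies, and all the names are hypothetical.

```python
NUM_LOGICAL_SHARDS = 10_000  # fixed up front and never changed afterwards
NUM_NODES = 100              # physical servers at the start

# Assumed initial placement: shard s lives on node s % NUM_NODES,
# which puts 100 logical shards on each of the 100 nodes.
shard_to_node = {s: s % NUM_NODES for s in range(NUM_LOGICAL_SHARDS)}


def shard_for_user(user_id: int) -> int:
    """Logical shard of a user. Depends only on the fixed shard count."""
    return user_id % NUM_LOGICAL_SHARDS


def node_for_user(user_id: int) -> int:
    """Physical node that currently holds the user's logical shard."""
    return shard_to_node[shard_for_user(user_id)]


def move_shards(crowded_node: int, new_node: int, count: int) -> list[int]:
    """Reassign up to `count` whole shards from a crowded node to a new node."""
    moved = [s for s, n in shard_to_node.items() if n == crowded_node][:count]
    for s in moved:
        shard_to_node[s] = new_node
    return moved
```

Calling something like `move_shards(56, 100, 50)` changes which node serves some users, but `shard_for_user` gives the same answer as before, so no rows have to be re-hashed into different shards.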