If we store different messages of a user on separate database shards, fetching a range of messages of a chat would be very slow

Dewey_Munoz · October 20, 2021, 3:41am

Partitioning based on MessageID: If we store different messages of a user on separate database shards, fetching a range of messages of a chat would be very slow, so we should not adopt this scheme.

Why is scatter-gather too slow here, but OK in the Twitter search solution?

Asma_Yasin · October 22, 2021, 12:33pm

In Twitter search, the index can be sharded either by word or by tweet object. If it is sharded by word, every query for a popular word goes to the server holding that word, which will affect the performance of our service. If it is sharded by TweetID, we pass the TweetID to a hash function to find the server and index all the words of the tweet on that server; querying for a particular word then means querying all the servers. So sharding based on TweetID is better in Twitter.
In Messenger, partitioning based on MessageID is a bad approach because if we store different messages of a user on separate database shards, fetching a range of messages of a chat would be very slow. Partitioning based on UserID is a better approach because we can keep all messages of a user on the same database.
So it comes down to how Messenger reads data: it fetches a range of one user's messages, which is why UserID is the more useful partition key for performance.
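
To make the difference concrete, here is a minimal sketch in Python, not the actual design of any real system. It assumes a hypothetical cluster of 8 shards and a simple hash-mod placement; `shard_for`, `shards_touched`, and the field names `user_id` and `message_id` are made up for illustration. It counts how many shards have to be contacted to read 100 messages of one user under each partitioning scheme.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical cluster size, chosen only for illustration


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a partition key to a shard index with a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def shards_touched(messages, key_field: str) -> set:
    """Return the set of shards that hold the given messages
    when they are partitioned by key_field."""
    return {shard_for(str(m[key_field])) for m in messages}


# 100 messages that all belong to the same (hypothetical) user.
messages = [{"message_id": i, "user_id": "user-42"} for i in range(100)]

by_user = shards_touched(messages, "user_id")        # partition on UserID
by_message = shards_touched(messages, "message_id")  # partition on MessageID

print("shards to query with UserID partitioning:", len(by_user))
print("shards to query with MessageID partitioning:", len(by_message))
```

With UserID as the key, every message of the user hashes to the same shard, so a range read is a single-shard query. With MessageID as the key, the same messages are spread over several shards, so the read becomes a scatter-gather across them.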

AmSh · October 31, 2021, 6:19am

When you say UserID, is it based on the sender ID or the receiver ID?

Asma_Yasin · November 4, 2021, 6:04am

It is the receiver ID.