The article says:
Sharding by UserID:
“What if a user becomes hot? There could be a lot of queries on the server holding the user. This high load will affect the performance of our service.”
Sharding by TweetID:
“This approach solves the problem of hot users, Sharding based on TweetID”
I do not understand why sharding by TweetID solves the problem of hot users. If I understand correctly, with sharding by UserID, when a user becomes hot, queries from that user’s followers all hit the one DB server holding that user, but that load is still only part of the whole traffic. With sharding by TweetID, every query has to go to all DB servers, so every single DB server has to serve the whole traffic. That is “part of the whole traffic” vs. “the whole traffic”. So sharding by TweetID does not reduce the amount of traffic on a single server; it makes things worse. Please correct me if I am wrong.
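To make my reasoning concrete, here is a rough sketch of how I am counting requests per DB server. Everything in it is my own assumption rather than something from the article: the function names, the modulo placement of users on shards, the 10 servers, and the made-up mix of 900 queries for one hot user plus 100 queries for other users. It only counts how many requests reach each server; it does not model how expensive each request is, which may be exactly where my reasoning goes wrong.

```python
def load_by_user_id(queried_user_ids, num_shards):
    """Requests per shard when all of a user's tweets live on one shard."""
    load = [0] * num_shards
    for user_id in queried_user_ids:
        # Assumed placement: the user's shard is user_id modulo the shard count.
        load[user_id % num_shards] += 1
    return load


def load_by_tweet_id(queried_user_ids, num_shards):
    """Requests per shard when a user's tweets are spread over all shards."""
    load = [0] * num_shards
    for _ in queried_user_ids:
        # Assumed fan-out: every query must ask every shard for matching tweets.
        for shard in range(num_shards):
            load[shard] += 1
    return load


NUM_SHARDS = 10
# Hypothetical traffic: hot user 7 is queried 900 times, users 0..99 once each.
queries = [7] * 900 + list(range(100))

by_user = load_by_user_id(queries, NUM_SHARDS)
by_tweet = load_by_tweet_id(queries, NUM_SHARDS)

print("UserID sharding, busiest server:", max(by_user))
print("TweetID sharding, busiest server:", max(by_tweet))
```

With these made-up numbers, the busiest server under UserID sharding receives 910 of the 1000 requests, while under TweetID sharding every server receives all 1000. That is the “part of the whole traffic” vs. “the whole traffic” comparison I mean.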