"index would be like a big distributed hash table, where ‘key’ would be the word and ‘value’ will be a list of TweetIDs of all those tweets which contain that word. "
When sharding based on words, I don’t understand this sentence: "While building our index, we will iterate through all the words of a tweet and calculate the hash of each word to find the server where it would be indexed. "
@Design_Gurus Do you mean that for each word in a tweet, we take the word as the key and the tweet ID as the value, so that every tweet produces many keys that all map to the same tweet ID, and after going through all tweets we merge the values for each word into one list?
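To make the question concrete, here is a minimal Python sketch of how I read the quoted sentence. It is only my interpretation, not the course's implementation: the names `NUM_SHARDS`, `shard_for_word`, `build_index`, and `search`, the choice of MD5 as the hash, and the whitespace tokenizer are all my own assumptions.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List

NUM_SHARDS = 4  # hypothetical number of index servers


def shard_for_word(word: str) -> int:
    """Pick the shard that owns a word, using a stable hash (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def build_index(tweets: Dict[int, str]) -> List[Dict[str, List[int]]]:
    """Build one word -> [TweetID, ...] map per shard."""
    shards = [defaultdict(list) for _ in range(NUM_SHARDS)]
    for tweet_id, text in tweets.items():
        # Every distinct word of the tweet is a key; the tweet ID is appended to that key's list.
        for word in set(text.lower().split()):
            shards[shard_for_word(word)][word].append(tweet_id)
    return shards


def search(shards: List[Dict[str, List[int]]], word: str) -> List[int]:
    """A single-word query only has to visit the one shard that owns the word."""
    word = word.lower()
    return sorted(shards[shard_for_word(word)].get(word, []))


tweets = {1: "hello world", 2: "hello again", 3: "goodbye world"}
shards = build_index(tweets)
print(search(shards, "hello"))  # prints [1, 2]
```

If this reading is right, there is no separate "combine" step at the end: each word's list of TweetIDs simply grows on its own server as tweets are indexed.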
Could you give more details about sharding based on words and sharding based on the tweet object? Other learners found this confusing too.
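For comparison, this is my guess at what sharding based on the tweet object would look like, so you can correct me if it is wrong. I am assuming that the TweetID (rather than the word) chooses the server, that each server keeps a local index of the words in its own tweets, and that a query has to ask every server and merge the results. The names `shard_for_tweet`, `build_index_by_tweet`, and `search_all_shards`, and the modulo placement, are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

NUM_SHARDS = 4  # hypothetical number of index servers


def shard_for_tweet(tweet_id: int) -> int:
    """Pick the shard from the tweet ID (simple modulo, assumed for illustration)."""
    return tweet_id % NUM_SHARDS


def build_index_by_tweet(tweets: Dict[int, str]) -> List[Dict[str, List[int]]]:
    """Index all words of a tweet on the single shard that owns that tweet."""
    shards = [defaultdict(list) for _ in range(NUM_SHARDS)]
    for tweet_id, text in tweets.items():
        local_index = shards[shard_for_tweet(tweet_id)]
        for word in set(text.lower().split()):
            local_index[word].append(tweet_id)
    return shards


def search_all_shards(shards: List[Dict[str, List[int]]], word: str) -> List[int]:
    """Ask every shard for the word and merge the partial TweetID lists."""
    word = word.lower()
    results: List[int] = []
    for local_index in shards:
        results.extend(local_index.get(word, []))
    return sorted(results)


tweets = {1: "hello world", 2: "hello again", 3: "goodbye world"}
shards_by_tweet = build_index_by_tweet(tweets)
print(search_all_shards(shards_by_tweet, "hello"))  # prints [1, 2]
```

In this version the same word can appear on several servers, which is why the query fans out to all of them.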