educative.io

A few more details about Twitter data sharding, please

@Design_Gurus

In the Twitter design solution, it says that we could combine TweetID and tweet creation time as the sharding key.

  1. So we need an external server to generate the sharding key for every tweet, right? That server could then become a single point of failure, and it would also have to handle concurrent requests. Maybe we could talk about that a little more.

  2. We would have a hotspot situation if we shard based on UserID. To recover from that, we can use consistent hashing, and we would add “virtual replicas” when we need to add a shard. But it seems that we are not going to use consistent hashing when we shard by the combination of TweetID and tweet creation time. So how can we scale out? What if we want to add a shard?

    I found this blog, and I think it is a good approach. What do you think?
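
To make question 2 concrete, here is a minimal sketch of consistent hashing with virtual replicas, applied to a TweetID that packs the creation time and a sequence number. Everything in it is an assumption for illustration, not the design solution's actual method: the names `make_tweet_id`, `HashRing`, and `moved_fraction`, the 17-bit sequence width, the use of MD5 for ring positions, and the 100 virtual replicas per shard are all hypothetical.

```python
import bisect
import hashlib


def make_tweet_id(epoch_seconds: int, sequence: int, seq_bits: int = 17) -> int:
    """Pack creation time (high bits) and a per-second sequence (low bits).

    seq_bits is an assumed width, chosen only for this sketch.
    """
    if not 0 <= sequence < (1 << seq_bits):
        raise ValueError("sequence does not fit in seq_bits")
    return (epoch_seconds << seq_bits) | sequence


def _ring_position(value: str) -> int:
    """Map a string to a 64-bit position on the ring (MD5 used for spread only)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring where each shard owns several virtual replicas."""

    def __init__(self, vnodes: int = 100):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> shard name

    def add_shard(self, shard: str) -> None:
        for i in range(self.vnodes):
            point = _ring_position(f"{shard}#{i}")
            if point in self._owner:
                continue  # skip the (very unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = shard

    def shard_for(self, tweet_id: int) -> str:
        if not self._points:
            raise LookupError("ring has no shards")
        # First virtual replica clockwise from the key, wrapping around.
        idx = bisect.bisect_right(self._points, _ring_position(str(tweet_id)))
        return self._owner[self._points[idx % len(self._points)]]


def moved_fraction(ring: HashRing, new_shard: str, tweet_ids: list) -> float:
    """Add new_shard and return the fraction of tweet_ids that changed shard."""
    before = [ring.shard_for(t) for t in tweet_ids]
    ring.add_shard(new_shard)
    after = [ring.shard_for(t) for t in tweet_ids]
    return sum(b != a for b, a in zip(before, after)) / len(tweet_ids)
```

With a ring like this, adding a shard only reassigns the keys that fall just before the new shard's virtual replicas, and those keys all move to the new shard, so the rest of the data stays where it is. The trade-off is that hashing the TweetID scatters tweets from the same time window across shards, so a query for recent tweets would have to ask every shard.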