
Sharding by timestamp

Hi,

Regarding 7. Data Sharding - could you please clarify why the combination of epoch seconds and an auto-incrementing value is the recommended solution?

If I understand correctly, this would result in writing to the same shard over and over again, which does not seem like a scalable approach: all but one shard would sit idle at any given time, and writes would be limited by the I/O of a single host. I’ve been looking around the internet to find out more about using a timestamp as a key, and it seems to be discouraged:
https://cloud.google.com/spanner/docs/whitepapers/optimizing-schema-design#anti-pattern_timestamp_ordering

Hi Jakub,

Thanks for reaching out to us. We’ll get back to you with the answer shortly.

Regards,

Educative Team

No, it won’t be like that. Load distribution will not be based on the tweetId (and thus the timestamp) directly; it will be based on hash(tweetId).
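
A minimal sketch of that routing, assuming MD5 and a fixed shard count (both are illustrative choices, not from the course):

import hashlib

NUM_SHARDS = 16  # assumed shard count, for illustration only

def shard_for(tweet_id: int) -> int:
    # Hash the whole tweetId so that consecutive IDs (same epoch second)
    # spread across shards instead of piling up on one.
    digest = hashlib.md5(str(tweet_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Tweets created in the same second are scattered rather than co-located:
for tweet_id in (13872630000001, 13872630000002, 13872630000003):
    print(tweet_id, "->", shard_for(tweet_id))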

Educative Team,

When are you actually going to respond? Looks like you are a bot.


Hi @Jakub,

The sharding will be based on hash(tweetId) and not on the creation time, so we should expect a uniform load distribution.

Sharding by tweet creation time alone has all the problems you have mentioned. To handle these problems, we proposed the “epoch seconds” approach. Although in this approach we still have to query all the servers, it is efficient for the reasons mentioned at the end (a rough sketch of the ID scheme follows the list):

In the above approach, we still have to query all the servers for timeline generation, but our reads (and writes) will be substantially quicker.

  1. Since we don’t have any secondary index (on creation time), this will reduce our write latency.
  2. While reading, we don’t need to filter on creation time, as our primary key has the epoch time included in it.
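
For example, the ID scheme could look roughly like this; the sequence width here is an assumption, not a number from the course:

import time

SEQ_BITS = 17  # assumed width: room for ~131k tweets per second

class TweetIdGenerator:
    # High bits are epoch seconds, low bits a rotating sequence number,
    # as in the epoch-seconds approach.
    def __init__(self):
        self.sequence = 0

    def next_id(self) -> int:
        epoch_seconds = int(time.time())
        self.sequence = (self.sequence + 1) % (1 << SEQ_BITS)
        return (epoch_seconds << SEQ_BITS) | self.sequence

def creation_time(tweet_id: int) -> int:
    # The creation time is recoverable from the ID itself, so no separate
    # creation-time column (or secondary index on it) is needed.
    return tweet_id >> SEQ_BITS

Because the epoch seconds occupy the high bits, sorting by tweetId is the same as sorting by creation time.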

Hope this clarifies your question.


“Since we don’t have any secondary index (on creation time), this will reduce our write latency.”

How will the lack of a secondary index reduce the write latency?

Because you don’t have to update a secondary index on every write.


I agree with that, and just to add to this point: hash(1 sec + 1st seq num) != hash(1 sec + 2nd seq num), which keeps the distribution balanced. Not to mention we can always use consistent hashing to make things more efficient (a bare-bones ring is sketched below).
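
For reference, a bare-bones consistent-hash ring might look like this; the virtual-node count and the use of MD5 are assumed choices:

import bisect
import hashlib

class ConsistentHashRing:
    # Maps keys to shards on a hash ring; adding or removing a shard
    # only remaps the keys in the affected arc, not the whole keyspace.
    def __init__(self, shards, vnodes=100):
        self.ring = []  # sorted list of (point, shard)
        for shard in shards:
            for v in range(vnodes):
                self.ring.append((self._hash(f"{shard}:{v}"), shard))
        self.ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def shard_for(self, tweet_id: int) -> str:
        # Walk clockwise to the first ring point at or past the key's hash.
        point = self._hash(str(tweet_id))
        idx = bisect.bisect(self.ring, (point,)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
print(ring.shard_for(13872630000001))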

Can anyone show what the SQL query would look like?
How do you query by an ID that is actually composed of the epoch time and a sequence number?


@Design_Gurus We will still have to merge the lists of tweet IDs. Is the assumption that the merge in O(N) time will be faster than sorting on creation_time in O(N log N)?
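
For concreteness, the merge in question might look like this sketch; since each shard already returns its list sorted by tweetId, heapq.merge performs an O(N log k) merge, where k is the number of shards:

import heapq
from itertools import islice

def merge_timelines(per_shard_results, limit=100):
    # Each input list is already sorted by tweetId descending (newest
    # first), so merging avoids an O(N log N) global re-sort.
    return list(islice(heapq.merge(*per_shard_results, reverse=True), limit))

# Example: three shards, each returning newest-first tweet IDs.
print(merge_timelines([[9, 5, 1], [8, 4], [7, 6, 2]], limit=5))  # [9, 8, 7, 6, 5]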

@Design_Gurus
Why can’t we use EpochTime+UserID? Benefit: each user’s tweets land on the same shard.
We could then use consistent hashing to balance the load. Wouldn’t that be a much better solution?

@Design_Gurus
Would you like to answer this?

Why can’t we use EpochTime+UserID? Benefit: each user’s tweets land on the same shard.

Would you mind sharing more detail about your approach?
Let’s say you use a consistent hash over EpochTime+UserID: the hash result will send us to totally different shards, even if the UserID is the same.

e.g., assume UserID 65535 posts a tweet at epoch 1387263000.

md5(1387263000 << 16 | 65535)

will be far from

md5(1387263001 << 16 | 65535)  # with only 1 tick difference

Or maybe you just meant to make tweetId = <epoch + user_id> and shard by hash(user_id), right?
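
This is easy to check; a one-tick change in the epoch produces a completely unrelated digest (hashing the combined value as a decimal string is just a convention chosen for this demo):

import hashlib

def key_digest(epoch: int, user_id: int) -> str:
    combined = (epoch << 16) | user_id
    return hashlib.md5(str(combined).encode()).hexdigest()

print(key_digest(1387263000, 65535))  # same user, one tick apart:
print(key_digest(1387263001, 65535))  # the digests share no structure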

What I understood from this: having the timestamp in the ID doesn’t give any better performance than sharding based on a random tweet ID.
The only benefit you get is that, since you want the latest tweets from a user, there is no need to create a new column; you can use the same tweetId primary key for sorting. So hit all DBs for the user, get the top 100 results ordered by tweet ID, and aggregate the results at the end.

When you say “hit all DB for the user”, could you clarify what tables you are querying?

My understanding is that for SQL, you would query
SELECT * FROM Tweets WHERE UserID = 'userId' ORDER BY TweetId DESC LIMIT 100
and this would save time because we are just ordering by the primary key?

And for NoSQL, it would be querying the UserTweets table (user => [tweets]) and then ordering by TweetId in descending order?
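
Putting the pieces together, the scatter-gather read might look like this sketch; the schema and the in-memory sqlite3 shards are stand-ins for illustration only:

import heapq
import sqlite3
from itertools import islice

# Stand-in shards: each in-memory DB plays the role of one shard server.
shards = [sqlite3.connect(":memory:") for _ in range(3)]
for db in shards:
    db.execute("CREATE TABLE Tweets (TweetId INTEGER PRIMARY KEY, UserId INTEGER, Body TEXT)")

def latest_tweets(user_id: int, limit: int = 100):
    # Ask every shard for its newest tweets of this user, then k-way
    # merge the already-sorted partial results.
    per_shard = [
        db.execute(
            "SELECT TweetId, Body FROM Tweets"
            " WHERE UserId = ? ORDER BY TweetId DESC LIMIT ?",
            (user_id, limit),
        ).fetchall()
        for db in shards
    ]
    return list(islice(heapq.merge(*per_shard, reverse=True), limit))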

I think maybe that is the point:

  1. For write performance, we should not create a secondary index on creation time.

  2. Since we do not create a (secondary) index on creation time, filtering on that non-indexed field would be extremely slow; embedding the epoch time in the primary key avoids that filter entirely.

It would be great if @Design_Gurus could clarify if this is the motivation for suggesting this solution.