educative.io

Why is the Daily Active Users number used for server-count estimation?

We need to handle concurrent requests coming from 500 million daily active users. Let’s assume that a typical YouTube server handles 8,000 requests per second.

Can you please explain why 8,000 requests per second is relevant here? Are we assuming that each user will make one request every second all day? Or that the 500M is the peak active users per second and they’re making at max one request per second?

Am I missing something, or should we be trying to calculate how many requests we’ll get per second before comparing this?


We assumed that all 500 M user requests arrive simultaneously because we want to prepare for the worst case.

I had the same question when I started looking at the YT design.
The formula itself looks confusing (it divides a number per day by a number per second).
If I were an interviewer, I would immediately stop here and start checking whether the person really understands what they are doing.
The RPS value is also quite questionable.

I have never worked at a company with a workload like YT's, so I might be wrong here, but it would be great if this specific part were revised.


Course: Grokking Modern System Design Interview for Engineers & Managers - Learn Interactively
Lesson: Requirements of YouTube's Design - Grokking Modern System Design Interview for Engineers & Managers

This is very different from what was presented on the other page (“Examples of Resource Estimation” - https://www.educative.io/courses/grokking-modern-system-design-interview-for-engineers-managers/g20Mpw6go5k). In this case the math is “number of requests/sec / RPS of server.”

The results are vastly different: DAU / RPS of each server vs. total user RPS / RPS of each server.
The first method would give 500M / 8k = 62,500 servers.
The second method would give 115k / 8k ≈ 15 servers (14.4, rounded up).

The first method makes no sense because the units don't match. How do you divide DAU by 8k RPS? For that to make any sense, we have to assume that every daily active user makes one request in the same second.

15 servers sounds more reasonable for the average load, with peak numbers 2x or so higher.
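To make the comparison concrete, here is a rough sketch of both calculations using only the numbers quoted in this thread. The 2x peak factor is just my guess from above, not something the lesson states, and the function name `servers_needed` is made up for illustration.

```python
import math

DAILY_ACTIVE_USERS = 500_000_000  # DAU figure from the lesson
SERVER_RPS = 8_000                # per-server capacity assumed in the lesson
AVERAGE_RPS = 115_000             # average load quoted from the other page
PEAK_FACTOR = 2                   # assumption: peak is roughly 2x the average


def servers_needed(load_rps: float, server_rps: float) -> int:
    """Round up, since a fractional server still means one more machine."""
    return math.ceil(load_rps / server_rps)


# Method 1: treat every daily active user as one request in the same second.
worst_case = servers_needed(DAILY_ACTIVE_USERS, SERVER_RPS)

# Method 2: divide the average requests per second by per-server capacity.
average = servers_needed(AVERAGE_RPS, SERVER_RPS)

# Method 2 with the assumed peak factor applied.
peak = servers_needed(PEAK_FACTOR * AVERAGE_RPS, SERVER_RPS)

print(f"worst case: {worst_case}, average: {average}, peak: {peak}")
```

The first method only works dimensionally if you read 500M as "requests in one second", which is exactly the assumption being questioned here.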


Lesson: https://www.educative.io/courses/grokking-modern-system-design-interview-for-engineers-managers/3jRLwVpplW9

Hi Tanner.

The number 8,000 is the requests per second (RPS) that a single server can handle. We use 500 M / 8,000 instead of 115 k / 8,000 because we are calculating for the worst-case scenario, in which all of our DAU make a request simultaneously, resulting in 500 M requests in that particular second. Hence the requests per second become 500 M, and we divide that by the RPS of a server to get the number of servers. Please feel free to reach out if you have any more questions.

Thank you