educative.io

Checking calculation

Can someone confirm if the below calculation looks good?
Assumptions:
DAU = 300 M
50% of users use Twitter daily
Requests per day on average = 20
Users post 2 tweets per day on average.
10% of tweets contain media.
Data is stored for 5 years.
A single Twitter server can handle approximately 6k requests per second (RPS)
Around 50% of tweets contain images and around 5% contain videos.
Assume that a single user views 50 tweets in a day

Estimating the Number of Servers (web):
Total Requests per day = 300 M * 50% * 20 = 3B
Total Requests per sec (t) = 3B / 86400 = 35000 (approx)
Peak hours Requests per sec = 2 * t = 70000
Number of servers required in a single data center = Number of requests per sec / RPS = 70000 / 6000 = 12
Accounting for redundancy (3x Redundancy) = 12 * 3 = 36
Accounting for geographic distribution (4 data centers), number of servers required = 36 * 4 = 144
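
For anyone who wants to re-check the server arithmetic above, here is a minimal Python sketch. The variable names are my own, and it assumes the 4x multiplier stands for 4 data centers and that the per-data-center count is rounded up to a whole server.

```python
import math

# Assumptions copied from the post; names are illustrative only.
total_users = 300_000_000     # the post's "DAU = 300 M"
daily_fraction = 0.5          # 50% of users use Twitter daily
requests_per_user = 20        # average requests per user per day
server_rps = 6_000            # requests one server handles per second
peak_factor = 2               # peak traffic taken as 2x the average
redundancy = 3                # 3x redundancy
data_centers = 4              # geographic multiplier (assumed to mean 4 data centers)
SECONDS_PER_DAY = 86_400

requests_per_day = total_users * daily_fraction * requests_per_user
avg_rps = requests_per_day / SECONDS_PER_DAY
peak_rps = peak_factor * avg_rps

# Round up: a fraction of a server still needs a whole machine.
servers_one_dc = math.ceil(peak_rps / server_rps)
servers_total = servers_one_dc * redundancy * data_centers

print(f"average RPS: {avg_rps:,.0f}, peak RPS: {peak_rps:,.0f}")
print(f"servers per data center: {servers_one_dc}, total: {servers_total}")
```

The sketch uses the unrounded average, so the intermediate values come out slightly below the rounded 35000 and 70000 in the post, but the server counts are the same.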

Query per second (QPS) estimate:
Daily active users (DAU) = 300 million * 50% = 150 million
Tweets QPS = 150 million * 2 tweets / 24 hour / 3600 seconds = ~3500
Peak QPS = 2 * QPS = ~7000
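
The QPS figures can be reproduced the same way. This is only a sketch of the arithmetic above, with names I made up and the 2x peak factor taken from the post.

```python
# Assumptions copied from the post; names are illustrative only.
active_users = 150_000_000    # 300 million * 50%
tweets_per_user = 2           # tweets posted per user per day
peak_factor = 2               # peak taken as 2x the average
SECONDS_PER_DAY = 24 * 3600

tweet_qps = active_users * tweets_per_user / SECONDS_PER_DAY
peak_tweet_qps = peak_factor * tweet_qps

print(f"tweet QPS: {tweet_qps:,.0f}, peak tweet QPS: {peak_tweet_qps:,.0f}")
```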

Storage estimate:
Total tweets per day = 300 M * 50% * 2 (tweets per day) = 300 M tweets per day
Storage required per tweet = 150 B
Storage required per image = 250 KB
Storage required per video = 5 MB
Storage for tweets = 300 M * 150B = 45 GB
Storage for images = 300 M * 50% * 250KB = 37.5 TB
Storage for videos = 300 M * 5% * 5MB = 75 TB
Total storage per day = 112.5 TB
5-year media storage = 112.5 TB * 365 * 5 = 205 PB
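
Here is a small Python sketch of the storage arithmetic above, in case it helps with checking. It assumes decimal units (1 KB = 1000 B, and so on), which is what the hand calculation appears to use. The names are mine.

```python
# Decimal unit sizes in bytes (assumption: 1 KB = 1000 B).
KB, MB, GB, TB, PB = 10**3, 10**6, 10**9, 10**12, 10**15

# Assumptions copied from the post; names are illustrative only.
tweets_per_day = 300_000_000
tweet_size = 150              # bytes per tweet
image_size = 250 * KB
video_size = 5 * MB
image_fraction = 0.5          # 50% of tweets contain an image
video_fraction = 0.05         # 5% of tweets contain a video
years = 5

tweet_bytes = tweets_per_day * tweet_size
image_bytes = tweets_per_day * image_fraction * image_size
video_bytes = tweets_per_day * video_fraction * video_size
daily_bytes = tweet_bytes + image_bytes + video_bytes
five_year_bytes = daily_bytes * 365 * years

print(f"tweets: {tweet_bytes / GB:.1f} GB per day")
print(f"images: {image_bytes / TB:.1f} TB per day")
print(f"videos: {video_bytes / TB:.1f} TB per day")
print(f"total:  {daily_bytes / TB:.1f} TB per day")
print(f"{years}-year total: {five_year_bytes / PB:.0f} PB")
```

The tweet text itself adds only about 0.045 TB per day, which is why the daily total is effectively the media total.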

Bandwidth estimate:
Incoming traffic bandwidth = (112.5 TB * 8) / 86400 = ~10.4 Gbps
Outgoing traffic:
DAU = 150 million
Daily tweets viewed = 50 per user
Tweets viewed / second = (150M * 50) / 86400 = ~87K
Bandwidth required for tweets = 87000 * 150 B * 8 = ~0.1 Gbps
Bandwidth required for images = 87000 * 0.5 * 250 KB * 8 = 87 Gbps
Bandwidth required for videos = 87000 * 0.05 * 5 MB * 8 = 174 Gbps
Total outgoing bandwidth = (0.1 + 87 + 174) Gbps = ~261 Gbps
Total bandwidth requirements = 261 + 10.4 = ~271 Gbps
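
The bandwidth numbers can be checked with a sketch like the one below. It makes the bytes-to-bits conversion (the factor of 8) explicit on every line and assumes decimal units. The names are mine, and it uses the unrounded views-per-second value, so the results can differ slightly from the hand-rounded figures.

```python
# Assumptions copied from the post; names are illustrative only.
SECONDS_PER_DAY = 86_400
BITS_PER_BYTE = 8
GBPS = 10**9                       # bits per second in 1 Gbps (decimal)

daily_ingest_bytes = 112.5e12      # 112.5 TB stored per day
active_users = 150_000_000
views_per_user = 50                # tweets viewed per user per day
tweet_size = 150                   # bytes
image_size = 250_000               # 250 KB
video_size = 5_000_000             # 5 MB
image_fraction = 0.5
video_fraction = 0.05

incoming_bps = daily_ingest_bytes * BITS_PER_BYTE / SECONDS_PER_DAY

views_per_sec = active_users * views_per_user / SECONDS_PER_DAY
tweet_bps = views_per_sec * tweet_size * BITS_PER_BYTE
image_bps = views_per_sec * image_fraction * image_size * BITS_PER_BYTE
video_bps = views_per_sec * video_fraction * video_size * BITS_PER_BYTE
outgoing_bps = tweet_bps + image_bps + video_bps
total_bps = incoming_bps + outgoing_bps

print(f"incoming: {incoming_bps / GBPS:.1f} Gbps")
print(f"outgoing: {outgoing_bps / GBPS:.1f} Gbps")
print(f"total:    {total_bps / GBPS:.1f} Gbps")
```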


Course: https://www.educative.io/collection/10370001/4941429335392256
Lesson: https://www.educative.io/collection/page/10370001/4941429335392256/4766860129599488

Hi Ritesh.

For your estimation of the required servers, kindly see our previous reply to your query at this link.

As for your storage and bandwidth estimations, the formulas and calculations are correct. However, as per your own assumption, the DAU should be 300M. Multiplying the DAU by 50% effectively means that, in your estimation, the DAU is 150M.

Remember, you can assume any number for the DAU, but DAU, by definition, means the number of users active on the service daily.
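
To illustrate the point about DAU, here is a rough Python sketch that treats DAU as the single input and leaves the other assumptions from the question unchanged. The function name and its fields are hypothetical and only show how the estimates scale with the DAU you pick.

```python
SECONDS_PER_DAY = 86_400

def daily_estimates(dau, tweets_per_user=2):
    """Scale the question's per-day estimates from a given DAU.

    Hypothetical helper: per-tweet sizes and media fractions are the
    question's assumptions (150 B text, 50% with a 250 KB image,
    5% with a 5 MB video), in decimal units.
    """
    tweets_per_day = dau * tweets_per_user
    # Average bytes stored per tweet, including its share of media.
    avg_bytes_per_tweet = 150 + 0.5 * 250_000 + 0.05 * 5_000_000
    return {
        "tweets_per_day": tweets_per_day,
        "tweet_qps": tweets_per_day / SECONDS_PER_DAY,
        "storage_tb_per_day": tweets_per_day * avg_bytes_per_tweet / 1e12,
    }

# DAU as effectively used in the question (300M * 50%) vs. DAU as stated.
for dau in (150_000_000, 300_000_000):
    est = daily_estimates(dau)
    print(f"DAU {dau:,}: {est['tweets_per_day']:,} tweets/day, "
          f"{est['tweet_qps']:,.0f} QPS, "
          f"{est['storage_tb_per_day']:.1f} TB/day")
```

Every quantity here is linear in the DAU, so taking the DAU as 300M instead of 150M doubles each estimate.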

If you have any further questions, feel free to reach out to us.

Thank you.