The memory estimate for the cache states:
If we follow the 80-20 rule, meaning 20% of URLs generate 80% of traffic, we would like to cache these 20% hot URLs.
Since we have 20K requests per second, we will be getting 1.7 billion requests per day:
20K * 3600 seconds * 24 hours = ~1.7 billion
To cache 20% of these requests, we will need 170GB of memory.
0.2 * 1.7 billion * 500 bytes = ~170GB
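As a quick sanity check of the book's arithmetic (a minimal sketch; the 20K read QPS and 500-byte object size are taken from the excerpt above):

```python
# Back-of-the-envelope check of the book's cache estimate.
read_qps = 20_000  # read requests per second (given in the excerpt)
url_size = 500     # bytes per cached URL object (given in the excerpt)

requests_per_day = read_qps * 3600 * 24          # seconds/hour * hours/day
cache_bytes = 0.2 * requests_per_day * url_size  # cache the 20% hot URLs

print(f"requests/day: {requests_per_day:,}")        # 1,728,000,000 ~= 1.7 billion
print(f"cache size:   {cache_bytes / 1e9:.1f} GB")  # 172.8 GB, rounded to ~170GB
```

So the 170GB figure follows directly from the daily *read* volume, which is exactly what the question below is about.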
Why are we calculating the cache estimates based on the read QPS? Should it not be the write QPS?
If we instead based the cache on the write QPS: at 200 QPS we would get 200 * 60 * 60 * 24 = ~17.3 million new URLs per day.
=> 17,280,000 * 500 bytes = ~8.64 GB of new data per day.
Caching 20% of that data would need only ~1.73 GB of memory.
Also, 30 billion * 500 bytes = 1.5 * 10^13 bytes = 15 TB, so the book's 15 TB storage figure checks out; it is not 15 PB.
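The write-side numbers above can be verified the same way (a sketch; the 200 write QPS comes from the question, and the 30 billion URLs is the usual 5-year total in this design):

```python
write_qps = 200  # new URLs per second (write QPS, from the question)
url_size = 500   # bytes per stored URL object

urls_per_day = write_qps * 3600 * 24      # 17,280,000 new URLs/day
bytes_per_day = urls_per_day * url_size   # new data written per day
cache_20pct = 0.2 * bytes_per_day         # 20% of one day's writes

five_year_urls = 30_000_000_000           # ~30 billion URLs over 5 years
total_storage = five_year_urls * url_size # total storage, all URLs

print(f"per-day data: {bytes_per_day / 1e9:.2f} GB")   # 8.64 GB, not TB
print(f"20% cache:    {cache_20pct / 1e9:.2f} GB")     # 1.73 GB, not TB
print(f"5-yr storage: {total_storage / 1e12:.0f} TB")  # 15 TB, not PB
```

Note how much smaller the write-based figure is: the cache is sized from read traffic because the same hot URL is read many times but written only once.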