In the design pastebin case, it says below that cache size should be 20% of the read size. I get the 20-80 rule but why read size? Isn’t write size what you are actually storing? Like you can read one entry 5 times, but you only need to store it once, right?
Memory estimates: We can cache some of the hot pastes that are frequently accessed. Following the 80-20 rule, meaning 20% of hot pastes generate 80% of traffic, we would like to cache these 20% pastes.
Since we have 5M read requests per day, to cache 20% of these requests, we would need:
0.2 * 5M * 10KB ~= 10 GB
Course: Grokking the System Design Interview - Learn Interactively
Lesson: Designing Pastebin - Grokking the System Design Interview