If we follow the 80-20 rule, meaning 20% of URLs generate 80% of the traffic, we would like to cache these 20% hot URLs.
How do we decide this 20% factor?
Isn't it altogether a different problem? Or does this feature come built-in with caches?
Hi @Ayush_Chaubey,
The 80-20 rule is a general rule of thumb for estimation. From Wikipedia, this is how it is defined:
The Pareto principle (also known as the 80/20 rule, or the law of the vital few) states that, for many events, roughly 80% of the effects come from 20% of the causes.
Here we are trying to "estimate" how many URLs we should try to cache. You can read more about this rule here: https://en.wikipedia.org/wiki/Pareto_principle
This estimation is a good starting point, provided we have enough resources. We can then increase or decrease the cache size based on the traffic pattern of the service, but that would require more measurement.
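To make the estimation concrete, here is a back-of-envelope sizing sketch under the 80/20 assumption. All the specific numbers (request volume, entry size) are hypothetical placeholders for illustration, not figures from this discussion:

```python
# Back-of-envelope cache sizing under the 80/20 assumption.
# The inputs below are assumed values, chosen only to illustrate the math.

unique_urls_per_day = 20_000_000  # assumed: distinct URLs requested per day
entry_size_bytes = 500            # assumed: average size of one cached URL mapping

hot_fraction = 0.20               # the "20%" from the Pareto rule
cache_entries = int(unique_urls_per_day * hot_fraction)
cache_size_gb = cache_entries * entry_size_bytes / 10**9

print(cache_entries)   # number of hot URLs to cache
print(cache_size_gb)   # approximate cache memory needed, in GB
```

With these assumed inputs the sketch suggests caching roughly 4 million entries (about 2 GB); plugging in your service's measured numbers gives the actual starting size.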
Hope this answers your question.
Hey, thanks for the reply.
I got the point about the Pareto principle. Thanks for sharing this.
I am also interested in knowing:
"How does a cache maintain this 80:20 rule?"
i.e. How does a cache system know that this URL (part of the 20%) is going to generate 80% of the traffic? This all depends on usage patterns.
So is this kind of feature built into caches, or do we need to implement something for this?
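One common way caches approximate this without knowing the hot set up front is an eviction policy such as LRU (least recently used), which many cache systems ship with (for example, Redis's `maxmemory-policy allkeys-lru` setting). A minimal sketch of the LRU idea, assuming a tiny capacity and made-up URL keys purely for illustration:

```python
from collections import OrderedDict

# Minimal LRU cache sketch: the cache never needs to know in advance
# which 20% of URLs are hot. Every access moves an entry to the
# "recently used" end, so rarely accessed (cold) entries drift to the
# other end and are the ones evicted when capacity is exceeded.

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict least recently used

# Hypothetical short-URL keys, capacity of 2 for demonstration.
cache = LRUCache(2)
cache.put("abc123", "https://example.com/long-url")
cache.put("xyz789", "https://example.com/other")
cache.get("abc123")                          # touch: abc123 is now "hot"
cache.put("new456", "https://example.com/new")  # evicts the cold xyz789
print(cache.get("xyz789"))  # None: cold entry was evicted
print(cache.get("abc123"))  # hot entry survived
```

So the 80/20 behavior emerges from the usage pattern itself rather than being configured explicitly; whether you rely on a built-in policy or implement your own depends on the cache system you pick.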