educative.io

Requirements of a Newsfeed System’s Design - Why is each newsfeed counted separately in the estimation?

Example:

Textual post’s storage estimation: All posts could contain some text, we assume it’s 50KB on average. The storage estimation for the top 200 posts for 500 million users would be:
200×500M×50KB=5PB

So if two users see the same post in their generated newsfeeds, do we store the 50KB separately for each of them (duplicating the data)?
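For reference, the quoted figure can be reproduced with a quick calculation. This is only a sanity check of the arithmetic, assuming decimal units (1 KB = 10^3 bytes, 1 PB = 10^15 bytes); the variable names are illustrative.

```python
# Assumed decimal units: 1 KB = 10**3 bytes, 1 PB = 10**15 bytes.
KB = 10**3
PB = 10**15

posts_per_feed = 200        # top posts kept per user (from the lesson)
users = 500_000_000         # 500 million users (from the lesson)
avg_post_size = 50 * KB     # 50 KB of text per post (from the lesson)

# Every feed is counted on its own, so a post that appears in two feeds
# contributes its 50 KB twice in this estimate.
total_bytes = posts_per_feed * users * avg_post_size
print(f"{total_bytes / PB:.0f} PB")  # prints "5 PB"
```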


Course: Grokking Modern System Design Interview for Engineers & Managers
Lesson: Requirements of a Newsfeed System’s Design

Dear Pavel Kletskov,

I commend your thorough reading of the material. Allow me to provide some clarification.

In the detailed design of the newsfeed system, we use a database to store the posts published by users. When the web server receives a newsfeed request, it takes one of two paths:

  • If the user doesn’t frequently visit the platform, the web server calls the newsfeed generation service to dynamically generate feeds specifically for that user upon their request.
  • For active users who visit the platform regularly, the web server retrieves the pre-generated newsfeed that has already been prepared for them.
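The two paths above can be sketched roughly as follows. This is a minimal illustration, not the course's implementation: the class name `NewsfeedServer`, the in-memory dictionaries, and the "newest ID first" ranking are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NewsfeedServer:
    # user_id -> pre-generated feed (list of post IDs); hypothetical cache
    feed_cache: dict = field(default_factory=dict)
    # user_id -> candidate post IDs; stands in for the post database
    candidate_posts: dict = field(default_factory=dict)

    def generate_feed(self, user_id, limit=200):
        """Stand-in for the newsfeed generation service.

        Ranking is a placeholder: higher post ID is treated as newer.
        """
        candidates = self.candidate_posts.get(user_id, [])
        return sorted(candidates, reverse=True)[:limit]

    def pregenerate(self, user_id):
        """Prepare and cache a feed ahead of time (active users)."""
        self.feed_cache[user_id] = self.generate_feed(user_id)

    def get_feed(self, user_id):
        """Return (feed, source) for a newsfeed request."""
        if user_id in self.feed_cache:
            # Active user: serve the feed that was prepared in advance.
            return self.feed_cache[user_id], "cache"
        # Infrequent user: build the feed on demand.
        return self.generate_feed(user_id), "on-demand"


server = NewsfeedServer(candidate_posts={"active": [3, 1, 2], "rare": [7, 9]})
server.pregenerate("active")
print(server.get_feed("active"))  # ([3, 2, 1], 'cache')
print(server.get_feed("rare"))    # ([9, 7], 'on-demand')
```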

The lesson also addresses the point you raised, in the form of this question: “The creation and storage of newsfeeds for each user in the cache require an enormous amount of memory (step 5 in the above section). Is there any way to reduce this memory consumption?”

I hope the answer to that question clarifies the concept.
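As a rough illustration of one common way to avoid duplicating post text (an assumption for this example, not necessarily the answer given in the lesson), each feed can hold only post IDs while the text of every post is stored once. The 8-byte ID size, the sample posts, and the names `post_store`, `feeds`, and `render_feed` are hypothetical.

```python
GB = 10**9  # assumed decimal units

posts_per_feed = 200       # from the lesson
users = 500_000_000        # from the lesson
post_id_size = 8           # bytes; assumed 64-bit post ID

# Feeds that hold IDs only: 200 x 500M x 8 B.
feed_index_bytes = posts_per_feed * users * post_id_size
print(feed_index_bytes // GB, "GB")  # prints "800 GB"

# Each post's text is stored once; feeds reference it by ID.
post_store = {101: "text of post 101", 102: "text of post 102"}
feeds = {"alice": [102, 101], "bob": [102]}


def render_feed(user_id):
    """Resolve a user's post IDs to post text at read time."""
    return [post_store[post_id] for post_id in feeds[user_id]]


print(render_feed("alice"))  # ['text of post 102', 'text of post 101']
print(render_feed("bob"))    # ['text of post 102']
```

Under these assumptions, post 102 appears in both feeds but its text exists only once in `post_store`; the storage for the text itself then depends on the number of distinct posts, which the estimate above does not state.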

Happy learning!