The reason your initial calculation is giving you inaccurate results *isn't* primarily that (User_RPS)/(Server_RPS) fails to account for a request needing multiple servers; it's that you've incorrectly estimated the average throughput of a mix of CPU-bound and memory-bound workloads.

Instead of taking the mean of these RPS figures, take the minimum of them. To see why, imagine we get 1,000 requests, half of which are memory-bound and half of which are CPU-bound. The total time to process them on one server won't be the (1000/4180) ≈ 0.24 seconds that the averaged figure predicts; the memory-bound half alone takes (500/360) ≈ 1.39 seconds, roughly six times as long.
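
To make the mismatch concrete, here is a minimal sketch in Python, using the 8,000 RPS (CPU-bound) and 360 RPS (memory-bound) per-server figures implied above; the 50/50 request split is just the example from this post:

```python
# Per-server throughput figures used above (requests per second).
CPU_BOUND_RPS = 8_000
MEMORY_BOUND_RPS = 360

cpu_requests = 500
memory_requests = 500
total_requests = cpu_requests + memory_requests

# What averaging the two figures predicts: every request runs at the mean rate.
mean_rps = (CPU_BOUND_RPS + MEMORY_BOUND_RPS) / 2             # 4,180 RPS
mean_estimate = total_requests / mean_rps                     # ~0.24 s

# What a single server actually does: each class is limited by its own rate,
# and the memory-bound half dominates the total.
actual_time = cpu_requests / CPU_BOUND_RPS + memory_requests / MEMORY_BOUND_RPS  # ~1.45 s

# Taking the minimum is conservative, but it lands in the right order of magnitude.
min_estimate = total_requests / min(CPU_BOUND_RPS, MEMORY_BOUND_RPS)             # ~2.78 s

print(f"mean-based estimate: {mean_estimate:.2f} s")
print(f"actual time:         {actual_time:.2f} s")
print(f"min-based estimate:  {min_estimate:.2f} s")
```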

Instead of fixing this fundamental problem with your calculations, you introduced a completely arbitrary formula, one that makes no mathematical sense unless you assume there will be a single second in which every user makes a request.

Although the number you get out the other end is admittedly closer to reality, the formula itself is completely arbitrary. It should be replaced with something built on a rigorous mathematical foundation (for example, an estimate of how many servers are needed to handle and fan out an individual request).
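
If I had to sketch what a more defensible version might look like (purely my own back-of-the-envelope shape, not anything from the lesson), I'd combine the per-class throughputs into an effective per-server rate and then size the fleet from the peak request rate and a fan-out factor; the example numbers at the bottom are made up:

```python
import math

def effective_server_rps(cpu_fraction: float, cpu_rps: float, mem_rps: float) -> float:
    """Combine per-class throughputs: each request class is limited by its own rate."""
    mem_fraction = 1.0 - cpu_fraction
    seconds_per_request = cpu_fraction / cpu_rps + mem_fraction / mem_rps
    return 1.0 / seconds_per_request

def servers_needed(peak_user_rps: float, fan_out: float, cpu_fraction: float,
                   cpu_rps: float = 8_000, mem_rps: float = 360) -> int:
    """Size the fleet from peak load and per-request fan-out, not from daily users."""
    internal_rps = peak_user_rps * fan_out   # each user request touches `fan_out` servers
    return math.ceil(internal_rps / effective_server_rps(cpu_fraction, cpu_rps, mem_rps))

# Hypothetical example: 100k user requests/s at peak, each fanned out to ~3
# internal services, with half of the internal work CPU-bound and half memory-bound.
print(servers_needed(peak_user_rps=100_000, fan_out=3, cpu_fraction=0.5))   # ~436 servers
```

At the very least this makes the peak-load and fan-out assumptions explicit instead of hiding them inside an averaged constant.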

The quality of the mathematics in this section makes me regret paying for Educative. This isn't hard stuff; it's high-school-level maths.

EDIT: Silly mistake in my original post. The point still stands though.

Course: Grokking Modern System Design Interview for Engineers & Managers

Lesson: How the Domain Name System Works