- The APIs are accessible through a cluster, so the rate limit should be considered across different servers. The user should get an error message whenever the defined threshold is crossed within a single server or across a combination of servers.
Does it make sense to get an error message whenever the defined threshold is crossed within a single server? Shouldn’t the limit always be enforced per cluster? If we are unlucky and all requests happen to go to a single server because the load balancer did not distribute them evenly, should we still get a rate limit error?
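One possible reading of the requirement is that the threshold is a single cluster-wide number, and every server checks it against a shared counter. Under that reading, how requests are routed does not matter: the user is limited once the total crosses the threshold, whether one server saw all the requests or several servers each saw a few. The sketch below is only an illustration of that reading. `SharedCounterStore`, `Server`, and `simulate` are hypothetical names, the in-memory store stands in for whatever shared storage a real cluster would use, and a simple fixed window is assumed.

```python
import threading
from collections import defaultdict


class SharedCounterStore:
    """Hypothetical in-memory stand-in for a store shared by all servers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = defaultdict(int)

    def increment(self, key):
        # Atomically add one and return the new total for this key.
        with self._lock:
            self._counts[key] += 1
            return self._counts[key]


class Server:
    """One node in the cluster. It keeps no counter of its own."""

    def __init__(self, name, store, limit, window_seconds):
        self.name = name
        self.store = store
        self.limit = limit
        self.window_seconds = window_seconds

    def handle(self, user_id, now):
        # Assumed fixed-window scheme: one counter per user per window.
        window = int(now // self.window_seconds)
        total = self.store.increment((user_id, window))
        # Allowed only while the cluster-wide total is within the limit.
        return total <= self.limit


def simulate(routing, limit=3):
    """Send one request per entry in `routing` to the named server."""
    store = SharedCounterStore()
    servers = {name: Server(name, store, limit, 60) for name in set(routing)}
    return [servers[name].handle("user-1", now=0.0) for name in routing]


# All four requests land on one server (uneven load balancing).
skewed = simulate(["a", "a", "a", "a"])
# The same four requests are spread over three servers.
balanced = simulate(["a", "b", "c", "a"])
```

In both runs the fourth request is the one rejected, so with a shared counter "within a single server" and "across a combination of servers" are the same check. The two cases only differ if each server keeps its own local counter.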