Metadata service: Why is a load balancer needed between the front end and the metadata service?

Aparna_Rajan · February 19, 2024, 3:31am

If the metadata that needs to be stored is small and can reside on a single machine, then it’s replicated on each cluster server. Subsequently, the request can be served from any random server. In this approach, a load balancer can also be introduced between the front-end servers and metadata services.

Does the above paragraph mean that the metadata store can be replicated? It sounds like the data store is a separate component from the metadata service. When the data store is replicated to multiple nodes, the service can reach out to any of these nodes, so the load is spread between the metadata service and the data store servers. But what is the point of a load balancer between the front end and the metadata service? That makes sense only when multiple instances of the metadata service are running, but here we are talking about multiple instances of the data store.

Ali_Hassan · March 5, 2024, 4:58am

Hi Aparna,

There are no separate datastore servers in this design. We have data stores, such as databases, that hold the data. The metadata services access these data stores to perform metadata-related CRUD operations.

Now, there can be millions of metadata-related requests, and to handle them we run multiple instances of the metadata service. To spread the load across those instances, we place a load balancer between the front-end servers and the metadata servers so that no single server instance is overburdened.
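To make this concrete, here is a minimal Python sketch of the idea, not the actual implementation from the lesson. The class names `MetadataService` and `RoundRobinLoadBalancer`, the round-robin policy, and the in-memory dict standing in for the data store are all assumptions made for illustration.

```python
from itertools import cycle


class MetadataService:
    """Hypothetical metadata service instance that reads from a shared data store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store   # dict standing in for the database (assumption)
        self.handled = 0     # number of requests this instance has served

    def get(self, key):
        self.handled += 1
        return self.store.get(key)


class RoundRobinLoadBalancer:
    """Sits between the front end and the metadata service instances."""

    def __init__(self, instances):
        self._instances = cycle(instances)

    def route(self, key):
        # Hand each incoming request to the next instance in turn.
        instance = next(self._instances)
        return instance.name, instance.get(key)


# One data store, three metadata service instances in front of it.
store = {"file-42": {"owner": "aparna", "size_bytes": 1024}}
services = [MetadataService(f"metadata-{i}", store) for i in range(3)]
balancer = RoundRobinLoadBalancer(services)

# The front end only talks to the balancer, never to a specific instance.
results = [balancer.route("file-42") for _ in range(6)]
counts = [s.handled for s in services]
```

In this sketch, six requests end up split evenly across the three service instances, while every instance returns the same metadata because they all read from the same store. Real load balancers typically use richer policies (least connections, health checks), which are left out here.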

We hope that clears up the confusion. Feel free to reach out to us if you have any further questions.

Thank you,