What parameter should be used as the partitioning key if we choose document partitioning, so that the load is distributed evenly across nodes?
Hi Deepchand,
The best parameter to use as a partitioning key depends on your data and query patterns. Common options are:
- Document ID: A good choice if the document ID has high cardinality and is evenly distributed.
- Timestamp: Suits time-series data where the query pattern typically involves a date range.
- User ID or Tenant ID: Useful in multi-tenant systems where data locality for each tenant is essential, but make sure it doesn’t lead to uneven distribution (for example, one very large tenant overloading a single partition).
- Geographic location: For location-based services, partitioning by location or region can benefit localized queries.
- Hash of a field: A hash function can be applied to a high-cardinality field to produce an even distribution. This is common when no single field naturally makes a good partition key.
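To make the hash option concrete, here is a minimal sketch in Python. It is not tied to any particular database: the function name `partition_for`, the `doc-N` IDs, and the partition count of 8 are all assumptions made up for illustration. It hashes a field with a stable hash and takes the result modulo the number of partitions, then counts how many keys land on each partition.

```python
import hashlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a stable hash.

    MD5 is used here only because it is stable across processes and
    spreads values well; it is not used for security.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then reduce to a partition index.
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical document IDs, used only to check how evenly keys spread.
doc_ids = [f"doc-{i}" for i in range(10_000)]
counts = Counter(partition_for(doc_id, 8) for doc_id in doc_ids)

for partition, count in sorted(counts.items()):
    print(partition, count)
```

A stable hash such as one from `hashlib` is used instead of Python's built-in `hash()`, because `hash()` on strings is randomized per process by default and would send the same key to different partitions after a restart. Also note that with plain modulo, changing the number of partitions remaps most keys.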
When a suitable single partitioning key is hard to determine, consider a composite key made from multiple fields. This can improve distribution and align with complex query patterns.
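As a rough illustration of a composite key, the sketch below combines a tenant ID and a document ID before hashing. The names `composite_key` and `partition_for_composite`, the `acme` tenant, and the partition count of 4 are hypothetical and chosen only for the example.

```python
import hashlib


def composite_key(tenant_id: str, doc_id: str) -> str:
    """Join two fields into one partition key (the separator is an arbitrary choice)."""
    return f"{tenant_id}:{doc_id}"


def partition_for_composite(tenant_id: str, doc_id: str, num_partitions: int) -> int:
    """Hash the composite key with a stable hash and reduce it to a partition index."""
    key = composite_key(tenant_id, doc_id)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# One hypothetical large tenant: its documents spread over several partitions
# instead of all landing on the partition that the tenant ID alone would select.
used = {partition_for_composite("acme", f"doc-{i}", 4) for i in range(1_000)}
print(sorted(used))
```

The trade-off is that a query for a single tenant now has to read from several partitions, so this fits best when even write distribution matters more than per-tenant locality.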
Thank you.