I'm using custom routing on Elasticsearch 8.7.1, and it seems that routing_partition_size is not applied as expected unless I manually specify number_of_routing_shards.
For example, if I do not specify the number_of_routing_shards and use the following settings:
{ "index": { "number_of_shards": "128", "routing_partition_size": "5", "number_of_replicas": "1" } }
and then migrate 1 million docs with the _routing field set, the number of shards that actually end up being used is only 1.
Now if I add the number_of_routing_shards = number_of_shards like this:
{ "index": { "number_of_shards": "128", "number_of_routing_shards": "128", "routing_partition_size": "5", "number_of_replicas": "1" } }
Then the expected number of shards (5, as specified by routing_partition_size) is used for each routing key on search. Similarly, if I set number_of_routing_shards = 2 * number_of_shards, the number of shards used per routing key comes out to routing_partition_size / 2.
It looks like there is a connection between number_of_routing_shards and number_of_shards, but I could not find any information online for why this is the case.
I know that the routing formula (when custom routing is used) is as follows:
routing_value = hash(_routing) + hash(_id) % routing_partition_size
shard_num = (routing_value % num_routing_shards) / routing_factor
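To make the formula concrete, here is a minimal Python sketch of how I understand it. Several things in it are assumptions on my part: the hash is a stand-in (Elasticsearch uses Murmur3, but only determinism matters here), routing_factor is taken to be num_routing_shards / num_primary_shards as the docs describe, the routing key "user-42" and the doc IDs are made up, and 1024 is only my guess at what the default number_of_routing_shards might be for 128 primary shards.

```python
import hashlib


def stand_in_hash(value: str) -> int:
    # Stand-in for Elasticsearch's Murmur3 hash (assumption: any
    # deterministic, well-mixed integer hash illustrates the same effect).
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def shard_num(routing, doc_id, num_primary_shards,
              num_routing_shards, routing_partition_size):
    # routing_factor = num_routing_shards / num_primary_shards
    routing_factor = num_routing_shards // num_primary_shards
    # % binds tighter than +, so only hash(_id) is reduced modulo the partition size.
    routing_value = (stand_in_hash(routing)
                     + stand_in_hash(doc_id) % routing_partition_size)
    return (routing_value % num_routing_shards) // routing_factor


def shards_used(routing, num_primary_shards, num_routing_shards,
                routing_partition_size, num_docs=1000):
    # Distinct shards hit by many documents that share one routing key.
    return {
        shard_num(routing, f"doc-{i}", num_primary_shards,
                  num_routing_shards, routing_partition_size)
        for i in range(num_docs)
    }


if __name__ == "__main__":
    # 1024 is a guessed default for 128 primary shards, not a confirmed value.
    for n in (128, 256, 1024):
        used = shards_used("user-42", 128, n, 5)
        print(f"num_routing_shards={n}: {len(used)} shard(s) -> {sorted(used)}")
```

In this sketch, routing_value only ever takes routing_partition_size consecutive values for a given routing key, and the integer division by routing_factor then maps several of those neighbouring values onto the same shard, which at least matches the pattern I am seeing.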
So why does number_of_routing_shards affect how many shards a routing key is spread across?
I have tried many different configurations and read the Elasticsearch documentation.