I found a practical trick to improve bulk indexing performance.
I can calculate the routing hash in my client and make sure that each bulk request contains only index requests with the same routing. Based on the routing result and the shard-to-node mapping (with IPs), I send each bulk request directly to the node holding the corresponding primary shard. This trick avoids the bulk reroute cost and reduces bulk thread pool occupation, which can otherwise cause EsRejectedExecutionException.
For example, I have 48 nodes on different machines. If I send a bulk request containing 3000 index requests to an arbitrary node, those index requests are rerouted to other nodes (usually all of them) according to their routing, and the client thread has to wait for the whole process to finish, including the local bulk processing and the bulk responses from every other node. Without the reroute phase, that network cost disappears (except for forwarding to replica nodes), so the client waits much less. Meanwhile, assuming I have only 1 replica, only 2 bulk threads are occupied per request (client -> primary shard, and primary shard -> replica shard).
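The grouping step can be sketched like this (all names are illustrative; `shardOf` is only a stand-in for the real murmur3 routing hash, which a real client must replicate exactly):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: bucket index requests by their target primary shard, so that
// each bulk request can be sent directly to one shard's node.
public class BulkBucketing {

    // Stand-in for Elasticsearch's murmur3-based routing hash.
    // hashCode() is NOT compatible with the server; it only keeps
    // this sketch self-contained.
    static int shardOf(String routing, int numPrimaryShards) {
        return Math.floorMod(routing.hashCode(), numPrimaryShards);
    }

    // Group documents (represented here by their routing values) into
    // one bucket per primary shard.
    static Map<Integer, List<String>> bucketByShard(List<String> routings, int numPrimaryShards) {
        Map<Integer, List<String>> buckets = new HashMap<>();
        for (String routing : routings) {
            int shard = shardOf(routing, numPrimaryShards);
            buckets.computeIfAbsent(shard, s -> new ArrayList<>()).add(routing);
        }
        return buckets;
    }
}
```

Each bucket then becomes one bulk request, sent to the node that hosts that primary shard.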
Routing hash:
shard_num = Math.floorMod(murmur3_hash(_routing), num_primary_shards)
(floorMod, because the murmur3 hash can be negative in Java)
See org.elasticsearch.cluster.routing.Murmur3HashFunction for the reference implementation.
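A minimal Java sketch of that hash, assuming the standard MurmurHash3 x86_32 variant with seed 0 over the routing string's UTF-16 code units in little-endian byte order (which is what Murmur3HashFunction does in recent versions; verify against your cluster's version before relying on it):

```java
public class RoutingHash {

    private static final int C1 = 0xcc9e2d51;
    private static final int C2 = 0x1b873593;

    // MurmurHash3 x86_32 over the given bytes.
    static int murmur3_x86_32(byte[] data, int seed) {
        int h = seed;
        int len = data.length;
        int roundedEnd = len & ~3;          // process full 4-byte blocks
        for (int i = 0; i < roundedEnd; i += 4) {
            int k = (data[i] & 0xff)
                  | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16)
                  | (data[i + 3] << 24);
            k *= C1;
            k = Integer.rotateLeft(k, 15);
            k *= C2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }
        int k = 0;                          // tail: remaining 0-3 bytes
        switch (len & 3) {
            case 3: k = (data[roundedEnd + 2] & 0xff) << 16;
            case 2: k |= (data[roundedEnd + 1] & 0xff) << 8;
            case 1: k |= data[roundedEnd] & 0xff;
                k *= C1;
                k = Integer.rotateLeft(k, 15);
                k *= C2;
                h ^= k;
        }
        h ^= len;                           // finalization mix
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Routing string -> bytes: each UTF-16 char as two little-endian bytes,
    // then hash with seed 0 and reduce with floorMod (hash may be negative).
    static int shardNum(String routing, int numPrimaryShards) {
        byte[] bytes = new byte[routing.length() * 2];
        for (int i = 0; i < routing.length(); i++) {
            char c = routing.charAt(i);
            bytes[i * 2] = (byte) (c & 0xff);
            bytes[i * 2 + 1] = (byte) (c >>> 8);
        }
        return Math.floorMod(murmur3_x86_32(bytes, 0), numPrimaryShards);
    }
}
```

A safe sanity check is to compare this client-side result against the actual shard Elasticsearch puts a test document on.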
The client can fetch the shard locations and index aliases from the cat APIs:
- shard info: GET _cat/shards
- alias mapping: GET _cat/aliases
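For example, `GET /_cat/shards/my-index?h=index,shard,prirep,ip` returns one whitespace-separated line per shard copy. A sketch of turning that into a primary-shard -> node-IP map (the response format is the cat API's plain-text columns; the sample in the test is illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class ShardTable {

    // Parse "_cat/shards?h=index,shard,prirep,ip" output: one line per
    // shard copy, columns separated by whitespace. Keep primaries only
    // (prirep column is "p" for primary, "r" for replica).
    static Map<Integer, String> primaryIps(String catShardsBody) {
        Map<Integer, String> byShard = new HashMap<>();
        for (String line : catShardsBody.split("\n")) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 4) continue;      // skip blank/short lines
            if (!"p".equals(cols[2])) continue; // skip replicas
            byShard.put(Integer.parseInt(cols[1]), cols[3]);
        }
        return byShard;
    }
}
```

Combined with the routing hash, this map tells the client which node should receive each bulk request.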
Some caveats:
- ES may change the default hash function between versions, which means the client code may not be version compatible.
- This trick assumes the hash results are reasonably balanced across shards.
- The client should handle fault tolerance, such as a connection timeout to the corresponding shard node.
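One simple fault-tolerance strategy: if the direct node is unreachable, fall back to sending the bulk to any other node and let Elasticsearch's normal reroute handle it. The strategy only costs performance, never correctness. A sketch with hypothetical names (`BulkSender` is an assumed abstraction over whatever HTTP client is used):

```java
import java.io.IOException;

public class FallbackSender {

    // Hypothetical sender: takes a node address and a bulk body,
    // returns the response, throws IOException on connection problems.
    interface BulkSender {
        String send(String nodeAddress, String bulkBody) throws IOException;
    }

    // Try the shard's node first; on failure, fall back to a coordinating
    // node, where the server-side reroute delivers the requests as usual.
    static String sendWithFallback(BulkSender sender, String shardNode,
                                   String coordinatingNode, String bulkBody) throws IOException {
        try {
            return sender.send(shardNode, bulkBody);
        } catch (IOException e) {
            // Any node can coordinate a bulk, so this is always safe.
            return sender.send(coordinatingNode, bulkBody);
        }
    }
}
```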