4

Just wondering why elasticsearch still use that simple routing value approach for deciding which shard the data must be stored to. Actually this approach is limiting us to change the number of shards in the future. If elasticsearch uses an approach like consistent hashing (or even better technique), it can give us a chance to change the shard number in the future. Anyone have explanation or idea about this?

3
  • 1
    It is not possible to change the number of shards after having created an index, so that's a non-issue :-) Commented Sep 15, 2017 at 9:24
  • @Val it isn't possible because elasticsearch uses this routing value, right? Commented Sep 15, 2017 at 9:40
  • Yes, but that's a design choice made by the ES folks and they won't change that in the near future. Commented Sep 15, 2017 at 9:47

1 Answer 1

6

As of Elasticsearch release 6.1.0, index splitting is possible. See release note: https://www.elastic.co/blog/elasticsearch-6-1-0-released.

The Split Index documentation actually explains why Elasticsearch doesn't use Consistent Hashing in more detail.

Consistent hashing only requires 1/N-th of the keys to be relocated when growing the number of shards from N to N+1. However Elasticsearch’s unit of storage, shards, are Lucene indices. Because of their search-oriented data structure, taking a significant portion of a Lucene index, be it only 5% of documents, deleting them and indexing them on another shard typically comes with a much higher cost than with a key-value store.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.