Aggregating distinct values in Elasticsearch

Question

I'm trying to get the distinct values of a field and their counts in Elasticsearch.

This can be done via:

"distinct_publisher": {
    "terms": {
        "field": "publisher",
        "size": 0
    }
}

The problem I have is that it counts individual terms. If a publisher value contains a space, e.g. "Chicken Dog", and 5 documents have this value in the publisher field, then I get 5 for chicken and 5 for dog:

"buckets" : [
    {
        "key" : "chicken",
        "doc_count" : 5
    },
    {
        "key" : "dog",
        "doc_count" : 5
    },
    ...
]

But the result I want is:

"buckets" : [
    {
        "key" : "Chicken Dog",
        "doc_count" : 5
    }
]

danpaz · Accepted Answer · 2016-02-25 21:55:16Z

The reason you're getting a count of 5 for each of chicken and dog is that your documents were analyzed at the time you indexed them.

This means Elasticsearch did some light processing to turn Chicken Dog into chicken and dog (lowercasing, and tokenizing on whitespace). You can see how Elasticsearch will analyze a given piece of text into searchable tokens by using the Analyze API, for example:

curl -XGET 'localhost:9200/_analyze?text=Chicken+Dog'
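
To make the effect concrete without a running cluster, here is a minimal Python sketch that imitates the behavior described above. The simple_analyze function is only a rough stand-in (lowercase, split on whitespace) and not Elasticsearch's real analyzer, and terms_counts is a hypothetical helper that mimics how a terms aggregation counts each distinct term once per document.

```python
from collections import Counter

def simple_analyze(text):
    """Rough stand-in for analysis: lowercase, then split on whitespace."""
    return text.lower().split()

def terms_counts(values, analyzed):
    """Count documents per term, the way a terms aggregation buckets them."""
    counts = Counter()
    for value in values:
        if analyzed:
            # An analyzed field is indexed as separate tokens;
            # each distinct token counts once per document.
            terms = list(dict.fromkeys(simple_analyze(value)))
        else:
            # A not_analyzed field is indexed as one exact term.
            terms = [value]
        counts.update(terms)
    return dict(counts)

# Five documents that all have the same publisher value.
publishers = ["Chicken Dog"] * 5

analyzed_counts = terms_counts(publishers, analyzed=True)
raw_counts = terms_counts(publishers, analyzed=False)

print(analyzed_counts)
print(raw_counts)
```

With analysis on, the five documents produce two buckets (chicken and dog, 5 each); with it off, they produce a single "Chicken Dog" bucket with a count of 5.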

To aggregate over the "raw" distinct values, you need to map the field as not_analyzed so Elasticsearch doesn't do its usual processing. This reference may help. You may need to reindex your data for the not_analyzed mapping to take effect and give you the result you want.
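
As a rough sketch of what that could look like, the Python snippet below builds the two request bodies: an index mapping that marks publisher as not_analyzed, and the terms aggregation from the question. The document type name "book" and the helper names are assumptions made for illustration; the string/not_analyzed syntax matches the Elasticsearch versions current when this answer was written, while Elasticsearch 5.0 and later use "type": "keyword" for fields that should not be analyzed.

```python
import json

def build_mapping(doc_type, field):
    """Mapping body that stores `field` as a single, unanalyzed term.

    `doc_type` is a hypothetical type name; use your own.
    """
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    field: {"type": "string", "index": "not_analyzed"}
                }
            }
        }
    }

def build_distinct_agg(field, agg_name):
    """Search body with the terms aggregation from the question."""
    return {
        "size": 0,  # return no hits, only aggregation results
        "aggs": {
            agg_name: {"terms": {"field": field, "size": 0}}
        },
    }

mapping_body = build_mapping("book", "publisher")
agg_body = build_distinct_agg("publisher", "distinct_publisher")

# These bodies would be sent when creating the index and when searching.
print(json.dumps(mapping_body, indent=2))
print(json.dumps(agg_body, indent=2))
```

The mapping body would go in the request that creates the new index, and the aggregation body in a search request against it once the data has been reindexed.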

1 Comment

occurred Over a year ago

Thanks a lot! This was absolutely what I was looking for and also a detailed and very good answer.
