Elasticsearch - Aggregation on multiple fields in the same nested scope

Question

I'm aggregating product search results by tags, each of which has a name field and an ID field. How do I get both fields back in an aggregation bucket? I can get one or the other, but I can't figure out how to get both. Scripting is disabled on my cluster, so I can't use a script.

Here's my product mapping (simplified for this question):

"mappings": {
    "products": {
        "properties": { 
            "title": {
                "type": "string"
            },
            "description": {
                "type": "string"
            },
            "topics": {
                "properties": {
                    "id": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "name": {
                        "type" : "string",
                        "index": "not_analyzed"
                    }
                }
            }
        }
    }
}

Here's my query:

"query": {
    "multi_match": {
        "query": "Testing 1 2 3",
        "fields": ["title", "description"]
    }
},
"aggs": {
    "Topics": {
        "terms": {
            "field": "topics.id",
            "size": 15
        }
    }
}

My aggregation buckets look like this:

...the "key" value in the first bucket is the topics.id field value. Is there a way to add my topics.name field to the bucket?

Rhumborl · Accepted Answer · 2021-09-27 15:30:39Z

To return both fields, each (id, name) pair has to act as a single unique bucket, which requires an association between an id and its name. Without a nested mapping, Elasticsearch flattens the array of topic objects, so the ids and the names are indexed as separate arrays and that association is lost. Hence, you need to map topics as nested:

"topics": {
    "type": "nested",
    "properties": {
        "id": {
            "type": "string",
            "index": "not_analyzed"
        },
        "name": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}
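To illustrate why the association is lost without a nested mapping, here is a minimal Python sketch that mimics the flattening of an object array into one value list per sub-field. It does not talk to Elasticsearch; `flatten_object_array` is a hypothetical helper and the two topics are made-up sample data.

```python
def flatten_object_array(field, objects):
    """Mimic a non-nested object array: one value list per sub-field."""
    flattened = {}
    for obj in objects:
        for key, value in obj.items():
            # Values from different objects end up in the same list,
            # so nothing records which id belonged to which name.
            flattened.setdefault(f"{field}.{key}", []).append(value)
    return flattened


# Made-up sample data, shaped like the "topics" field in the mapping.
topics = [
    {"id": "123", "name": "topic1"},
    {"id": "456", "name": "topic2"},
]

flat = flatten_object_array("topics", topics)
print(flat)
# {'topics.id': ['123', '456'], 'topics.name': ['topic1', 'topic2']}
```

With the nested type, each topic object is indexed as its own hidden document, so an aggregation run inside the nested scope sees each id together with its own name.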

For aggregation on multiple fields, you need to use sub-aggregations.

Here is a sample aggregation query:

{
  "aggs": {
    "topics_agg": {
      "nested": {
        "path": "topics"
      },
      "aggs": {
        "name": {
          "terms": {
            "field": "topics.id"
          },
          "aggs": {
            "name": {
              "terms": {
                "field": "topics.name"
              }
            }
          }
        }
      }
    }
  }
}
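As a rough sketch of how the pieces fit together, the Python snippet below builds one request body that combines the multi_match query from the question with the nested aggregation. `build_search_body` is a hypothetical helper, and applying the question's size of 15 to the id terms aggregation is an assumption. Sending the body to a cluster is left out.

```python
import json


def build_search_body(text, size=15):
    """Combine the question's multi_match query with the nested aggregation."""
    return {
        "query": {
            "multi_match": {"query": text, "fields": ["title", "description"]}
        },
        # "aggs" is a sibling of "query" in the request body, not a child of it.
        "aggs": {
            "topics_agg": {
                "nested": {"path": "topics"},
                "aggs": {
                    # Outer terms aggregation: one bucket per topic id.
                    "name": {
                        "terms": {"field": "topics.id", "size": size},
                        # Inner terms aggregation: the names seen with that id.
                        "aggs": {"name": {"terms": {"field": "topics.name"}}},
                    }
                },
            }
        },
    }


body = build_search_body("Testing 1 2 3")
print(json.dumps(body, indent=2))
```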

Sample aggregation result:

    "aggregations": {
          "topics_agg": {
             "doc_count": 5,
             "name": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                   {
                      "key": 123,
                      "doc_count": 6,
                      "name": {
                         "doc_count_error_upper_bound": 0,
                         "sum_other_doc_count": 0,
                         "buckets": [
                            {
                               "key": "topic1",
                               "doc_count": 3
                            },
                            {
                               "key": "topic2",
                               "doc_count": 3
                            }
                         ]
                      }
                   },
                   {
                      "key": 456,
                      "doc_count": 2,
                      "name": {
                         "doc_count_error_upper_bound": 0,
                         "sum_other_doc_count": 0,
                         "buckets": [
                            {
                               "key": "topic1",
                               "doc_count": 2
                            }
                         ]
                      }
                   },
..............

Note: for id 123 there are multiple name buckets, because there are multiple name values for that id. To get separate unique buckets, combine each parent (id) bucket with each of its child (name) buckets.

For example: 123-topic1, 123-topic2, 456-topic1
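The parent-child combinations can be produced client-side. The following Python sketch walks a response shaped like the sample result and emits one (id, name, doc_count) tuple per combination. `id_name_pairs` is a hypothetical helper, and the input is a trimmed copy of the sample buckets, not output from a real cluster.

```python
def id_name_pairs(aggregations, outer="topics_agg", sub="name"):
    """Flatten id buckets and their name sub-buckets into (id, name, count) tuples."""
    pairs = []
    for id_bucket in aggregations[outer][sub]["buckets"]:
        # The sub-aggregation shares the name "name" in the sample query.
        for name_bucket in id_bucket[sub]["buckets"]:
            pairs.append(
                (id_bucket["key"], name_bucket["key"], name_bucket["doc_count"])
            )
    return pairs


# Trimmed copy of the sample result (only the fields the helper reads).
sample = {
    "topics_agg": {
        "name": {
            "buckets": [
                {
                    "key": 123,
                    "name": {
                        "buckets": [
                            {"key": "topic1", "doc_count": 3},
                            {"key": "topic2", "doc_count": 3},
                        ]
                    },
                },
                {
                    "key": 456,
                    "name": {"buckets": [{"key": "topic1", "doc_count": 2}]},
                },
            ]
        }
    }
}

pairs = id_name_pairs(sample)
print(pairs)
# [(123, 'topic1', 3), (123, 'topic2', 3), (456, 'topic1', 2)]
```

If every id maps to exactly one name, each id bucket contains a single name bucket and the list has one tuple per topic.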

edited Sep 27, 2021 at 15:30

Rhumborl

answered Mar 17, 2016 at 3:08

Rahul

7 Comments

Redtopia Over a year ago

When I do that, the sub-aggregation "name" contains an array of 10 buckets with different key values.

Rahul Over a year ago

Yes... for that same id, there might be 10 different names. I've added an explanation above.

Redtopia Over a year ago

But there shouldn't be multiple topic name values for the same topic ID.

Rahul Over a year ago

In an ideal scenario, yes, but that depends on the nature of the data. Please share the response that you are getting.

Redtopia Over a year ago

Unfortunately I can't format json or post a picture inside comments. Topic IDs are unique, and at index time, product topics is an array of topics [{"id1", "topic 1"},{"id2", "topic 2"}]. The sub-agg you are suggesting to get the "name" field for each specific topic returns what appear to be random words.
