
I'm aggregating product search results by tags, each of which has a name and an ID field. How do I get both fields back in an aggregation bucket? I can get one or the other, but I can't figure out how to get both. BTW, script access is turned off on my cluster, so I can't use that.

Here's my product mapping (simplified for this question):

"mappings": {
    "products": {
        "properties": { 
            "title": {
                "type": "string"
            },
            "description": {
                "type": "string"
            },
            "topics": {
                "properties": {
                    "id": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "name": {
                        "type" : "string",
                        "index": "not_analyzed"
                    }
                }
            }
        }
    }
}

Here's my query:

"query": {
    "multi_match": {
        "query": "Testing 1 2 3",
        "fields": ["title", "description"]
    }
},
"aggs": {
    "Topics": {
        "terms": {
            "field": "topics.id",
            "size": 15
        }
    }
}

My aggregation buckets look like this:

[screenshot of the aggregation buckets]

...the "key" value in the first bucket is the topics.id field value. Is there a way to add my topics.name field to the bucket?

1 Answer


To return both fields, each (id, name) pair has to act as a unique bucket, which requires an association between an id and its name. Without a nested mapping, the topic objects are flattened at index time, so the ids and the names are stored as two separate arrays and that association is lost. Hence, you need to map topics as nested.

"topics": {
    "type": "nested",
    "properties": {
        "id": {
            "type": "string",
            "index": "not_analyzed"
        },
        "name": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}
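
To see why the plain object mapping loses the pairing, here is a rough Python simulation of how an array of objects ends up as one multi-valued field per property. It only illustrates the indexing model and is not Elasticsearch code; `flatten_object_array` and the sample topics are made up for the example.

```python
from collections import defaultdict


def flatten_object_array(field, objects):
    """Mimic how a non-nested array of objects is stored:
    one multi-valued field per property, with no link between them."""
    flat = defaultdict(list)
    for obj in objects:
        for prop, value in obj.items():
            flat[f"{field}.{prop}"].append(value)
    return dict(flat)


# Hypothetical product with two topics.
topics = [
    {"id": "123", "name": "topic1"},
    {"id": "456", "name": "topic2"},
]

flat = flatten_object_array("topics", topics)
print(flat)
# {'topics.id': ['123', '456'], 'topics.name': ['topic1', 'topic2']}
```

Once the values are stored this way, nothing records that 123 goes with topic1 rather than topic2. With "type": "nested", each topic object is indexed as its own hidden document, so the pairing survives.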

For aggregation on multiple fields, you need to use sub-aggregations.

Here is a sample aggregation query:

{
  "aggs": {
    "topics_agg": {
      "nested": {
        "path": "topics"
      },
      "aggs": {
        "name": {
          "terms": {
            "field": "topics.id"
          },
          "aggs": {
            "name": {
              "terms": {
                "field": "topics.name"
              }
            }
          }
        }
      }
    }
  }
}
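
If you build the request in code, the query and the aggregation sit side by side at the top level of the body. A minimal Python sketch, assuming the multi_match query and size from the question and the nested aggregation above; `build_search_body` is a hypothetical helper, and actually sending the body to the cluster is left out.

```python
import json


def build_search_body(text, size=15):
    """Build a search body with "query" and "aggs" as top-level siblings."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["title", "description"],
            }
        },
        "aggs": {
            "topics_agg": {
                # Step into the nested topic objects first.
                "nested": {"path": "topics"},
                "aggs": {
                    "name": {
                        # Outer buckets: one per topic id.
                        "terms": {"field": "topics.id", "size": size},
                        "aggs": {
                            # Inner buckets: names seen for that id.
                            "name": {"terms": {"field": "topics.name"}}
                        },
                    }
                },
            }
        },
    }


body = build_search_body("Testing 1 2 3")
print(json.dumps(body, indent=2))
```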

Sample aggregation result:

    "aggregations": {
          "topics_agg": {
             "doc_count": 5,
             "name": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                   {
                      "key": 123,
                      "doc_count": 6,
                      "name": {
                         "doc_count_error_upper_bound": 0,
                         "sum_other_doc_count": 0,
                         "buckets": [
                            {
                               "key": "topic1",
                               "doc_count": 3
                            },
                            {
                               "key": "topic2",
                               "doc_count": 3
                            }
                         ]
                      }
                   },
                   {
                      "key": 456,
                      "doc_count": 2,
                      "name": {
                         "doc_count_error_upper_bound": 0,
                         "sum_other_doc_count": 0,
                         "buckets": [
                            {
                               "key": "topic1",
                               "doc_count": 2
                            }
                         ]
                      }
                   },
..............

Note: For id 123 there are multiple name buckets, because there are multiple name values for that id. To get separate unique buckets, combine each parent bucket with each of its child buckets.

For example: 123-topic1, 123-topic2, 456-topic1
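
Those combinations can be produced on the client by walking the response. A small Python sketch, assuming the response has the shape of the sample result above (outer and inner aggregations both called "name"); `id_name_pairs` is a hypothetical helper and the `sample` dict is a trimmed copy of that result.

```python
def id_name_pairs(aggregations):
    """Yield (id, name, doc_count) for every id/name combination
    in the nested terms aggregation response."""
    id_buckets = aggregations["topics_agg"]["name"]["buckets"]
    for id_bucket in id_buckets:
        for name_bucket in id_bucket["name"]["buckets"]:
            yield id_bucket["key"], name_bucket["key"], name_bucket["doc_count"]


# Trimmed copy of the sample result.
sample = {
    "topics_agg": {
        "doc_count": 5,
        "name": {
            "buckets": [
                {"key": 123, "doc_count": 6, "name": {"buckets": [
                    {"key": "topic1", "doc_count": 3},
                    {"key": "topic2", "doc_count": 3},
                ]}},
                {"key": 456, "doc_count": 2, "name": {"buckets": [
                    {"key": "topic1", "doc_count": 2},
                ]}},
            ]
        },
    }
}

pairs = list(id_name_pairs(sample))
for topic_id, name, count in pairs:
    print(f"{topic_id}-{name}: {count}")
# 123-topic1: 3
# 123-topic2: 3
# 456-topic1: 2
```

If every id really has exactly one name, each outer bucket should contain a single inner bucket and the list has one entry per topic.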


7 Comments

When I do that, the sub-aggregation "name" contains an array of 10 buckets with different key values.
Yes... for that same id, there might be 10 different names. Added the explanation above.
But there shouldn't be multiple topic name values for the same topic ID.
In an ideal scenario, yes, but that depends on the nature of the data. Please share the response that you are getting.
Unfortunately I can't format json or post a picture inside comments. Topic IDs are unique, and at index time, product topics is an array of topics [{"id1", "topic 1"},{"id2", "topic 2"}]. The sub-agg you are suggesting to get the "name" field for each specific topic returns what appear to be random words.