Is it possible to create an aggregation by unnesting an array's elements to use as keys?

Here's an example:

Docs:

[
  {
    "languages": [ 1, 2 ],
    "value": 100
  },
  {
    "languages": [ 1 ],
    "value": 50
  }
]

Its mapping:

{
    "documents": {
        "mappings": {
            "properties": {
                "languages": {
                    "type": "integer"
                },
                "value": {
                    "type": "integer"
                }
            }
        }
    }
}

The expected output of a sum aggregation keyed by language would be:

{
  1: 150,
  2: 100
}
  • Can you share the mapping of your index? Commented Jul 22, 2020 at 3:45
  • @Val updated with mapping Commented Jul 22, 2020 at 3:59
  • languages with type integer even though you have fr and en as values. How is this possible? Commented Jul 22, 2020 at 4:00
  • @Val updated with correct types, corrected example Commented Jul 22, 2020 at 4:03

2 Answers

You can achieve what you want by using a simple terms aggregation. Array elements will be bucketed individually:

POST index/_search
{
  "aggs": {
    "languages": {
      "terms": {
        "field": "languages"
      },
      "aggs": {
        "total": {
          "sum": {
            "field": "value"
          }
        }
      }
    }
  }
}

Results:

  "aggregations" : {
    "languages" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1,
          "doc_count" : 2,
          "total" : {
            "value" : 150.0
          }
        },
        {
          "key" : 2,
          "doc_count" : 1,
          "total" : {
            "value" : 100.0
          }
        }
      ]
    }
  }
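
To get from these buckets to the flat key-to-sum map the question asks for, a small client-side step is enough. The following is a minimal sketch, assuming the search response has already been parsed into a Python dict; the helper name buckets_to_sums and its parameters are hypothetical, and the sample response only reproduces the aggregation part shown above.

```python
def buckets_to_sums(response, agg_name="languages", metric_name="total"):
    """Flatten terms buckets with a sum sub-aggregation into {key: sum}.

    agg_name and metric_name match the aggregation names used in the
    query above; both are parameters of this hypothetical helper.
    """
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket[metric_name]["value"] for bucket in buckets}


# Aggregation part of the response shown above, as a parsed dict.
response = {
    "aggregations": {
        "languages": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": 1, "doc_count": 2, "total": {"value": 150.0}},
                {"key": 2, "doc_count": 1, "total": {"value": 100.0}},
            ],
        }
    }
}

sums = buckets_to_sums(response)
print(sums)
```

The sums come back as floats because the sum aggregation reports its value as a double, as the results above show.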

The terms agg will sum up the number of occurrences. What you want instead is a script that sums the values, using the items of the languages array as keys:

GET langs/_search
{
  "size": 0,
  "aggs": {
    "lang_sums": {
      "scripted_metric": {
        "init_script": "state.lang_sums=[:]",
        "map_script": """
          for (def lang : doc['languages']) {
            def lang_str = lang.toString();
            def value = doc['value'].value;

            if (state.lang_sums.containsKey(lang_str)) {
              state.lang_sums[lang_str] += value;
            } else {
              state.lang_sums[lang_str] = value; 
            }
          }
        """,
        "combine_script": "return state",
        "reduce_script": "return states"
      }
    }
  }
}

yielding

{
  ...
  "aggregations":{
    "lang_sums":{
      "value":[
        {
          "lang_sums":{
            "1":150,
            "2":100
          }
        }
      ]
    }
  }
}
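
Note that the reduce script returns the list of per-shard states as-is, so the value is a list with one map per shard rather than a single merged map. The output above has one entry; on an index with several shards there would be several. A minimal sketch of merging them on the client, assuming the response has been parsed into Python objects (the helper name merge_shard_states is hypothetical, and the two-shard input in the second call is made up for illustration):

```python
from collections import defaultdict


def merge_shard_states(states, key="lang_sums"):
    """Merge the per-shard maps returned by the reduce script into one dict.

    states is the list found under aggregations.lang_sums.value; key is the
    name used for the map inside each state in the scripts above.
    """
    totals = defaultdict(int)
    for state in states:
        if not state:
            # Skip empty or null entries defensively.
            continue
        for lang, value in state.get(key, {}).items():
            totals[lang] += value
    return dict(totals)


# The single-entry result shown above.
states = [{"lang_sums": {"1": 150, "2": 100}}]
merged = merge_shard_states(states)
print(merged)

# Hypothetical two-shard result, where each shard saw one of the documents.
two_shards = [{"lang_sums": {"1": 50}}, {"lang_sums": {"1": 100, "2": 100}}]
print(merge_shard_states(two_shards))
```

The same merge could instead be done inside the reduce script, which would change the shape of the response shown above.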

2 Comments

The sum sub-aggregation does exactly that ;-)
I admit I didn't pay attention to the values, only the keys. My bad
