0

I have some documents stored in Elasticsearch like the following:

{
          "date" : 1,
          "field1" : 0.2,
          "field2" : 0.5,
          "field3" : 0.3
},
{
          "date" : 1,
          "field1" : 0.9,
          "field2" : 0.5,
          "field3" : 0.1
},
{
          "date" : 2,
          "field1" : 0.2,
          "field2" : 0.6,
          "field3" : 0.7
}

and what I'd like to get is a count for how many times each of field1, field2, or field3 were the greatest for each document, grouped by date, ie. expecting result to be something like:

{
          "date" : 1,
          "field1-greatest" : 1,
          "field2-greatest" : 1,
          "field3-greatest" : 0
},
{
          "date" : 2,
          "field1-greatest" : 0,
          "field2-greatest" : 0,
          "field3-greatest" : 1
}

I'm using a terms aggregation on date, but not sure how to compare different fields to do this max and count type operation using Elasticsearch aggregations. Any suggestions?

1 Answer 1

0

You could go with the following:

{
  "size": 0,
  "aggs": {
    "by_date": {
      "terms": {
        "field": "date"
      },
      "aggs": {
        "field1_greatest": {
          "max": {
            "field": "field1"
          }
        },
        "field2_greatest": {
          "max": {
            "field": "field2"
          }
        },
        "field3_greatest": {
          "max": {
            "field": "field3"
          }
        }
      }
    }
  }
}

Tip: make sure your field* properties are mapped as type double, not float because the max agg for field1, for example, may yield 0.8999999761581421 instead of 0.9.


CORRECTION

It's a non-trivial use case so you'll probably need to use a script. Here's something to get you started:

{
  "size": 0,
  "aggs": {
    "by_date": {
      "terms": {
        "field": "date"
      },
      "aggs": {
        "by_greatest": {
          "scripted_metric": {
            "init_script": """
              state.field1_greatest = 0;
              state.field2_greatest = 0;
              state.field3_greatest = 0;
            """,
            "map_script": """
              def v1 = doc['field1'].value;
              def v2 = doc['field2'].value;
              def v3 = doc['field3'].value;
              
              // your comparison logic
            """,
            "combine_script": "state",
            "reduce_script": "states"
          }
        }
      }
    }
  }
}
Sign up to request clarification or add additional context in comments.

4 Comments

Thanks for your response Joe, but this is not quite what I'm looking for. I don't need the actual max value of each of the fields, but instead the number of times field* is greater than each of the other fields as a count.
My bad, I misread. Check my updated answer -- I've written a boilerplate to get you started.
Thanks, was able to adapt the template above and get it working.
Great! Glad it helped.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.