Multi DateHistogram aggregation on elasticsearch Java API

Question

I'm using dateHistogram aggregation with ElasticSearch Java API, and it works pretty well for simple aggregations, such as the number of hits per hour/day/month/year (imagine a series of documents, where the date histogram aggregation is made on 'indexed_date' field).

But can I, with a single query, make a date histogram aggregation that is also broken down by the values of another field? Something like what Kibana does for charts.

An example of what I would like to achieve:

I have a series of documents, where each one is an "event", which has its timestamp. These documents have a series of fields, like "status", "version", etc.

Can I get an aggregation, based on date histogram, on timestamp field and on all values of another field?

Example result of aggregation with a one hour interval:

H: 12 status - { ACTIVE: 34 PAUSED: 12 }

H: 13 status - { ACTIVE: 10 }

EDIT:

Some sample data:

"doc1" - { timestamp: "2014-12-23 12:01", status: "ACTIVE", version: 1 }
"doc2" - { timestamp: "2014-12-23 12:15", status: "PAUSED", version: 1 }
"doc3" - { timestamp: "2014-12-23 13:55", status: "ACTIVE", version: 2 }
(and so on..)

Just to confirm what you're looking for: you want hourly buckets (date histogram), where each bucket contains a count of something, e.g. a count of fields with "active": true or "paused": true? If you could add some data to the question, it would be easier to figure out.
– Olly Cruickshank, Commented Dec 23, 2014 at 16:18
Yes, this is what I'm looking for. I'm editing the question to add some more sample data.
– Carmine Giangregorio, Commented Dec 23, 2014 at 16:33

Olly Cruickshank · Accepted Answer · 2014-12-23 17:19:17Z


I would do a terms aggregation inside the date histogram.

In the example below, you can see the document counts returned for each status value within each hourly bucket:

curl -XGET 'http://localhost:9200/myindex/mydata/_search?search_type=count&pretty' -d '
{
  "query" : {
    "match_all" : { }
  },
  "aggs" : {
    "date_hist_agg" : {
      "date_histogram" : { "field" : "timestamp", "interval" : "hour" },
      "aggs" : {
        "status_agg" : {
          "terms" : { "field" : "status" }
        }
      }
    }
  }
}'
{
  "took" : 213,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "date_hist_agg" : {
      "buckets" : [ {
        "key_as_string" : "2014-12-23T17:00:00.000Z",
        "key" : 1419354000000,
        "doc_count" : 2,
        "status_agg" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [ {
            "key" : "active",
            "doc_count" : 1
          }, {
            "key" : "paused",
            "doc_count" : 1
          } ]
        }
      }, {
        "key_as_string" : "2014-12-23T18:00:00.000Z",
        "key" : 1419357600000,
        "doc_count" : 1,
        "status_agg" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [ {
            "key" : "active",
            "doc_count" : 1
          } ]
        }
      } ]
    }
  }
}
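
To make the shape of this nested result more concrete, here is a minimal plain-Java sketch that computes the same kind of per-hour, per-status counts in memory, using the three sample documents from the question. It does not call Elasticsearch at all; the `HourlyStatusCounts` class, the `Event` type, and the method names are hypothetical and exist only for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HourlyStatusCounts {

    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    // Hypothetical stand-in for an indexed event document.
    static final class Event {
        final LocalDateTime timestamp;
        final String status;

        Event(String timestamp, String status) {
            this.timestamp = LocalDateTime.parse(timestamp, FORMAT);
            this.status = status;
        }
    }

    // The three sample documents from the question.
    static List<Event> sampleEvents() {
        return Arrays.asList(
                new Event("2014-12-23 12:01", "ACTIVE"),
                new Event("2014-12-23 12:15", "PAUSED"),
                new Event("2014-12-23 13:55", "ACTIVE"));
    }

    // Outer map: one entry per hour (the date histogram buckets).
    // Inner map: document count per status value (the terms sub-aggregation).
    static Map<LocalDateTime, Map<String, Long>> countByHourAndStatus(List<Event> events) {
        Map<LocalDateTime, Map<String, Long>> result = new TreeMap<>();
        for (Event e : events) {
            LocalDateTime hour = e.timestamp.truncatedTo(ChronoUnit.HOURS);
            result.computeIfAbsent(hour, h -> new TreeMap<>())
                  .merge(e.status, 1L, Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        countByHourAndStatus(sampleEvents()).forEach((hour, counts) ->
                System.out.println("H: " + hour.getHour() + " status - " + counts));
        // Prints:
        // H: 12 status - {ACTIVE=1, PAUSED=1}
        // H: 13 status - {ACTIVE=1}
    }
}
```

The outer loop plays the role of `date_hist_agg` and the inner map plays the role of `status_agg`; Elasticsearch does the same grouping on the server side, so only the counts travel over the wire.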


2 Comments

Carmine Giangregorio Over a year ago

Hi, thank you for your answer. Is it possible to do this via the Elasticsearch Java API?

phirschybar Over a year ago

This should work with any of the client libraries as long as you are able to construct the body of the request.
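
As a rough illustration of "constructing the body of the request", the sketch below assembles the same JSON as the curl example using plain string concatenation. The `DateHistogramBody` class and its `build` method are hypothetical names, and the sketch assumes the arguments are simple field names that need no JSON escaping; how the resulting string is sent depends on the client library in use.

```java
public class DateHistogramBody {

    // Builds the JSON body for a date_histogram with a nested terms aggregation.
    // Assumption: the arguments are plain names that need no JSON escaping.
    static String build(String dateField, String interval, String termsField) {
        return "{\n"
                + "  \"query\": { \"match_all\": {} },\n"
                + "  \"aggs\": {\n"
                + "    \"date_hist_agg\": {\n"
                + "      \"date_histogram\": { \"field\": \"" + dateField
                + "\", \"interval\": \"" + interval + "\" },\n"
                + "      \"aggs\": {\n"
                + "        \"status_agg\": { \"terms\": { \"field\": \"" + termsField + "\" } }\n"
                + "      }\n"
                + "    }\n"
                + "  }\n"
                + "}";
    }

    public static void main(String[] args) {
        // Same fields and interval as the curl example.
        System.out.println(build("timestamp", "hour", "status"));
    }
}
```

For anything beyond a quick experiment, a JSON library or the client's own builder classes would be a safer way to produce the body than hand-written concatenation.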

Tsiyona Dershowitz · Accepted Answer · 2015-11-09 08:15:07Z

Using the same aggregation names used in the previous answer, I would do the following:

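    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogram;
    import org.elasticsearch.search.aggregations.bucket.terms.Terms;

    // Assumes `client` is an already-initialized org.elasticsearch.client.Client field
    // and the Elasticsearch 1.x Java API (which provides DateHistogram.Interval).
    public void yourSearch(String indexName, String typeName) {

        // Date histogram with one-hour buckets, each split by the values of "status".
        SearchResponse sr = client.prepareSearch(indexName)
                .setTypes(typeName)
                .addAggregation(AggregationBuilders.dateHistogram("date_hist_agg")
                        .field("timestamp")
                        .interval(DateHistogram.Interval.hours(1))
                        .minDocCount(0)
                        .subAggregation(AggregationBuilders.terms("status_agg").field("status")))
                .execute().actionGet();

        // Outer loop: one bucket per hour.
        DateHistogram dateHistAgg = sr.getAggregations().get("date_hist_agg");
        for (DateHistogram.Bucket hourBucket : dateHistAgg.getBuckets()) {

            // Inner loop: one bucket per status value within that hour.
            Terms statusAgg = hourBucket.getAggregations().get("status_agg");
            for (Terms.Bucket statusBucket : statusAgg.getBuckets()) {
                String key = statusBucket.getKey();
                long cnt = statusBucket.getDocCount();

                // use the key, cnt
            }
        }
    }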
