I'm using the dateHistogram aggregation with the Elasticsearch Java API, and it works well for simple aggregations, such as the number of hits per hour/day/month/year (imagine a series of documents, where the date histogram aggregation is made on the 'indexed_date' field).

But can I, with a single query, break a date histogram down by the values of another field? Something like what Kibana does for its charts.

An example of what I would like to achieve:

I have a series of documents, where each one is an "event", which has its timestamp. These documents have a series of fields, like "status", "version", etc.

Can I get an aggregation, based on date histogram, on timestamp field and on all values of another field?

Example result of aggregation with a one hour interval:

H: 12 status - { ACTIVE: 34 PAUSED: 12 }

H: 13 status - { ACTIVE: 10 }

EDIT:

Some sample data:

"doc1" - { timestamp: "2014-12-23 12:01", status: "ACTIVE", version: 1 }
"doc2" - { timestamp: "2014-12-23 12:15", status: "PAUSED", version: 1 }
"doc3" - { timestamp: "2014-12-23 13:55", status: "ACTIVE", version: 2 }
(and so on..)

  • Just to confirm what you're looking for: you want hourly buckets (date histogram), and each bucket contains a count of something, e.g. a count of documents with "active": true or "paused": true? If you could add some data to the question, it would be easier to figure out. Commented Dec 23, 2014 at 16:18
  • Yes, this is what I'm looking for. I'm editing the question to add a bit more data samples. Commented Dec 23, 2014 at 16:33

2 Answers

I would nest a terms aggregation inside the date histogram.

In the example below, you can see the document counts returned for each status value within each hourly bucket:

curl -XGET 'http://localhost:9200/myindex/mydata/_search?search_type=count&pretty' -d '
{
  "query" : {
    "match_all" : { }
  },
  "aggs" : {
    "date_hist_agg" : {
      "date_histogram" : { "field" : "timestamp", "interval" : "hour" },
      "aggs" : {
        "status_agg" : {
          "terms" : { "field" : "status" }
        }
      }
    }
  }
}'
{
  "took" : 213,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "date_hist_agg" : {
      "buckets" : [ {
        "key_as_string" : "2014-12-23T17:00:00.000Z",
        "key" : 1419354000000,
        "doc_count" : 2,
        "status_agg" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [ {
            "key" : "active",
            "doc_count" : 1
          }, {
            "key" : "paused",
            "doc_count" : 1
          } ]
        }
      }, {
        "key_as_string" : "2014-12-23T18:00:00.000Z",
        "key" : 1419357600000,
        "doc_count" : 1,
        "status_agg" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [ {
            "key" : "active",
            "doc_count" : 1
          } ]
        }
      } ]
    }
  }
}
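
To get from this response to the per-hour layout asked for in the question, each date histogram bucket and its nested `status_agg` buckets can be flattened into one line. Here is a minimal sketch in plain Java, assuming the response has already been parsed into maps (the class name `HourlyStatusReport` and the `formatBucket` helper are hypothetical, not part of any Elasticsearch client). The keys and counts are copied from the response above.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

public class HourlyStatusReport {

    // Formats one date histogram bucket as "H: <hour> status - { KEY: count ... }".
    static String formatBucket(long keyMillis, Map<String, Long> statusCounts) {
        // The bucket key is epoch milliseconds; read the hour of day in UTC.
        int hour = Instant.ofEpochMilli(keyMillis).atZone(ZoneOffset.UTC).getHour();
        StringJoiner counts = new StringJoiner(" ", "{ ", " }");
        for (Map.Entry<String, Long> e : statusCounts.entrySet()) {
            counts.add(e.getKey().toUpperCase(Locale.ROOT) + ": " + e.getValue());
        }
        return "H: " + hour + " status - " + counts;
    }

    public static void main(String[] args) {
        // Hand-copied stand-in for the parsed "date_hist_agg" buckets.
        Map<Long, Map<String, Long>> buckets = new LinkedHashMap<>();

        Map<String, Long> first = new LinkedHashMap<>();
        first.put("active", 1L);
        first.put("paused", 1L);
        buckets.put(1419354000000L, first);

        Map<String, Long> second = new LinkedHashMap<>();
        second.put("active", 1L);
        buckets.put(1419357600000L, second);

        for (Map.Entry<Long, Map<String, Long>> b : buckets.entrySet()) {
            System.out.println(formatBucket(b.getKey(), b.getValue()));
        }
    }
}
```

For the two buckets above this prints `H: 17 status - { ACTIVE: 1 PAUSED: 1 }` and `H: 18 status - { ACTIVE: 1 }`. The sketch upper-cases the keys only to match the layout in the question; the response itself returns them in lower case.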
2 Comments

Hi, thank you for your answer. Is it possible to do this via the Elasticsearch Java API?
This should work with any of the client libraries as long as you are able to construct the body of the request.
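
As a rough illustration of "constructing the body of the request", the sketch below assembles the same JSON as the curl example using only the Java standard library. The class name `AggregationRequestBody` and the `build` helper are hypothetical. How the resulting string is handed to a particular client depends on the library and its version, so that step is deliberately left out.

```java
public class AggregationRequestBody {

    // Builds the JSON body for a date histogram with a nested terms aggregation.
    // The arguments are inserted verbatim, so this sketch assumes they contain
    // no characters that would need JSON escaping.
    static String build(String dateField, String interval, String termsField) {
        return "{"
            + "\"query\":{\"match_all\":{}},"
            + "\"aggs\":{\"date_hist_agg\":{"
            + "\"date_histogram\":{\"field\":\"" + dateField + "\",\"interval\":\"" + interval + "\"},"
            + "\"aggs\":{\"status_agg\":{\"terms\":{\"field\":\"" + termsField + "\"}}}"
            + "}}"
            + "}";
    }

    public static void main(String[] args) {
        // Same field names and interval as the curl request above.
        System.out.println(build("timestamp", "hour", "status"));
    }
}
```

For anything beyond a fixed query like this one, a JSON library or the client's own builders would be a safer choice than string concatenation.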
Using the same aggregation names used in the previous answer, I would do the following:

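    // Imports assume the Elasticsearch 1.x Java client that this answer targets.
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogram;
    import org.elasticsearch.search.aggregations.bucket.terms.Terms;

    // 'client' is assumed to be an already initialised Elasticsearch Client field.
    public void yourSearch(String indexName, String typeName) {

        // Hourly date histogram on "timestamp" with a terms sub-aggregation on "status".
        SearchResponse sr = client.prepareSearch(indexName)
                .setTypes(typeName)
                .addAggregation(AggregationBuilders.dateHistogram("date_hist_agg")
                        .field("timestamp")
                        .interval(DateHistogram.Interval.hours(1))
                        .minDocCount(0) // also return hours that contain no documents
                        .subAggregation(AggregationBuilders.terms("status_agg").field("status")))
                .execute().actionGet();

        // Outer loop: one bucket per hour.
        DateHistogram componentsAgg = sr.getAggregations().get("date_hist_agg");
        for (DateHistogram.Bucket entry : componentsAgg.getBuckets()) {

            // Inner loop: one bucket per status value within that hour.
            Terms statusAgg = entry.getAggregations().get("status_agg");
            for (Terms.Bucket entry2 : statusAgg.getBuckets()) {
                String key = entry2.getKey();
                long cnt = entry2.getDocCount();

                // use the key, cnt

            }
        }
    }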