ElasticSearch count multiple fields grouped by

Question

I have documents like

{"domain":"US", "zipcode":"11111", "eventType":"click", "id":"1", "time":100}

{"domain":"US", "zipcode":"22222", "eventType":"sell", "id":"2", "time":200}

{"domain":"US", "zipcode":"22222", "eventType":"click", "id":"3","time":150}

{"domain":"US", "zipcode":"11111", "eventType":"sell", "id":"4","time":350}

{"domain":"US", "zipcode":"33333", "eventType":"sell", "id":"5","time":225}

{"domain":"EU", "zipcode":"44444", "eventType":"click", "id":"5","time":120}

I want to filter these documents by eventType=sell and time between 125 and 400, group by domain followed by zipcode and count the documents in each bucket. So my output would be like (first and last docs would be ignored by the filters)

US, 11111,1

US, 22222,1

US, 33333,1

In SQL, this should have been straightforward. But I am not able to get this to work on ElasticSearch. Could someone please help me out here?

How do I write ElasticSearch query to accomplish the above?

Sloan Ahrens · Accepted Answer · 2015-12-10 00:57:28Z

This query seems to do what you want:

POST /test_index/_search
{
   "size": 0,
   "query": {
      "filtered": {
         "filter": {
            "bool": {
               "must": [
                  {
                     "term": {
                        "eventType": "sell"
                     }
                  },
                  {
                     "range": {
                        "time": {
                           "gte": 125,
                           "lte": 400
                        }
                     }
                  }
               ]
            }
         }
      }
   },
   "aggs": {
      "zipcode_terms": {
         "terms": {
            "field": "zipcode"
         }
      }
   }
}

returning

{
   "took": 8,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "zipcode_terms": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "11111",
               "doc_count": 1
            },
            {
               "key": "22222",
               "doc_count": 1
            },
            {
               "key": "33333",
               "doc_count": 1
            }
         ]
      }
   }
}

(Note that there is only 1 "sell" at "22222", not 2).

Here is some code I used to test it:

http://sense.qbox.io/gist/1c4cb591ab72a6f3ae681df30fe023ddfca4225b

You might want to take a look at terms aggregations, the bool filter, and range filters.

EDIT: I just realized I left out the domain part, but it should be straightforward to add in a bucket aggregation on that as well if you need to.

Collectives™ on Stack Overflow

ElasticSearch count multiple fields grouped by

1 Answer 1

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related