I have a collection of documents stored in Elasticsearch. They look like this:
{
  "id": "12312312",
  "timestamp": "2015-11-01T00:00:00.000",
  "unit": {
    "id": "123456",
    "name": "unit-4"
  },
  "samples": [
    {
      "value": 244.05435180062133,
      "aggregation": "M",
      "type": {
        "name": "SomeName1",
        "display": "Some name 1"
      }
    },
    {
      "value": 251.19450064653438,
      "aggregation": "I",
      "type": {
        "name": "SomeName2",
        "display": "Some name 2"
      }
    },
    ...
  ]
}
I would like to run an aggregation query against it that returns the number of distinct unit.id values per bucket of samples.value.
The query should filter on samples.type.name and samples.aggregation. So far I have produced this:
{
  "query": {
    "bool": {
      "must": [{
        "range": {
          "timestamp": {
            "gte": "2015-11-01T00:00:00.000",
            "lte": "2015-11-30T23:59:59.999",
            "format": "date_hour_minute_second_fraction"
          }
        }
      }, {
        "nested": {
          "path": "samples",
          "query": {
            "bool": {
              "must": [{
                "match": {
                  "samples.type.name": "SomeName1"
                }
              }]
            }
          }
        }
      }]
    }
  },
  "aggs": {
    "0": {
      "nested": {
        "path": "samples"
      },
      "aggs": {
        "1": {
          "histogram": {
            "field": "samples.value",
            "interval": 10
          }
        }
      }
    }
  }
}
I'm querying http://localhost:9200/dc/sample/_search?search_type=count&pretty, but this returns the number of nested documents in the samples array per bucket.
What I need is the number of distinct unit.id values per bucket.
Can you help me, please?
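For reference, here is one possible shape for such a request, written as a minimal Python sketch that only builds the request body. It is not a verified solution. It assumes Elasticsearch 2.x (where a filter aggregation accepts a bool query), that unit.id is indexed as a single untokenized term (it does not appear in the mapping shown in this post), and that "M" is the wanted samples.aggregation value. The function name and the aggregation names "samples", "matching_samples", "back_to_root" and "unit_count" are hypothetical. The idea is to filter the nested samples, bucket them with a histogram, step back to the parent document with reverse_nested, and count distinct unit.id values with cardinality, which returns an approximate count.

```python
import json


def build_unit_count_query(type_name, aggregation, start, end, interval=10):
    """Build a search body counting distinct unit.id per samples.value bucket.

    Hypothetical helper: it only assembles a dict, it does not call Elasticsearch.
    """
    # The same nested condition is used in the query and inside the aggregation,
    # so the histogram only sees samples of the requested type and aggregation.
    sample_filter = {
        "bool": {
            "must": [
                {"match": {"samples.type.name": type_name}},
                {"match": {"samples.aggregation": aggregation}},
            ]
        }
    }
    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {"timestamp": {
                        "gte": start,
                        "lte": end,
                        "format": "date_hour_minute_second_fraction",
                    }}},
                    {"nested": {"path": "samples", "query": sample_filter}},
                ]
            }
        },
        "aggs": {
            "samples": {
                "nested": {"path": "samples"},
                "aggs": {
                    "matching_samples": {
                        "filter": sample_filter,
                        "aggs": {
                            "histogram_samples_value": {
                                "histogram": {
                                    "field": "samples.value",
                                    "interval": interval,
                                },
                                "aggs": {
                                    # Leave the nested scope so unit.id is reachable.
                                    "back_to_root": {
                                        "reverse_nested": {},
                                        "aggs": {
                                            "unit_count": {
                                                "cardinality": {"field": "unit.id"}
                                            }
                                        },
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


body = build_unit_count_query(
    "SomeName1", "M",
    "2015-11-01T00:00:00.000", "2015-11-30T23:59:59.999",
)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent to the same _search endpoint as above. Under these assumptions, each histogram bucket would carry a back_to_root.unit_count.value next to its doc_count.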
Edit: added the mapping:
{
  "dc" : {
    "mappings" : {
      "sample" : {
        "properties" : {
          "unit" : {
            "properties" : {
              "name" : {
                "type" : "string"
              }
            }
          },
          "samples" : {
            "type" : "nested",
            "properties" : {
              "aggregation" : {
                "type" : "string"
              },
              "type" : {
                "properties" : {
                  "display" : {
                    "type" : "string"
                  },
                  "name" : {
                    "type" : "string"
                  }
                }
              },
              "value" : {
                "type" : "double"
              }
            }
          },
          "timestamp" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
          }
        }
      }
    }
  }
}
Edit: I'll try to rephrase it. I want to get the count of units per bucket defined by "histogram_samples_value". That means the sum of these counts should equal the total number of units. To test it, I wrote a query that filters on a single unit (many documents with different sample values): all but one of the "histogram_samples_value" buckets should contain count = 0, and one bucket should contain count = 1.
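To make the expected result concrete, here is a small plain-Python sketch of the counting described above, run over in-memory documents shaped like the example at the top. It is only an illustration of the intended semantics, not Elasticsearch code. The function name and the shortened sample values are made up, and the bucket key is assumed to be floor(value / interval) * interval, as in a histogram with interval 10.

```python
import math
from collections import defaultdict


def units_per_bucket(docs, type_name, aggregation, interval=10):
    """Count distinct unit ids per samples.value bucket (illustrative helper)."""
    buckets = defaultdict(set)
    for doc in docs:
        for sample in doc["samples"]:
            if (sample["type"]["name"] == type_name
                    and sample["aggregation"] == aggregation):
                # Lower bound of the bucket this value falls into.
                key = math.floor(sample["value"] / interval) * interval
                buckets[key].add(doc["unit"]["id"])
    # Buckets that received no unit are simply absent, i.e. their count is 0.
    return {key: len(ids) for key, ids in sorted(buckets.items())}


# Two documents for the same unit; values are shortened, made-up examples.
docs = [
    {"unit": {"id": "123456"}, "samples": [
        {"value": 244.05, "aggregation": "M", "type": {"name": "SomeName1"}},
        {"value": 251.19, "aggregation": "I", "type": {"name": "SomeName2"}},
    ]},
    {"unit": {"id": "123456"}, "samples": [
        {"value": 247.9, "aggregation": "M", "type": {"name": "SomeName1"}},
    ]},
]

counts = units_per_bucket(docs, "SomeName1", "M")
print(counts)  # {240: 1}
```

Note that with this definition a unit whose matching values land in several buckets is counted once in each of them, so the counts sum to the total number of units only when every unit's matching values stay within a single bucket.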
histogram aggregation? Your requirements do not seem to need it at all. Also, could you add the minimum expected output?