I'm a bit new to aggregations and I want to create an equivalent to the following SQL:
SELECT fullname, natcode, COUNT(1)
FROM table
WHERE birthdate = '18-sep-1993'
GROUP BY fullname, natcode
HAVING COUNT(1) > 2
ORDER BY COUNT(1) DESC
So, if I have the following data:

As you can see, the results are grouped by fullname and natcode, have count > 2, and are ordered by count.
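To make the intended semantics concrete, here is a minimal Python sketch of the filter / group / having / order steps. The sample rows and the helper name `group_having_order` are hypothetical, made up purely for illustration; only the logic mirrors the SQL.

```python
from collections import Counter

# Hypothetical rows (fullname, natcode, birthdate); illustration only.
rows = [
    ("JOHN SMITH", "111", "18-Sep-1993"),
    ("JOHN SMITH", "111", "18-Sep-1993"),
    ("JOHN SMITH", "111", "18-Sep-1993"),
    ("JOHN SMITH", "111", "18-Sep-1993"),
    ("MIKE CAIN", "205", "18-Sep-1993"),
    ("MIKE CAIN", "205", "18-Sep-1993"),
    ("MIKE CAIN", "205", "18-Sep-1993"),
    ("JULIA ROBERTS", "111", "18-Sep-1993"),
    ("JULIA ROBERTS", "205", "18-Sep-1993"),
    ("MIKE CAIN", "205", "01-Jan-1990"),
]


def group_having_order(rows, birthdate, min_count=2):
    # WHERE birthdate = ... GROUP BY fullname, natcode
    counts = Counter((name, nat) for name, nat, bd in rows if bd == birthdate)
    # HAVING COUNT(1) > min_count
    kept = [(name, nat, n) for (name, nat), n in counts.items() if n > min_count]
    # ORDER BY COUNT(1) DESC
    return sorted(kept, key=lambda group: group[2], reverse=True)


result = group_having_order(rows, "18-Sep-1993")
for fullname, natcode, count in result:
    print(fullname, natcode, count)
```

Note that the count is per (fullname, natcode) pair, so a name that appears twice with two different natcodes does not qualify.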
I've managed to form the following query:
{
  "size": 0,
  "aggs": {
    "profs": {
      "filter": {
        "term": {
          "birthDate": "18-Sep-1993"
        }
      },
      "aggs": {
        "name_count": {
          "terms": {
            "field": "fullName.raw"
          },
          "aggs": {
            "nat_count": {
              "terms": {
                "field": "natCode"
              },
              "aggs": {
                "my_filter": {
                  "bucket_selector": {
                    "buckets_path": {
                      "the_doc_count": "_count"
                    },
                    "script": {
                      "source": "params.the_doc_count>2"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
What this achieves: it filters on the date, creates a bucket per fullname (name_count) with a sub-bucket per natcode (nat_count), and filters the natcode buckets on doc count.
The problem with this: I also see name_count buckets whose nat_count buckets are empty. I only want buckets that have the required count. Here is a sample of the results:
"aggregations": {
  "profs": {
    "doc_count": 3754,
    "name_count": {
      "doc_count_error_upper_bound": 4,
      "sum_other_doc_count": 3732,
      "buckets": [
        {
          "key": "JOHN SMITH",
          "doc_count": 3,
          "nat_count": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "111",
                "doc_count": 3
              }
            ]
          }
        },
        {
          "key": "MIKE CAIN",
          "doc_count": 3,
          "nat_count": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "205",
                "doc_count": 3
              }
            ]
          }
        },
        {
          "key": "JULIA ROBERTS",
          "doc_count": 2,
          "nat_count": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": []
          }
        },
        {
          "key": "JAMES STEPHEN COOK",
          "doc_count": 2,
          "nat_count": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": []
          }
        }
In the results, I don't want the last two names (JULIA ROBERTS and JAMES STEPHEN COOK) to show up, because their nat_count buckets are empty.
Additionally, what is missing: the ordering on the group count at the end. I'd want the (fullname, natcode) group with the highest count to show up first.
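As a stopgap, the shape I'm after can be produced client-side from the response. This is only a sketch under assumptions: `flatten_groups` is a hypothetical helper, the sample dict is trimmed from the response above, and it only sees the buckets the terms aggregations actually returned (by default the top 10 per level), so it is not a substitute for doing this in the query itself.

```python
# Trimmed copy of the response above; only the fields the helper reads.
sample_response = {
    "aggregations": {
        "profs": {
            "doc_count": 3754,
            "name_count": {
                "buckets": [
                    {"key": "JOHN SMITH", "doc_count": 3,
                     "nat_count": {"buckets": [{"key": "111", "doc_count": 3}]}},
                    {"key": "MIKE CAIN", "doc_count": 3,
                     "nat_count": {"buckets": [{"key": "205", "doc_count": 3}]}},
                    {"key": "JULIA ROBERTS", "doc_count": 2,
                     "nat_count": {"buckets": []}},
                ]
            },
        }
    }
}


def flatten_groups(response, min_count=2):
    """Return (fullname, natcode, count) rows, highest count first."""
    name_buckets = response["aggregations"]["profs"]["name_count"]["buckets"]
    groups = []
    for name_bucket in name_buckets:
        # Names whose nat_count list is empty contribute no rows at all.
        for nat_bucket in name_bucket["nat_count"]["buckets"]:
            if nat_bucket["doc_count"] > min_count:
                groups.append(
                    (name_bucket["key"], nat_bucket["key"], nat_bucket["doc_count"])
                )
    # Equivalent of ORDER BY COUNT(1) DESC; ties keep their original order.
    groups.sort(key=lambda group: group[2], reverse=True)
    return groups


groups = flatten_groups(sample_response)
```

Because only the inner buckets produce rows, names with an empty nat_count list disappear without any special handling.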
Required later on: the grouping will need to include a couple more fields, so there would be about 4 fields in total.
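For illustration, if the extra fields were added as further nested terms aggregations, the response could be walked recursively. This is a rough sketch only: `flatten_nested` is a hypothetical helper, and `city_count` plus the sample values are invented stand-ins for a third grouping field.

```python
def flatten_nested(holder, agg_names, min_count=2, prefix=()):
    """Yield (key1, ..., keyN, count) for the innermost buckets over min_count."""
    name, rest = agg_names[0], agg_names[1:]
    for bucket in holder[name]["buckets"]:
        keys = prefix + (bucket["key"],)
        if rest:
            # Descend into the next nested terms aggregation.
            yield from flatten_nested(bucket, rest, min_count, keys)
        elif bucket["doc_count"] > min_count:
            yield keys + (bucket["doc_count"],)


# Hypothetical three-level response fragment (contents of "profs").
nested = {
    "name_count": {"buckets": [
        {"key": "JOHN SMITH", "doc_count": 5, "nat_count": {"buckets": [
            {"key": "111", "doc_count": 5, "city_count": {"buckets": [
                {"key": "X", "doc_count": 3},
                {"key": "Y", "doc_count": 2},
            ]}},
        ]}},
        {"key": "MIKE CAIN", "doc_count": 4, "nat_count": {"buckets": [
            {"key": "205", "doc_count": 4, "city_count": {"buckets": [
                {"key": "X", "doc_count": 4},
            ]}},
        ]}},
    ]}
}

levels = ["name_count", "nat_count", "city_count"]
nested_groups = sorted(
    flatten_nested(nested, levels),
    key=lambda group: group[-1],  # the count is the last element
    reverse=True,
)
```

Adding a fourth field would then only mean appending one more aggregation name to `levels`.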
Please excuse any wrong terms I may have used. Hopefully you get the idea of what help I need. Thanks.
