Below are two mocked records from my Elasticsearch index, which holds millions of records. I am trying to query ES for all records that have a non-empty, non-null "tags" field. If a record has no tags (like the second record below), I don't want to pull it from ES.
If "tags" were not nested, then from what I found by searching, the query below would have worked:
curl -XGET 'host:port/book_indx/book/_search?' -d '{
"query" : {"filtered" : {"filter" : {"exists" :{"field" : "_source"}}}}
}'
However, I can't find a way to query the nested structure. I tried the following with no luck:
{"query" : {"filtered" : {"filter" : {"exists" :{"field" : "_source.tags"}}}}}
{"query" : {"filtered" : {"filter" : {"exists" :{"field" : "_source":{"tags"}}}}}}
Any suggestions are really appreciated here! Thanks in advance.
{
"_shards": {
"failed": 0,
"successful": 12,
"total": 12
},
"hits": {
"hits": [
{
"_id": "book1",
"_index": "book",
"_source": {
"book_name": "How to Get Organized",
"publication_date": "2014-02-24T16:50:39+0000",
"tags": [
{
"category": "self help",
"topics": [
{
"name": "time management",
"page": 6198
},
{
"name": "calendar",
"page": 10
}
],
"id": "WEONWOIR234LI"
}
],
"last_updated": "2015-11-11T16:28:32.308+0000"
},
"_type": "book"
},
{
"_id": "book2",
"_index": "book",
"_source": {
"book_name": "How to Cook",
"publication_date": "2014-02-24T16:50:39+0000",
"tags": [],
"last_updated": "2015-11-11T16:28:32.308+0000"
},
"_type": "book"
}
],
"total": 1
},
"timed_out": false,
"took": 80
}
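To be explicit about the result I am after, here is a client-side sketch in Python of the filter I would like ES to apply for me, run against the two mocked hits above (trimmed to the relevant fields). It is only an illustration of the intended semantics, not an ES query; `has_tags` is a hypothetical helper name I made up for this example.

```python
# Two mocked hits, trimmed to the fields that matter for this question.
hits = [
    {
        "_id": "book1",
        "_source": {
            "book_name": "How to Get Organized",
            "tags": [{"category": "self help", "id": "WEONWOIR234LI"}],
        },
    },
    {
        "_id": "book2",
        "_source": {"book_name": "How to Cook", "tags": []},
    },
]


def has_tags(hit):
    """Hypothetical helper: True only if "tags" is present, non-null and non-empty."""
    return bool(hit["_source"].get("tags"))


# Keep only the records that actually carry at least one tag.
wanted_ids = [hit["_id"] for hit in hits if has_tags(hit)]
print(wanted_ids)  # ['book1']
```

In other words, book1 should come back and book2 should not. I want this filtering to happen on the ES side, since pulling millions of records and filtering them client-side is not practical.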
Mapping -
"book": {
"_id": {
"path": "message_id"
},
"properties": {
"book_name": {
"index": "not_analyzed",
"type": "string"
},
"publication_date": {
"format": "date_time||date_time_no_millis",
"type": "date"
},
"tags": {
"properties": {
"category": {
"index": "not_analyzed",
"type": "string"
},
"topic": {
"properties": {
"name": {
"index": "not_analyzed",
"type": "string"
},
"page": {
"index": "no",
"type": "integer"
}
}
},
"id": {
"index": "not_analyzed",
"type": "string"
}
},
"type": "nested"
},
"last_updated": {
"format": "date_time||date_time_no_millis",
"type": "date"
}
}
}
book type? Is the tags field a nested field or a normal object field? I'm also surprised not to see the _source in your documents.