I want to fetch data from my Elasticsearch node in my code, and I am using the elasticsearch-dsl library to query it. I want the results sorted by "@timestamp", which can be done with the sort API. However, the query matches more than 10,000 documents. I cannot combine scan with sort to retrieve that many documents, because sort doesn't work with scan in elasticsearch-dsl. Is there a way to use the scroll API in elasticsearch-dsl, or any other way to get more than 10,000 documents sorted by "@timestamp"?
1 Answer
Scroll does work with sort; you just need to call it with preserve_order: s.params(preserve_order=True).scan()
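A minimal sketch of that call in context, assuming a reachable cluster. The helper name `sorted_scan`, the host URL, and the index pattern `logs-*` are hypothetical and not from the question; the part taken from the answer is the `params(preserve_order=True).scan()` chain.

```python
def sorted_scan(search, field="@timestamp"):
    """Return an iterator over every hit of `search`, ordered by `field`.

    `search` is expected to be an elasticsearch_dsl.Search object.
    preserve_order=True is forwarded by scan() to the underlying scroll
    helper so that the requested sort is kept.
    """
    return search.sort(field).params(preserve_order=True).scan()


def example():
    # Imported here so the helper above can be read on its own.
    from elasticsearch_dsl import Search, connections

    # Assumed host; replace with your own node.
    connections.create_connection(hosts=["http://localhost:9200"])

    # Hypothetical index pattern.
    s = Search(index="logs-*")

    for hit in sorted_scan(s):
        print(hit["@timestamp"])
```

Preserving order across a scroll can be noticeably more expensive than an unordered scan, so it is worth narrowing the query as much as possible first.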
Hope this helps!
10 Comments
S.Kumar
It's showing this error: "ScanError: Scroll request has failed on 30 shards out of 32" when I use the above setting.
Honza Král
What is the error that you are getting? Catch the exception and print its .info property.
S.Kumar
This is the error I am getting: "error:Scroll request has failed on 38 shards out of 41"
Honza Král
That's just the message; please catch the exception and print out its .info property. The message only tells you what went wrong, not why, so it is of no help.
S.Kumar
This is the traceback:
Traceback (most recent call last):
  File "check_dsl.py", line 41, in run_query
    for hit in response:
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch_dsl/search.py", line 701, in scan
    **self._params
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 316, in scan
    (resp['_shards']['failed'], resp['_shards']['total'])
ScanError: Scroll request has failed on 41 shards out of 44.
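For the request in the comments to catch the exception and print its .info property, a rough sketch might look like the following. The helper names `describe_failure` and `run_query_with_diagnostics` are hypothetical. Whether the raised exception actually carries an `info` attribute is an assumption here, so the code reads it defensively and also reports `scroll_id` if present.

```python
def describe_failure(exc):
    """Collect whatever diagnostic attributes the exception exposes."""
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        # Assumption: not every exception class has `.info`; use None if absent.
        "info": getattr(exc, "info", None),
        "scroll_id": getattr(exc, "scroll_id", None),
    }


def run_query_with_diagnostics(search):
    """Yield hits from a sorted scan, printing diagnostics if it fails.

    `search` is expected to be an elasticsearch_dsl.Search object.
    """
    try:
        for hit in search.params(preserve_order=True).scan():
            yield hit
    except Exception as exc:
        # Catching Exception keeps the sketch library-agnostic; in real code
        # a narrower class from your client library would be preferable.
        print(describe_failure(exc))
        raise
```

Printing the full dictionary rather than only the message gives more to work with when some shards fail.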