I'm working with a huge Elasticsearch database (5 million documents) and I need to fetch data using a sliced scroll in Python. My question: is there a way to limit the sliced scroll (i.e. set a size param)? I tried setting the size via `[search obj].param(size=500000)` and via slicing with `[:500000]`, but neither seems to work; the sliced scroll still returns all documents.

In my script, I'm using a sliced scroll with Python multiprocessing, as shown here: https://github.com/elastic/elasticsearch-dsl-py/issues/817

Is there some way to get, for example, 500,000 documents using a sliced scroll?

Thanks in advance.


1 Answer


Answer from the GitHub issue:

"There is no limit on scroll, it always returns all documents. To only get a subset simply stop consuming the iterator after you get the number you wanted to retrieve by using a break statement or similar."

https://github.com/elastic/elasticsearch-dsl-py/issues/817
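In other words, `scan()` is a lazy iterator, so you can cap the number of documents simply by stopping consumption early. A minimal sketch of that pattern, using `itertools.islice` as the "break statement or similar" (the index name `"my-index"` and the limit are placeholders, and the Elasticsearch call is shown only in comments so the snippet stays self-contained):

```python
from itertools import islice


def take(hits, limit):
    """Consume at most `limit` items from a lazy (scroll) iterator."""
    return list(islice(hits, limit))


# In a real script this would look something like:
#
#   from elasticsearch_dsl import Search
#   s = Search(index="my-index")          # placeholder index name
#   docs = take(s.scan(), 500_000)       # stop after 500k hits
#
# Demonstration with a plain iterator standing in for scan():
fake_scan = iter(range(10))
print(len(take(fake_scan, 5)))  # → 5
```

Note that with sliced scroll plus multiprocessing, each worker consumes its own slice, so the limit check has to be applied per slice (e.g. `limit // n_slices` per worker) rather than once globally.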
