I am trying to use the Python Elasticsearch library to read Elasticsearch documents and place them in a Spark DataFrame. I am able to connect and query using the `scan` helper function, since the query will generate about 2M documents (rows in my DataFrame). The issue I am running into is getting the query results into a Spark DataFrame.
This code produces a generator:
result = elasticsearch.helpers.scan(es, index=index, doc_type='_doc', query=query)
I was trying to use a for loop to collect the generated data into a dictionary:
data = {}
for item in result:
    data.append((item['_source']['someField'], item['_source']['someField']))
return data
but this raises an error, since a dictionary does not have an `append` method.
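For what it's worth, collecting the tuples into a list instead of a dict does work, and a list of tuples is something `spark.createDataFrame` can consume directly. A minimal sketch of that fix (the generator below is a stand-in for the `scan` result, and `someField` is the placeholder field name from my code above):

```python
# Sketch: collect (value, value) tuples from the scan generator into a list.
# fake_scan() simulates elasticsearch.helpers.scan(...), which yields hit dicts.
def fake_scan():
    for i in range(3):
        yield {"_source": {"someField": i}}

# Use a list (which has append) rather than a dict; a comprehension is idiomatic.
rows = [(hit["_source"]["someField"], hit["_source"]["someField"])
        for hit in fake_scan()]

# rows is now [(0, 0), (1, 1), (2, 2)] for this simulated input.
# On Databricks, a SparkSession named `spark` already exists, so the last step
# would be: df = spark.createDataFrame(rows, ["colA", "colB"])
```

I am aware this still materializes all 2M rows in driver memory, which is part of why I am asking whether there is a better way.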
Is there a better way to collect this generated data into a Spark DataFrame? Note: I am also working on the Databricks platform, if that helps.
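One alternative I have seen mentioned (untested sketch on my side) is the elasticsearch-hadoop Spark connector, which lets Spark read the index in parallel instead of funneling all 2M documents through the driver via the Python client. This assumes the elasticsearch-hadoop jar is attached to the Databricks cluster, and the host, port, and index values below are placeholders:

```python
# Sketch: read an Elasticsearch index directly into a Spark DataFrame using
# the elasticsearch-hadoop connector (jar must be attached to the cluster).
# "my-es-host", "9200", and `index` are placeholders for real values.
df = (spark.read
      .format("org.elasticsearch.spark.sql")
      .option("es.nodes", "my-es-host")   # placeholder Elasticsearch host
      .option("es.port", "9200")          # placeholder port
      .option("es.query", query_json)     # the query as a JSON string
      .load(index))                       # index name from the question
```

If this is the recommended route, I would also appreciate pointers on how it handles the query I am currently passing to `scan`.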