How can I write a DataFrame from PySpark to Azure Blob Storage? Any suggestions or sample code?

I have the blob location and the access key.

[Image: screenshot of the error message]

  • Hi, does my answer help you? Commented Jul 12, 2019 at 2:10
  • Yes, the approach helps, but I am facing an issue while writing the data as CSV. Please see the screenshot of the error in the question. Commented Jul 12, 2019 at 6:26

1 Answer

You could follow this tutorial to connect your Spark DataFrame to Azure Blob Storage.

Set connection info:

session.conf.set(
    "fs.azure.account.key.<storage-account-name>.blob.core.windows.net",
    "<your-storage-account-access-key>"
)
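The access key is expected to be a Base64-encoded string, and the exception reported in the comments ("The String is not a valid Base64-encoded string") points at that value. As a rough sketch, a check like the one below could be run before setting the key. The helper name `looks_like_account_key` is hypothetical, and passing the check only means the string decodes as Base64, not that the key is the right one for the account.

```python
import base64
import binascii


def looks_like_account_key(key: str) -> bool:
    """Return True if `key` is non-empty and decodes as standard Base64.

    Hypothetical helper for illustration; it does not contact Azure.
    """
    candidate = key.strip()
    if not candidate:
        return False
    try:
        # validate=True rejects characters outside the Base64 alphabet
        base64.b64decode(candidate, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

A left-over placeholder such as `<your-storage-account-access-key>`, or a value that is not the account key at all, would fail this check.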

Then write data into blob storage:

# sdf is the DataFrame to save; write belongs to the DataFrame, not the session
sdf.write.parquet(
    "wasbs://<container-name>@<storage-account-name>.blob.core.windows.net/<prefix>"
)
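Since the follow-up comments ask about CSV output, here is a minimal sketch that puts both steps together and writes CSV instead of Parquet. The function names (`account_key_setting`, `wasbs_url`, `write_csv_to_blob`) are hypothetical, `session` is assumed to be a `SparkSession` and `sdf` a Spark DataFrame, and the `overwrite` mode and header option are illustrative choices rather than requirements.

```python
def account_key_setting(account: str) -> str:
    """Name of the Spark/Hadoop setting that holds the account access key."""
    return f"fs.azure.account.key.{account}.blob.core.windows.net"


def wasbs_url(container: str, account: str, prefix: str = "") -> str:
    """Build a wasbs:// URL for a container and an optional path prefix."""
    base = f"wasbs://{container}@{account}.blob.core.windows.net"
    prefix = prefix.strip("/")
    return f"{base}/{prefix}" if prefix else f"{base}/"


def write_csv_to_blob(session, sdf, container, account, key, prefix):
    """Set the account key on the session, then write `sdf` as CSV."""
    session.conf.set(account_key_setting(account), key)
    url = wasbs_url(container, account, prefix)
    # Assumption: replace any existing output and include a header row
    sdf.write.mode("overwrite").option("header", "true").csv(url)
    return url
```

A call might look like `write_csv_to_blob(session, sdf, "<container-name>", "<storage-account-name>", "<your-storage-account-access-key>", "<prefix>")`. Spark writes a directory of part files under the prefix, not a single CSV file.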

Also, you could refer to this related question: pyspark write to wasb blob storage container

2 Comments

I tried the approach you described above and got this exception: "Caused by: java.lang.IllegalArgumentException: The String is not a valid Base64-encoded string."
I have attached a screenshot of the error message below the question. Please have a look at it; can you tell me what exactly the error is?
