I am working with Python and PySpark, and I want to upload a CSV file to Azure Blob Storage. I already have a DataFrame generated by code: df. What I want to do is the following:

    import pandas
    from azure.storage.blob import BlockBlobService  # legacy azure-storage SDK

    # Dataframe generated by code
    df

    # Create the BlockBlobService that is used to call the Blob service for the storage account
    block_blob_service = BlockBlobService(account_name='name', account_key='key')

    container_name = 'results-csv'

    d = {'one': pandas.Series([1., 2., 3.], index=['a', 'b', 'c']),
         'two': pandas.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
    df = pandas.DataFrame(d)

    writer = pandas.ExcelWriter(df, engine='xlsxwriter')

    a = df.to_excel(writer, sheet_name='Sheet1', index=False, engine='xlsxwriter')

    block_blob_service.create_blob_from_stream(container_name, 'test', a)

I get the error:

ValueError: stream should not be None.

So I want to upload the content of the DataFrame as a blob to the storage location given above. Is there any way to do that without first generating a CSV file on my local computer?

  • However you create that CSV file, you can save it into a BytesIO, which is almost the same as saving to a file. You can then upload it as a stream or as bytes. Commented Jun 19, 2018 at 7:46
  • Can you put an example as an answer, please? Commented Jun 19, 2018 at 7:48
  • I will edit my question to be more explicit. Commented Jun 19, 2018 at 8:13
  • You can use a = df.to_csv() and block_blob_service.create_blob_from_text(container_name, "test.csv", a) Commented Jun 19, 2018 at 8:16
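The BytesIO suggestion in the comments above could look roughly like the sketch below. It assumes the legacy SDK's BlockBlobService.create_blob_from_stream(container_name, blob_name, stream) call from the question; the helper names dataframe_to_csv_stream and upload_csv_legacy are made up for illustration. The point is that to_csv() called without a path returns the CSV text, so nothing is written to local disk.

```python
import io


def dataframe_to_csv_stream(df, encoding="utf-8"):
    """Serialize a DataFrame to an in-memory byte stream (hypothetical helper)."""
    # With no path argument, DataFrame.to_csv returns the CSV as a str.
    csv_text = df.to_csv(index=False)
    # Wrap the encoded text in BytesIO so it behaves like an open binary file.
    return io.BytesIO(csv_text.encode(encoding))


def upload_csv_legacy(block_blob_service, container_name, blob_name, df):
    """Upload df as a CSV blob via the legacy BlockBlobService (hypothetical helper)."""
    stream = dataframe_to_csv_stream(df)
    # Assumption: block_blob_service is a BlockBlobService from the legacy SDK.
    block_blob_service.create_blob_from_stream(container_name, blob_name, stream)
    return blob_name


# Possible usage, with the names from the question:
#   upload_csv_legacy(block_blob_service, container_name, "test.csv", df)
```

The original attempt failed because df.to_excel(...) writes to its target and returns None, so the value passed as the stream was None.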

1 Answer

What we intend to do is use the dataset.to_csv function to create a file stream and then send that stream to Azure Blob Storage. The alternative is to store the CSV string of the dataset directly in Azure. Code:

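    # service is assumed to be a BlobServiceClient and dataset a pandas DataFrame
    blob_client = service.get_blob_client(container=container_name, blob=local_file_name)
    csv_text = dataset.to_csv()  # with no path argument, to_csv() returns the CSV as a string
    print(csv_text)
    blob_client.upload_blob(csv_text)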

This stores the file in the blob. No other solution is working as of now. The remaining issue was that the data in the blob was not in CSV format; that part we still needed to figure out.

Edit: Added the code to send it in CSV format.
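For completeness, here is a slightly fuller sketch of the same idea, assuming the v12 azure-storage-blob package, where BlobClient.upload_blob accepts bytes along with the overwrite and content_settings keyword arguments. The function name upload_dataframe_as_csv is hypothetical. Setting the content type to text/csv may help tools recognize the blob as CSV, though that is a guess at what "not in CSV format" referred to.

```python
def upload_dataframe_as_csv(blob_client, df, content_settings=None):
    """Upload df as UTF-8 CSV bytes through a v12-style blob client (hypothetical helper)."""
    # Serialize in memory; index=False leaves the row index out of the file.
    payload = df.to_csv(index=False).encode("utf-8")

    # overwrite=True replaces an existing blob with the same name.
    kwargs = {"overwrite": True}
    if content_settings is not None:
        kwargs["content_settings"] = content_settings

    blob_client.upload_blob(payload, **kwargs)
    return len(payload)  # number of bytes sent


# Possible usage (conn_str is a placeholder for your connection string):
#   from azure.storage.blob import BlobServiceClient, ContentSettings
#   service = BlobServiceClient.from_connection_string(conn_str)
#   blob_client = service.get_blob_client(container="results-csv", blob="test.csv")
#   upload_dataframe_as_csv(blob_client, df, ContentSettings(content_type="text/csv"))
```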
