I have created an Azure Blob Storage trigger in an Azure Function in Python. When a CSV file is added to blob storage, I try to read it with pandas.

import logging
import pandas as pd

import azure.functions as func


def main(myblob: func.InputStream):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes")

    df_new = pd.read_csv(myblob)
    print(df_new.head())

If I pass myblob to pd.read_csv, then I get UnsupportedOperation: read1

Python blob trigger function processed blob 
Name: samples-workitems/Data_26112022_080027.csv
Blob Size: None bytes
[2022-11-27T16:19:25.650Z] Executed 'Functions.BlobTrigger1' (Failed, Id=2df388f5-a8dc-4554-80fa-f809cfaeedfe, Duration=1472ms)
[2022-11-27T16:19:25.655Z] System.Private.CoreLib: Exception while executing function: Functions.BlobTrigger1. System.Private.CoreLib: Result: Failure
Exception: UnsupportedOperation: read1

If I pass myblob.read(),

df_new = pd.read_csv(myblob.read())

it gives TypeError: Expected file path name or file-like object, got <class 'bytes'> type

Python blob trigger function processed blob 
Name: samples-workitems/Data_26112022_080027.csv
Blob Size: None bytes
[2022-11-27T16:09:56.513Z] Executed 'Functions.BlobTrigger1' (Failed, Id=e3825c28-7538-4e30-bad2-2526f9811697, Duration=1468ms)
[2022-11-27T16:09:56.518Z] System.Private.CoreLib: Exception while executing function: Functions.BlobTrigger1. System.Private.CoreLib: Result: Failure
Exception: TypeError: Expected file path name or file-like object, got <class 'bytes'> type

From Azure functions Docs:

InputStream is File-like object representing an input blob.

From Pandas read_csv Docs:

read_csv takes filepath_or_buffer: str, path object or file-like object

So technically pd.read_csv should be able to read this object. What piece of the puzzle am I missing here?

  • The pd.read_csv function should get a file name with path. What does myblob contain? Commented Nov 27, 2022 at 16:38
  • I uploaded Data_26112022_080027.csv Commented Nov 27, 2022 at 16:40
  • Python blob trigger function processed blob Name: samples-workitems/Data_26112022_080027.csv Blob Size: None bytes Commented Nov 27, 2022 at 16:40
  • This is the output before the exception occurred. Commented Nov 27, 2022 at 16:41
  • I added the output to question as well :) Commented Nov 27, 2022 at 16:43

1 Answer

According to this article, the code below works: wrapping the bytes returned by myblob.read() in BytesIO gives pd.read_csv the file-like object it expects. Because the entire file is loaded into memory, this is suitable for smaller files and not recommended for larger ones.

import logging
import pandas as pd

import azure.functions as func
from io import BytesIO

def main(myblob: func.InputStream):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes")
    df_new = pd.read_csv(BytesIO(myblob.read()))
    print(df_new.head())
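
The BytesIO approach can be tried outside Azure with a stand-in stream. The sketch below is only an illustration: FakeBlobStream is a hypothetical class, not part of azure.functions, and it assumes that func.InputStream behaves like an io.BufferedIOBase subclass that implements read() but not read1() (a plausible reason for the UnsupportedOperation error, though not confirmed here). The second helper, also hypothetical, shows one possible way to handle larger files by copying the stream to a temporary file and reading it in chunks.

```python
import io
import os
import shutil
import tempfile

import pandas as pd


class FakeBlobStream(io.BufferedIOBase):
    """Hypothetical stand-in for func.InputStream: only read() is implemented."""

    def __init__(self, data: bytes):
        self._buffer = io.BytesIO(data)

    def readable(self) -> bool:
        return True

    def read(self, size=-1):
        return self._buffer.read(size)


def read_csv_in_memory(stream) -> pd.DataFrame:
    # Same idea as the answer: load all bytes, then wrap them in BytesIO
    # so pandas receives a fully featured file-like object.
    return pd.read_csv(io.BytesIO(stream.read()))


def count_rows_via_temp_file(stream, chunksize: int = 1000) -> int:
    # Assumed alternative for larger files: spool the stream to disk,
    # then let pandas read the file one chunk at a time.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "blob.csv")
        with open(path, "wb") as out:
            shutil.copyfileobj(stream, out)
        total = 0
        with pd.read_csv(path, chunksize=chunksize) as reader:
            for chunk in reader:
                total += len(chunk)
        return total


sample = b"a,b\n1,2\n3,4\n"
df = read_csv_in_memory(FakeBlobStream(sample))
print(df.head())
```

In a real function, myblob would take the place of FakeBlobStream(sample). The temporary-file variant trades disk I/O for lower peak memory, so it is only worth considering when the blob is too large to hold in memory comfortably.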

1 Comment

My files are smaller than 10 MB, so I guess memory is not an issue for me.
