I am trying to upload a csv file with FastAPI and then load it into pandas.

import pandas as pd
import os
import io, base64

from fastapi import FastAPI, File, UploadFile, Form

app = FastAPI()

@app.post('/uploadfile/')
async def create_data_file(
        experiment: str = Form(...),
        file_type: str = Form(...),
        file_id: str = Form(...),
        data_file: UploadFile = File(...),
        ):
    
    #decoded = base64.b64decode(data_file.file)
    #decoded = io.StringIO(decoded.decode('utf-8'))
    
    print(pd.read_csv(data_file.file, sep='\t'))

    return {'filename': data_file.filename, 
            'experiment':experiment, 
            'file_type': file_type, 
            'file_id': file_id}

I tried passing the file.file content directly, and also converting it with base64 or StringIO. I tried the codecs module as well. The error I get with the example code is:

AttributeError: 'SpooledTemporaryFile' object has no attribute 'readable'
3 Comments
  • You do not want to save the file? Commented Feb 3, 2021 at 13:33
  • Not as csv. I want to convert it to parquet. Commented Feb 3, 2021 at 18:33
  • I would be curious to see an example of the file that failed. The method above works for me. (only difference is my test files are not tab-separated) Commented Feb 4, 2021 at 17:43

2 Answers

I found this workaround; change the encoding to whatever matches your file:

from io import StringIO

# Decode the uploaded bytes, then wrap the text in a file-like object for pandas.
pd.read_csv(StringIO(str(data_file.file.read(), 'utf-16')), encoding='utf-16')
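The workaround above comes down to reading the whole upload into memory and handing pandas a decoded text buffer instead of the raw SpooledTemporaryFile. As far as I know, the original error arises because SpooledTemporaryFile did not implement readable() before Python 3.11. Below is a minimal sketch of the decoding step using only the standard library. The helper name to_text_buffer and the sample data are made up for illustration, and the temporary file only stands in for data_file.file.

```python
import io
import tempfile


def to_text_buffer(binary_stream, encoding="utf-8"):
    """Read a binary file-like object fully and return its text as a StringIO."""
    binary_stream.seek(0)  # rewind in case the stream was already read or written
    raw = binary_stream.read()
    return io.StringIO(raw.decode(encoding))


# Stand-in for data_file.file (an assumption for this sketch, not a real upload).
upload = tempfile.SpooledTemporaryFile()
upload.write("a\tb\n1\t2\n".encode("utf-16"))

buffer = to_text_buffer(upload, encoding="utf-16")
first_line = buffer.readline()
```

Inside the endpoint, the returned buffer could then be passed on as pd.read_csv(to_text_buffer(data_file.file), sep='\t'). Note that this loads the entire file into memory, which may not suit very large uploads.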

This workaround uses the csv and codecs libraries to build the records, which can then be turned into a pandas DataFrame:

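import codecs
import csv

import pandas as pd

def to_df(file):
    # file.file is a binary stream; decode it line by line before parsing.
    lines = codecs.iterdecode(file.file, 'utf-8')
    data = csv.reader(lines, delimiter='\t')
    header = next(data)  # the first row holds the column names
    return pd.DataFrame(list(data), columns=header)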

1 Comment

This works great on my side, but I need to use the second row of my CSV as the DataFrame header. How can I choose which row becomes the header? If I do header = data.__next__(), it takes the first row of my CSV as the header by default.
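One possible way to address the question in this comment is to discard the rows before the header and only then read the column names. The sketch below assumes the same csv plus codecs approach as the answer. The function name read_rows, its header_row parameter, and the sample bytes are hypothetical and only serve the illustration.

```python
import codecs
import csv
import io
from itertools import islice


def read_rows(binary_stream, header_row=0, delimiter="\t", encoding="utf-8"):
    """Return (header, records), taking the header from row index `header_row`."""
    lines = codecs.iterdecode(binary_stream, encoding)
    reader = csv.reader(lines, delimiter=delimiter)
    # Skip every row before the header row, then take the header itself.
    rows = islice(reader, header_row, None)
    header = next(rows)
    return header, list(rows)


# Stand-in for file.file: the second row (index 1) holds the column names.
sample = io.BytesIO(b"some title line\nname\tvalue\nx\t1\ny\t2\n")
header, records = read_rows(sample, header_row=1)
```

The result can then be turned into a DataFrame with pd.DataFrame(records, columns=header). If the file is read with pd.read_csv instead, its header parameter (for example header=1) should select the row in a similar way.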
