I have a .csv file that I would like to render in a FastAPI app. I only managed to render the .csv file in JSON format as follows:

def transform_question_format(csv_file_name):

    json_file_name = f"{csv_file_name[:-4]}.json"

    # transforms the csv file into json file
    pd.read_csv(csv_file_name ,sep=",").to_json(json_file_name)

    with open(json_file_name, "r") as f:
        json_data = json.load(f)

    return json_data

@app.get("/questions")
def load_questions():

    question_json = transform_question_format(question_csv_filename)

    return question_json

When I tried returning directly pd.read_csv(csv_file_name ,sep=",").to_json(json_file_name), it works, as it returns a string.

How should I proceed? I don't believe this is a good way to do it.

  • When you say render - what do you mean? In general, FastAPI returns data as JSON. If you want to have a different response format, you can use one of the built-in custom response formats, or create your own: fastapi.tiangolo.com/advanced/custom-response Commented Feb 21, 2022 at 9:19
  • Maybe check this stackoverflow.com/questions/32911336/…, but so far it seems good Commented Feb 21, 2022 at 9:40
  • I am OK with JSON output, but the problem is that I need this intermediate step of creating an output JSON file and then loading it. Apparently I cannot import the CSV, transform it and load it in one step. Thanks for the links; they clarify the process a bit. Commented Feb 21, 2022 at 10:13
  • If you don't give a filename to to_json, a JSON string is returned directly. You can then pair this with return Response(content=json_str, media_type="application/json") to return the string directly from FastAPI with a JSON header. Would that work? (You can also give a file-like object and get the output written to that, so something like StringIO should work as well.) Commented Feb 21, 2022 at 11:05

3 Answers

Below are four different ways of returning the data stored in a .csv file/Pandas DataFrame (for solutions that do not use a Pandas DataFrame, have a look here). Related answers on how to return a large DataFrame efficiently can be found here and here as well.

Option 1

The first option is to convert the file data into JSON and then parse it into a dict. You can optionally change the orientation of the data using the orient parameter in the .to_json() method.
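
To make the effect of orient concrete, here is a minimal sketch on a small made-up DataFrame (the column names and values are illustrative assumptions, not the asker's data). It only shows how the layout of the JSON string changes; which layout suits the client is a separate choice.

```python
import pandas as pd

# Hypothetical sample data standing in for the contents of file.csv
df = pd.DataFrame(
    {"id": [1, 2], "question": ["What is FastAPI?", "What is pandas?"]}
)

# One JSON object per row: [{"id": 1, "question": ...}, ...]
records_json = df.to_json(orient="records")

# One JSON object per column, keyed by the row index: {"id": {"0": 1, "1": 2}, ...}
columns_json = df.to_json(orient="columns")

# Column names, index and row values kept apart:
# {"columns": [...], "index": [...], "data": [...]}
split_json = df.to_json(orient="split")

print(records_json)
print(columns_json)
print(split_json)
```

The "records" layout is usually the most convenient one for an API client, since each row arrives as a self-describing object.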

Note: Better not to use this option. See Updates below.

from fastapi import FastAPI
import pandas as pd
import json

app = FastAPI()
df = pd.read_csv("file.csv")

def parse_csv(df):
    res = df.to_json(orient="records")
    parsed = json.loads(res)
    return parsed
    
@app.get("/questions")
def load_questions():
    return parse_csv(df)
  • Update 1: Using the .to_dict() method would be a better option, as it returns the data as Python objects directly, instead of converting the DataFrame into JSON (using df.to_json()) and then parsing that JSON string back (using json.loads()), as described earlier. With orient="records", the result is a list of dicts, one per row. Example:

    @app.get("/questions")
    def load_questions():
        return df.to_dict(orient="records")
    
  • Update 2: When you return the result of .to_dict(), FastAPI, behind the scenes, first converts it into JSON-compatible data using the jsonable_encoder, and then puts that data inside a JSONResponse, which serializes it with the Python standard json.dumps() (see this answer for more details). To avoid that extra processing, you can still use the .to_json() method, but this time put the JSON string in a custom Response and return it directly, as shown below:

    from fastapi import Response
    
    @app.get("/questions")
    def load_questions():
        return Response(df.to_json(orient="records"), media_type="application/json")
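
One practical difference between the two updates shows up with missing values. The sketch below uses a made-up DataFrame containing NaN (the column names and values are assumptions). As far as I know, the JSONResponse that FastAPI uses by default calls json.dumps() with allow_nan=False, so the same flag is applied here to mimic it, whereas .to_json() writes NaN as null.

```python
import json

import pandas as pd

# Hypothetical data with one missing score
df = pd.DataFrame({"id": ["q1", "q2"], "score": [0.5, float("nan")]})

# Update 1 path: plain Python objects, NaN stays a float("nan")
records = df.to_dict(orient="records")
try:
    # allow_nan=False mimics a strict JSON encoder that rejects NaN
    json.dumps(records, allow_nan=False)
    strict_dump_ok = True
except ValueError:
    strict_dump_ok = False

# Update 2 path: pandas serializes the frame itself and emits null for NaN
json_str = df.to_json(orient="records")

print(strict_dump_ok)
print(json_str)
```

If the .to_dict() route is preferred anyway, the missing values would have to be replaced (for example with None) before the data is returned.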
    

Option 2

Another option is to return the data in string format, using the .to_string() method.

@app.get("/questions")
def load_questions():
    return df.to_string()

Option 3

You could also return the data as an HTML table, using the .to_html() method.

from fastapi.responses import HTMLResponse

@app.get("/questions")
def load_questions():
    return HTMLResponse(content=df.to_html(), status_code=200)

Option 4

Finally, you can always return the file as is using FastAPI's FileResponse.

from fastapi.responses import FileResponse

@app.get("/questions")
def load_questions():
    return FileResponse(path="file.csv", filename="file.csv")

5 Comments

You can avoid json dump/load sequence by just calling DataFrame.to_dict() instead.
@Chris By the way, I saw you updated the answer with async, and of course I have seen this in the documentation. In that case, is it meant to allow several users to query the API at the same time?
@pac You could also have a look at this answer, if it helps clarify things about async for you.
to_csv returns a string. Is there an easy way to get that into an array (which includes the header as the first row)?
Option 1, Update 2 is the best way to do it - If the DataFrame contains any np.nan values, to_dict will cause the JSON conversion to fail in FastAPI

With the DataFrame.to_dict() method, not all Pandas datatypes are serializable by the json package:

import json

import pandas as pd

df = pd.DataFrame({
    "TrainID": ["T001", "T002", "T003"],
    "Route": ["Amsterdam - Utrecht", "Rotterdam - Den Haag", "Eindhoven - Tilburg"],
    "DepartureTime": [
        pd.Timestamp("2022-03-09 08:00:00"),
        pd.Timestamp("2022-03-09 09:15:00"),
        pd.Timestamp("2022-03-09 10:30:00"),
    ],
    "ArrivalTime": [
        pd.Timestamp("2022-03-09 09:00:00"),
        pd.Timestamp("2022-03-09 09:45:00"),
        pd.Timestamp("2022-03-09 11:00:00"),
    ],
    "Status": ["On Time", "Delayed", "Cancelled"],
})

json.dumps(df.to_dict(orient="records"))

TypeError: Object of type Timestamp is not JSON serializable

To work around this, I use a DataFrameJSONResponse class that serializes the DataFrame with pandas.DataFrame.to_json instead of json.dumps:

from typing import Any

import pandas as pd
from fastapi import FastAPI
from fastapi.responses import Response

app = FastAPI()

class DataFrameJSONResponse(Response):
    media_type = "application/json"

    def render(self, content: Any) -> bytes:
        return content.to_json(orient="records", date_format='iso').encode("utf-8")


@app.get("/test", response_class=DataFrameJSONResponse)
async def test_dataframe():
    df = pd.DataFrame(
        {
            "TrainID": ["T001", "T002", "T003"],
            "Route": [
                "Amsterdam - Utrecht",
                "Rotterdam - Den Haag",
                "Eindhoven - Tilburg",
            ],
            "DepartureTime": [
                pd.Timestamp("2022-03-09 08:00:00"),
                pd.Timestamp("2022-03-09 09:15:00"),
                pd.Timestamp("2022-03-09 10:30:00"),
            ],
            "ArrivalTime": [
                pd.Timestamp("2022-03-09 09:00:00"),
                pd.Timestamp("2022-03-09 09:45:00"),
                pd.Timestamp("2022-03-09 11:00:00"),
            ],
            "Status": ["On Time", "Delayed", "Cancelled"],
        }
    )

    return DataFrameJSONResponse(df)

In addition to the accepted answer, I find that the package jsonpickle provides a simple solution to the issue of returning a DataFrame or Numpy ndarray to the client side.

On the server side:

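import jsonpickle.ext.pandas as jsonpickle_pandas
import pandas as pd
from fastapi import FastAPI
from jsonpickle.pickler import Pickler

# register the pandas handlers so that jsonpickle knows how to flatten DataFrames
jsonpickle_pandas.register_handlers()

app = FastAPI()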

@app.post("/question")
def answer_question():
    df = pd.DataFrame(...)
    p = Pickler()
    # convert the dataframe to a json-compatible dictionary
    response = p.flatten(df)
    # let FastAPI do the json conversion
    return response
    

On the client side:

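import requests
from jsonpickle.unpickler import Unpickler
import jsonpickle.ext.pandas as jsonpickle_pandas

# the same handlers must be registered on the client to rebuild the DataFrame
jsonpickle_pandas.register_handlers()

# the path matches the "/question" route defined on the server
response = requests.post("http://localhost:8000/question")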
res = response.json()

u = Unpickler()
df = u.restore(res)

The advantage of this solution is that Pickler/Unpickler handles dates and periods. In fact, it will turn any object into a dictionary of jsonable items, up to a user-specified depth.
