
Converting a CSV of 50k rows to JSON for eventual consumption by a Django template is quite slow. Am I converting it correctly, or is there a better way to do this?

First few rows of the csv are:

tdate,lat,long,entity
3/6/2017,34.152568,-118.347831,x1
6/3/2015,34.069787,-118.384738,y1
1/1/2011,34.21377,-118.227904,x1
3/4/2013,33.81761,-118.070374,y1

I am reading this CSV in my view and rendering the template this way:

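import pandas as pd
from django.shortcuts import render

def index(request):
    df = pd.read_csv('app/static/data.csv')
    df.tdate = pd.to_datetime(df.tdate)
    # Derive quarter and year from the parsed date.
    df['Qrt'] = df.tdate.dt.quarter
    df['Yr'] = df.tdate.dt.year
    # Build {entity: [record, ...]} and serialise it to a JSON string.
    jzon = df.groupby('entity')[['lat','long','Qrt','Yr']].apply(lambda x: x.to_dict('records')).to_json(orient='columns')
    return render(request, 'app/index.html', {'jzon': jzon})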

The resulting JSON for the rows above is:

{"x1":[{"lat":34.152568,"long":-118.347831,"Qrt":1.0,"Yr":2017.0},{"lat":34.21377,"long":-118.227904,"Qrt":1.0,"Yr":2011.0}],"y1":[{"lat":34.069787,"long":-118.384738,"Qrt":2.0,"Yr":2015.0},{"lat":33.81761,"long":-118.070374,"Qrt":1.0,"Yr":2013.0}]}
1 Answer

The fastest way to do something is usually to avoid doing it, so maybe you could just save the generated json to a data.json file in your app/static directory, moving your current code to a custom management command that you execute as part of your deployment process.

Custom management commands are Python scripts that can be executed from the command line using ./manage.py <yourcommandname> .... This is documented here: https://docs.djangoproject.com/en/2.0/howto/custom-management-commands/

In this case your command's handle method would be responsible for converting the csv to json (using the code that's currently in your view) and storing it in a data.json file in your app/static folder. Then your view would just have to json.load() this data.json file and serve it.
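
As a rough sketch of what that conversion step could look like: the function name csv_to_grouped_json is made up for illustration, and this version deliberately uses only the standard library instead of pandas, so it is not the code from the question. It assumes the dates are month/day/year, as in the sample rows, and it writes Qrt and Yr as integers rather than floats.

```python
import csv
import json
from collections import defaultdict
from datetime import datetime


def csv_to_grouped_json(csv_path, json_path):
    """Group the CSV rows by entity and write the result to json_path."""
    grouped = defaultdict(list)
    with open(csv_path, newline="") as src:
        for row in csv.DictReader(src):
            # Assumption: dates are month/day/year, e.g. 3/6/2017.
            tdate = datetime.strptime(row["tdate"], "%m/%d/%Y")
            grouped[row["entity"]].append({
                "lat": float(row["lat"]),
                "long": float(row["long"]),
                "Qrt": (tdate.month - 1) // 3 + 1,  # quarter 1..4
                "Yr": tdate.year,
            })
    with open(json_path, "w") as dst:
        json.dump(grouped, dst)
    return grouped
```

The handle method of the management command could then call something like csv_to_grouped_json('app/static/data.csv', 'app/static/data.json'). If you would rather keep pandas, the body of your current view can go in its place; the point is only that it runs once, outside the request cycle.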

Then all you have to do is to make sure this command is called whenever you update your csv file. This can be done manually if you have no deployment script (just don't forget to document it in your deployment procedure documentation), or automated in your deployment script to make sure you won't have stale data.
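
If you want a guard against forgetting that step, one possible approach is to compare file modification times. json_is_stale below is a hypothetical helper (nothing Django provides), and it assumes both files live on a filesystem with reliable modification times:

```python
import os


def json_is_stale(csv_path, json_path):
    """Return True when json_path is missing or older than csv_path."""
    if not os.path.exists(json_path):
        return True
    # A CSV modified after the JSON was written means the JSON is out of date.
    return os.path.getmtime(json_path) < os.path.getmtime(csv_path)
```

A deployment script could call this and only rerun the command when it returns True, or fail loudly so that stale data never gets served.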

6 Comments

thanks for the pointer. am not sure I understand this part of your comment: '... moving your current code to a custom management command that you execute as part of your deployment process'. pls elaborate.
@schmoozed what is the part you're not sure about ? "moving the code to a custom management command" or "that you execute as part of your deployment process" ?
let me give it a try. instead of passing the json while rendering the request, I can store it and then have the JavaScript in the frontend read it from there. and this would be faster?
great thanks for your explanation. upvoted it as am also looking for a quicker (or better) pandas way to generate the json.
@schmoozed the idea is indeed to avoid reparsing the same csv file over and over (since it's in your app/static dir I assume this is not something that changes often - else it should NOT be in app/static). It could indeed be served directly by the front server like just any other static file.