Trouble Parsing data from JSON response with Python/Pandas Library

Question

I am reading a JSON response from a web API that returns time series currency data. I need to isolate just the currencies and then calculate an average of them.

API Return:

                                          rates
2016-03-01  {u'USD': 0.66342297, u'AUD': 0.92449052}
2016-03-02  {u'USD': 0.66676313, u'AUD': 0.91501037}
2016-03-03    {u'USD': 0.67240633, u'AUD': 0.914753}
2016-03-04  {u'USD': 0.68185522, u'AUD': 0.91650478}
2016-03-05  {u'USD': 0.68185522, u'AUD': 0.91650478}
2016-03-06  {u'USD': 0.68073566, u'AUD': 0.91793187}
2016-03-07   {u'USD': 0.6794346, u'AUD': 0.90979962}
2016-03-08  {u'USD': 0.67392847, u'AUD': 0.90683613}
2016-03-09  {u'USD': 0.66438164, u'AUD': 0.88859516}
2016-03-10     {u'USD': 0.66666, u'AUD': 0.89461305}
2016-03-11  {u'USD': 0.67452488, u'AUD': 0.89174887}
2016-03-12  {u'USD': 0.67452488, u'AUD': 0.89174887}
2016-03-13  {u'USD': 0.67358755, u'AUD': 0.89251092}
2016-03-14   {u'USD': 0.6667529, u'AUD': 0.88783949}
2016-03-15  {u'USD': 0.66084856, u'AUD': 0.88557738}
2016-03-16  {u'USD': 0.67423336, u'AUD': 0.89318458}
2016-03-17  {u'USD': 0.68315297, u'AUD': 0.89391181}
2016-03-18  {u'USD': 0.67954772, u'AUD': 0.89359166}
2016-03-19  {u'USD': 0.67983322, u'AUD': 0.89388959}
2016-03-20  {u'USD': 0.67951586, u'AUD': 0.89439032}
2016-03-21    {u'USD': 0.67690921, u'AUD': 0.892827}
2016-03-22  {u'USD': 0.67500204, u'AUD': 0.88599621}
2016-03-23  {u'USD': 0.67137479, u'AUD': 0.89131852}
2016-03-24  {u'USD': 0.66980223, u'AUD': 0.89002584}
2016-03-25   {u'USD': 0.6686168, u'AUD': 0.89045449}
2016-03-26   {u'USD': 0.6686168, u'AUD': 0.89045449}
2016-03-27   {u'USD': 0.66853276, u'AUD': 0.8903994}
2016-03-28  {u'USD': 0.67270532, u'AUD': 0.89168637}
2016-03-29  {u'USD': 0.68576241, u'AUD': 0.89832338}
2016-03-30  {u'USD': 0.69112465, u'AUD': 0.90136407}
2016-03-31  {u'USD': 0.69193139, u'AUD': 0.90265425}

Python Code:

import pandas as pd

# url, api_id, startdate, enddate, base and symbols are defined elsewhere
urlread = (url + api_id + '&start=' + startdate + '&end=' + enddate +
           '&base=' + base + '&symbols=' + symbols + '&prettyprint=false')
print(urlread)
# Reads in response from URL
result = pd.read_json(urlread, orient="records")
# Removes columns not required
del result['base']
del result['license']
del result['disclaimer']
del result['start_date']
del result['end_date']
#del json['rates']
# Prints output of JSON to screen for troubleshooting, can be commented out
print(result)
# Writes JSON output to CSV file, formats the date and removes headers
with open("Historical.csv", "w") as output:
    result.to_csv(output, date_format='%d/%m/%Y', header=None)

Output from CSV:

1/03/2016   {u'USD': 0.6634229700000001, u'AUD': 0.92449052}
2/03/2016   {u'USD': 0.66676313, u'AUD': 0.9150103700000001}
3/03/2016   {u'USD': 0.67240633, u'AUD': 0.9147529999999999}
4/03/2016   {u'USD': 0.68185522, u'AUD': 0.91650478}
5/03/2016   {u'USD': 0.68185522, u'AUD': 0.91650478}
6/03/2016   {u'USD': 0.68073566, u'AUD': 0.91793187}
7/03/2016   {u'USD': 0.6794346, u'AUD': 0.90979962}
8/03/2016   {u'USD': 0.67392847, u'AUD': 0.9068361300000001}
9/03/2016   {u'USD': 0.66438164, u'AUD': 0.88859516}
10/03/2016  {u'USD': 0.66666, u'AUD': 0.89461305}

So it's just dumping the output, and I can't seem to parse it to remove the JSON formatting. What I really need is to average all the currencies before dumping to a CSV. How can I achieve that?

What you have there is not JSON; it is the representation of a Python (2) dictionary. Any JSON parser will fail on this.
– Klaus D., Commented Apr 5, 2016 at 6:21
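
To illustrate the comment, here is a minimal sketch (not from the original thread) that takes one cell from the CSV output above and shows that a JSON parser rejects it, while it can still be read back as a Python literal. It assumes Python 3, where `ast.literal_eval` accepts the `u'...'` prefix; the variable names are made up for the example.

```python
import ast
import json

# One cell copied from the CSV output: a Python dict repr, not JSON.
cell = "{u'USD': 0.66342297, u'AUD': 0.92449052}"

try:
    json.loads(cell)
    is_json = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    is_json = False

# literal_eval only evaluates Python literals (dicts, strings, numbers),
# so it is a safe way to turn the repr back into a dict.
rates = ast.literal_eval(cell)

print(is_json)
print(rates["USD"])
```

This is only relevant if the dicts have already been written out as text; inside the DataFrame they are still real dict objects.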

jezrael · Accepted Answer · 2016-04-05 06:49:29Z

1

I think you can use the DataFrame constructor to expand the dicts in the rates column into one column per currency:

print(result)
                                               rates
2016-03-01  {u'USD': 0.66342297, u'AUD': 0.92449052}

result = pd.DataFrame([x for x in result.rates], index=result.index)
print(result)
                 AUD       USD
2016-03-01  0.924491  0.663423

Instead of del you can use drop:

result = result.drop(['base','license','disclaimer','start_date','end_date'], axis=1)

If you want to write with to_csv, the with open block can be omitted:

result.to_csv("Historical.csv", date_format='%d/%m/%Y', header = None)
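
The two tips above can be tried together on a small made-up frame. This is a sketch only: the `base` and `license` values are placeholders, not real API output, and `to_csv` is called without a path so that it returns the CSV text instead of writing a file.

```python
import pandas as pd

# Made-up frame standing in for the API response; values are placeholders.
frame = pd.DataFrame(
    {
        "base": ["n/a", "n/a"],
        "license": ["n/a", "n/a"],
        "USD": [0.66342297, 0.66676313],
    },
    index=pd.to_datetime(["2016-03-01", "2016-03-02"]),
)

# drop returns a new frame; the original is left untouched.
trimmed = frame.drop(["base", "license"], axis=1)

# With no path argument, to_csv returns the CSV text as a string.
csv_text = trimmed.to_csv(date_format="%d/%m/%Y", header=False)
print(csv_text)
```

Passing a filename such as "Historical.csv" as the first argument writes the same text to disk.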

edited Apr 5, 2016 at 6:49

answered Apr 5, 2016 at 6:38

jezrael


1 Comment

C. Thompson Over a year ago

Perfect, thanks for your help, that's exactly what I needed. I didn't realise that pandas had already read it into a Python dict. Putting it into a DataFrame solved the issue, and I can now transform the data as required.

C. Thompson · Accepted Answer · 2016-04-05 09:29:35Z

0

Genius!!

That's what I got stuck on: I didn't realise I was actually printing a Python dict and not raw JSON.

jezrael, your answer of creating a new DataFrame worked. Thanks also for the tips to use drop and to omit with open. It looks much better now.

I have now been able to get the averages for each column, and build a new dataframe which outputs to csv in the format I need.
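
The code for this step was not posted, so the following is only one possible way to do it, sketched on three rows taken from the question. The names `rates`, `averages` and `summary`, and the `average` column label, are made up for the example.

```python
import pandas as pd

# Three days of rates from the question, already expanded into columns.
rates = pd.DataFrame(
    {
        "USD": [0.66342297, 0.66676313, 0.67240633],
        "AUD": [0.92449052, 0.91501037, 0.914753],
    },
    index=pd.to_datetime(["2016-03-01", "2016-03-02", "2016-03-03"]),
)

# mean() works column-wise by default: one average per currency.
averages = rates.mean()

# Turn the Series into a one-column frame, indexed by currency code.
summary = averages.to_frame(name="average")

# With no path argument, to_csv returns the CSV text as a string.
csv_text = summary.to_csv()
print(csv_text)
```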

Thank you, problem solved!

answered Apr 5, 2016 at 9:29

C. Thompson
