4

I am using a api to get 3 json data and I would like to add those datas to 1 panda dataframes

This is my code I am passing in books which contains the book id as x and those 3 id returns me 3 different json objects with all the book information.

for x in books:
newDF = pd.DataFrame()
bookinfo = requests.get( http://books.com/?x})
    books = bookinfo.json() 
    print(books)

This is the 3 arrays I get after printing books,

{  
   u'bookInfo':[  
      {  
         u'book_created':u'2017-05-31',
         u'book_rating':3,
         u'book_sold':0
      },
      {  
         u'book_created':u'2017-05-31',
         u'book_rating':2,
         u'book_sold':1
      },
   ],
   u'book_reading_speed':u'4.29',
   u'book_sale_date':u'2017-05-31'
}
{  
   u'bookInfo':[  
      {  
         u'book_created':u'2017-05-31',
         u'book_rating':3,
         u'book_sold':0
      },
      {  
         u'book_created':u'2017-05-31',
         u'book_rating':2,
         u'book_sold':1
      },
   ],
   u'book_reading_speed':u'4.29',
   u'book_sale_date':u'2017-05-31'
}
{  
   u'bookInfo':[  
      {  
         u'book_created':u'2017-05-31',
         u'book_rating':3,
         u'book_sold':0
      },
      {  
         u'book_created':u'2017-05-31',
         u'book_rating':2,
         u'book_sold':1
      },
   ],
   u'book_reading_speed':u'4.29',
   u'book_sale_date':u'2017-05-31'
}    

What I would like to do is only take u'bookInfo from the three arrays and make them into 1 dataframe

1
  • Could you provide expected output as well? Commented Jun 23, 2017 at 19:44

1 Answer 1

4

IIUC:

pd.concat(
    pd.DataFrame([requests.get( http://books.com/?x}).json() for x in books]),
    ignore_index=True)

Alternatively you can collect JSON responses into a list and do the following:

In [30]: pd.concat([pd.DataFrame(x['bookInfo']) for x in d], ignore_index=True)
Out[30]:
  book_created  book_rating  book_sold
0   2017-05-31            3          0
1   2017-05-31            2          1
2   2017-05-31            3          0
3   2017-05-31            2          1
4   2017-05-31            3          0
5   2017-05-31            2          1

or

In [25]: pd.DataFrame([y for x in d for y in x['bookInfo']])
Out[25]:
  book_created  book_rating  book_sold
0   2017-05-31            3          0
1   2017-05-31            2          1
2   2017-05-31            3          0
3   2017-05-31            2          1
4   2017-05-31            3          0
5   2017-05-31            2          1

where d is a list of dicts, you've posted:

In [20]: d
Out[20]:
[{'bookInfo': [{'book_created': '2017-05-31',
    'book_rating': 3,
    'book_sold': 0},
   {'book_created': '2017-05-31', 'book_rating': 2, 'book_sold': 1}],
  'book_reading_speed': '4.29',
  'book_sale_date': '2017-05-31'},
 {'bookInfo': [{'book_created': '2017-05-31',
    'book_rating': 3,
    'book_sold': 0},
   {'book_created': '2017-05-31', 'book_rating': 2, 'book_sold': 1}],
  'book_reading_speed': '4.29',
  'book_sale_date': '2017-05-31'},
 {'bookInfo': [{'book_created': '2017-05-31',
    'book_rating': 3,
    'book_sold': 0},
   {'book_created': '2017-05-31', 'book_rating': 2, 'book_sold': 1}],
  'book_reading_speed': '4.29',
  'book_sale_date': '2017-05-31'}]
Sign up to request clarification or add additional context in comments.

1 Comment

I am getting a type error with the first piece of code TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.