JSON (nested) object into Pandas Dataframe

Question

I've been trying to flatten a nested JSON object into a pandas dataframe. I've tried a number of methods but still can't find a solution that works.

The JSON is available here: https://www.predictit.org/api/marketdata/all/

The pandas.read_json output looks like this:

1 {'id': 1, 'name': 'Will Mark Cuban run for ...

2 {'id': 1, 'name': 'Will Andrew Cuomo run fo...

3 {'id': 2901, 'name': 'Will a woman be elected ...

4 {'id': 2902, 'name': 'Will the 2020 Democratic...

I want it to have ID, Name, etc. as columns in a pandas dataframe.

I understand that this is a fairly elementary question but I've hit a block and would appreciate any help.

Thank you.

ADDENDUM:

Here are the code bits that I am using:

This works fine:

import urllib3

http = urllib3.PoolManager()
r = http.request('GET', 'https://www.predictit.org/api/marketdata/all/')

Then, I've tried the following:

df_data = pandas.json_normalize(r.data)
# I've tried about a dozen variations of the arguments passed, but I always get the same result, or the same result transposed into 1 row with a very large number of columns.

df_data = pandas.read_json(r.data)
#again, same is true for trying a ton of variable combinations

df_data = pandas.read_json(r.data)
df_dat = df_data.drop('markets') #and
df_dat = df_data.drop([markets])

I am now considering loading the JSON object with the json library and dumping it into a CSV. If the issue persists, I would then manually remove the first column and row and reimport it.

Please let me know if I can provide any additional information.

Can you share the methods you've tried? Also, it's best to keep your question self-contained, so please provide a minimal reproducible example.
– AMC, Commented Mar 19, 2020 at 0:19
In my browser it's XML, yet when I get the page using requests, the result is JSON.
– AMC, Commented Mar 19, 2020 at 3:41
Update: I got it to send XML by setting the "Accept" header to "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8".
– AMC, Commented Mar 19, 2020 at 3:50

Johnny · Accepted Answer · 2020-03-19 02:35:40Z

Take the value under the 'markets' key of the parsed dict, like this (there is no need to use read_json):

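import pandas as pd
import requests

# Display settings only; they do not affect the data.
pd.set_option('display.max_columns', 5)
pd.set_option('display.width', 260)

# The proxy (and verify=False) is specific to my network; drop both if you connect directly.
proxy = {"http": "http://127.0.0.1:1080", "https": "https://127.0.0.1:1080"}

r = requests.get('https://www.predictit.org/api/marketdata/all/', proxies=proxy, verify=False)

# r.json() parses the response body into a dict; 'markets' holds a list with one dict per market.
df = pd.DataFrame(r.json()['markets'])

print(df.head())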

     id                                               name  ...                    timeStamp status
0  2721  Which party will win the 2020 U.S. presidentia...  ...  2020-03-18T22:23:43.4549039   Open
1  2747         Will Mark Cuban run for president in 2020?  ...  2020-03-18T22:23:43.4549039   Open
2  2875       Will Andrew Cuomo run for president in 2020?  ...  2020-03-18T22:23:43.4549039   Open
3  2901    Will a woman be elected U.S. president in 2020?  ...  2020-03-18T22:23:43.4549039   Open
4  2902  Will the 2020 Democratic nominee for president...  ...  2020-03-18T22:23:43.4549039   Open

[5 rows x 8 columns]

The contracts column is still nested; you can use df.apply to open it up.
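As an alternative to apply, pandas.json_normalize can expand the nested column in one call via its record_path and meta arguments. This is only a minimal sketch: the payload below is a made-up stand-in for r.json(), and the field names inside each contract (such as lastTradePrice) are assumptions, so adjust them to the real response. The prefixes are there because both the market and the contract have "id" and "name" keys.

```python
import pandas as pd

# Hypothetical sample that only mimics the shape of r.json();
# the real response has more fields and different values.
payload = {
    "markets": [
        {
            "id": 2721,
            "name": "Market A",
            "contracts": [
                {"id": 4389, "name": "Democratic", "lastTradePrice": 0.55},
                {"id": 4390, "name": "Republican", "lastTradePrice": 0.47},
            ],
        },
        {
            "id": 2747,
            "name": "Market B",
            "contracts": [
                {"id": 4598, "name": "Yes", "lastTradePrice": 0.02},
            ],
        },
    ]
}

# One row per contract; the market-level id and name are repeated on each row.
contracts = pd.json_normalize(
    payload["markets"],
    record_path="contracts",   # the nested list to expand into rows
    meta=["id", "name"],       # market-level fields to carry along
    record_prefix="contract_", # avoids the id/name clash with the meta fields
    meta_prefix="market_",
)

print(contracts)
```

With the live data you would pass r.json()['markets'] in place of payload["markets"].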

edited Mar 19, 2020 at 2:35

answered Mar 19, 2020 at 2:24


6 Comments

ooitzoo Over a year ago

Thanks. This is great. I am trying it now. Why do I need the proxy setting?

Johnny Over a year ago

@ooitzoo You don't need that. I only need it because I'm in China.

ooitzoo Over a year ago

Got any idea how to break out the nested json objects? You mentioned df.apply but I am not seeing how to do that.

AMC Over a year ago

@ooitzoo "Got any idea how to break out the nested json objects?" Be careful, there really isn't such a thing as a "JSON object" here, those are just plain old Python dicts.

Johnny Over a year ago

@ooitzoo See this: pandas.pydata.org/pandas-docs/version/0.24.2/reference/api/… eg: df['key'] = df.apply(lambda x: x['key'])
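One possible reading of the apply hint above, sketched under assumptions: each cell of contracts is taken to be a list of plain dicts, and the small frame below is a made-up stand-in for pd.DataFrame(r.json()['markets']). It calls apply on the column itself (Series.apply) after DataFrame.explode, because DataFrame.apply with its default axis hands whole columns to the function rather than individual cells.

```python
import pandas as pd

# Hypothetical stand-in for pd.DataFrame(r.json()['markets']).
df = pd.DataFrame(
    [
        {"id": 1, "name": "Market A",
         "contracts": [{"id": 10, "name": "Yes"}, {"id": 11, "name": "No"}]},
        {"id": 2, "name": "Market B",
         "contracts": [{"id": 20, "name": "Yes"}]},
    ]
)

# explode gives one row per list element, so each cell now holds a single dict.
exploded = df.explode("contracts", ignore_index=True)

# Series.apply runs the lambda on each dict and pulls out one key at a time.
exploded["contract_name"] = exploded["contracts"].apply(lambda c: c["name"])
exploded["contract_id"] = exploded["contracts"].apply(lambda c: c["id"])

# The raw dict column is no longer needed once its keys are columns.
flat = exploded.drop(columns="contracts")

print(flat)
```

A market with an empty contracts list would come out of explode as NaN and break the lambda, so filter those rows first if the real data contains any.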
