Nested JSON into DataFrame (pandas)

Question

I'm importing the following JSON. It is not super extensive, but I have included only a small piece here for demonstration:

{'request': {'Target': 'Offer',
  'Format': 'json',
  'Service': 'XXXXX',
  'Version': '2',
  'NetworkToken': 'uu238939871',
  'Method': 'findAll',
  'fields': ['preview_url',
   'sub1',
   'sub2',
   'sub3',
   'sub4',
   'sub5',
   'sub6',
   'sub7',
   'sub8',
   'sub9',
   'sub10',
   'sub11',
   'sub12'],
  'filters': {'status': 'active', 'advertiser_id': '626'}},
 'response': {'status': 1,
  'httpStatus': 200,
  'data': {'231': {'Offer': {'sub1': '231',
     'sub2': '626',
     'sub3': 'XXXXXXXX',
     'sub4': 'XXXXXXX',
     'sub5': 'https://play.google.com/store/apps',
     'sub6': 'https://XXXXX',
     'sub7': '0.39',
     'sub8': 'active',
     'sub9': 'None',
     'sub10': '0',
     'sub11': '0',
     'sub12': '1',
     'sub13': 'XXXXX'}},
   '216': {'Offer': {'sub1': '216',
     'sub2': '626',
     'sub3': 'XXXXXXXX',
     'sub4': '682666',
     'sub5': 'https://play.google.com/store/appsg',
     'sub6': 'https://XXXXXXX',
     'sub7': '1.96000',
     'sub8': 'active',
     'sub9': 'None',
     'sub10': '0',
     'sub11': '0',
     'sub12': '1',
     'sub13': 'XXXXX'}}},
  'errors': [],
  'errorMessage': None}}

I'm trying to get this into a data frame using df = pd.json_normalize(data); however, all I get is a df with 1 row and more than 400 columns, which is not ideal. I also tried flatten, but got an error that just says 'Data':

from flatten_json import flatten
dic_flattened = (flatten(d, '.') for d in data['data'])
df = pd.DataFrame(dic_flattened)

The final data frame should have each sub in a column and the fields in rows, for example:

Sub1	Sub2	Sub3
231	626	XXXXXXXX
216	626	XXXXXXXX

It was just a representation. Basically, I want all the subs as columns and all the data in rows. I fixed it for better visualization.
– Mariana Rodrigues, Commented Apr 26, 2022 at 18:05

user7864386 · Accepted Answer · 2022-04-26 18:07:52Z


It's simpler if you reformat the dictionary in Python first before constructing the DataFrame. Since you want the Sub keys, you can simply extract them using a list comprehension:

out = pd.DataFrame([v['Offer'] for v in my_data['response']['data'].values() if isinstance(v, dict)])

Output:

  sub1 sub2      sub3     sub4                                 sub5             sub6     sub7    sub8  sub9 sub10 sub11 sub12  sub13
0  231  626  XXXXXXXX  XXXXXXX   https://play.google.com/store/apps    https://XXXXX     0.39  active  None     0     0     1  XXXXX
1  216  626  XXXXXXXX   682666  https://play.google.com/store/appsg  https://XXXXXXX  1.96000  active  None     0     0     1  XXXXX
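For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. The `my_data` dict is a trimmed-down stand-in for the response in the question (only three of the sub fields, values shortened), and the `pd.to_numeric` step at the end is an optional extra that is not part of the one-liner above; it is there only because every value in the sample arrives as a string.

```python
import pandas as pd

# Trimmed-down stand-in for the API response in the question (assumption:
# the real payload has the same request/response/data nesting).
my_data = {
    "request": {"Target": "Offer", "Method": "findAll"},
    "response": {
        "status": 1,
        "httpStatus": 200,
        "data": {
            "231": {"Offer": {"sub1": "231", "sub2": "626", "sub7": "0.39"}},
            "216": {"Offer": {"sub1": "216", "sub2": "626", "sub7": "1.96000"}},
        },
        "errors": [],
        "errorMessage": None,
    },
}

# The offers live under response -> data, keyed by offer id.
offers = my_data["response"]["data"]

# Keep only the inner 'Offer' dict of each entry: one dict per row.
rows = [v["Offer"] for v in offers.values() if isinstance(v, dict)]

# A list of flat dicts becomes one row per dict, one column per key.
out = pd.DataFrame(rows)

# Optional: the values are strings, so convert numeric columns if needed.
out["sub7"] = pd.to_numeric(out["sub7"])

print(out)
```

Note that the offers sit under `my_data['response']['data']`, not directly under a top-level `'data'` key, which is likely why indexing with `data['data']` in the question raised an error.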


2 Comments

Mariana Rodrigues Over a year ago

Thank you, that helped a lot. I did try to do something similar, but I don't know what isinstance does.

user7864386 Over a year ago

@MarianaRodrigues isinstance checks for type. In this case, it's checking for dict type since the desired items exist only in dictionaries.
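To make the type check concrete, here is a small sketch. The `items` dict is hypothetical: the sample in the question contains only dict entries, so the stray `'note'` and `'count'` values are invented purely to show what the filter would skip if such entries appeared.

```python
# Hypothetical payload: two offer entries plus two non-dict values
# (the 'note' and 'count' keys are made up for illustration).
items = {
    "231": {"Offer": {"sub1": "231"}},
    "216": {"Offer": {"sub1": "216"}},
    "note": "not an offer",
    "count": 2,
}

# isinstance(value, dict) is True only for dict values, so the string and
# the integer are skipped instead of failing on v['Offer'].
kept = [v["Offer"] for v in items.values() if isinstance(v, dict)]

print(kept)
```

Without the check, `v['Offer']` would raise a `TypeError` as soon as the loop reached the string or the integer.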
