
I have spent hours trying to un-nest columns in my dataframe coming from a json file and still could not make it work.

I queried a website using GraphQL and loaded the response into the variable json:

json = resp.json()

Next, I loaded the data into a dataframe using json_normalize:

df = pd.DataFrame.from_dict(json_normalize(resp.json()), orient='columns')

I renamed the columns.

However, there were still nested columns within the dataframe, namely 'rules' and 'floors'.

I then tried un-nesting the columns using several approaches I have seen here on Stack Overflow and elsewhere on the internet:

I tried several variants of json_normalize with a nested record path, with and without metadata, but none of them reached the nested values:

  json_normalize(json, ['floors', ['units'],['features']])

and this:

 json_normalize(data=json, record_path=['floors', 'units','features'])

In most cases however, I got TypeError: string indices must be integers.

I tried assigning the values to columns separately, but this failed for cases where some of them were null:

df['pets allowed'] = json['data']['offerAggregate']['property_aggregate']['property']['rules']['code' == 'pets-allowed']['exists']

I also tried splitting the columns by key words such as 'Code' but this only returned Null

Optimally, I would like to make option #1 work, but I truly tried so many versions and still do not have a result, since I am not sure how to appropriately define the path to the nested column.

Here's the full schema:

{'data': {'offerAggregate': {'accommodation_offer': {'contract': {'type': 'fortnight',
     'exclusive': False,
     'is_instant_booking': False,
     'commission': 0.08,
     'deposit': {'pay_to': 'accommodation-provider',
      'type': 'equal-to-first-payment',
      'value': {'amount': 0, 'currency_code': ''}},
     'admin_fee': {'exact_value': True,
      'value': {'amount': 0, 'currency_code': 'EUR'}},
     'fixed_unitary': {'extra_per_guest': {'amount': 0, 'currency_code': ''}}},
    'reference_price': {'amount': '25000', 'currency_code': 'EUR'},
    'requisites': {'conditions': {'cancellation_policy': 'moderate',
      'minimum_nights': 27,
      'max_guests': 2}},
    'costs': {'bills': {'water': {'included': True},
      'electricity': {'included': True},
      'gas': {'included': True},
      'internet': {'included': True}},
     'services': {'cleaning': {'periodicity': 'weekly'}}}},
   'accommodation_provider': {'stats': {'bookings': {'accepted': {'total': 2},
      'requested': {'total': 10},
      'rejected': {'total': 1},
      'confirmed': {'total': 0}}},
    'created': {'at': '2018-11-02 16:51:22'}},
   'property_aggregate': {'property': {'id': '114087',
     'landlord_resident': {'gender': '', 'age_range': '', 'occupation': ''},
     'floors': [{'units': [{'features': [{'Code': 'fridge', 'Exists': True},
          {'Code': 'freezer', 'Exists': True},
          {'Code': 'oven', 'Exists': True},
          {'Code': 'stove', 'Exists': True},
          {'Code': 'washing-machine', 'Exists': True},
          {'Code': 'window', 'Exists': True},
          {'Code': 'balcony', 'Exists': False},
          {'Code': 'table', 'Exists': True},
          {'Code': 'chairs', 'Exists': True}]},
        {'features': [{'Code': 'bathtub', 'Exists': False},
          {'Code': 'shower', 'Exists': True},
          {'Code': 'sink', 'Exists': True},
          {'Code': 'toilet', 'Exists': True},
          {'Code': 'window', 'Exists': True}]},
        {'features': [{'Code': 'wardrobe', 'Exists': True},
          {'Code': 'chest-of-drawers', 'Exists': False},
          {'Code': 'desk', 'Exists': True},
          {'Code': 'chairs', 'Exists': True},
          {'Code': 'sofa', 'Exists': False},
          {'Code': 'sofa-bed', 'Exists': False},
          {'Code': 'window', 'Exists': True},
          {'Code': 'balcony', 'Exists': False},
          {'Code': 'tv', 'Exists': False},
          {'Code': 'lock', 'Exists': True}]},
        {'features': [{'Code': 'wardrobe', 'Exists': True},
          {'Code': 'chest-of-drawers', 'Exists': False},
          {'Code': 'desk', 'Exists': True},
          {'Code': 'chairs', 'Exists': True},
          {'Code': 'sofa', 'Exists': False},
          {'Code': 'sofa-bed', 'Exists': False},
          {'Code': 'window', 'Exists': True},
          {'Code': 'balcony', 'Exists': True},
          {'Code': 'tv', 'Exists': False},
          {'Code': 'lock', 'Exists': True}]},
        {'features': [{'Code': 'wardrobe', 'Exists': True},
          {'Code': 'chest-of-drawers', 'Exists': False},
          {'Code': 'desk', 'Exists': False},
          {'Code': 'chairs', 'Exists': False},
          {'Code': 'sofa', 'Exists': False},
          {'Code': 'sofa-bed', 'Exists': False},
          {'Code': 'window', 'Exists': True},
          {'Code': 'balcony', 'Exists': False},
          {'Code': 'tv', 'Exists': False},
          {'Code': 'lock', 'Exists': True}]},
        {'features': [{'Code': 'wardrobe', 'Exists': True},
          {'Code': 'chest-of-drawers', 'Exists': False},
          {'Code': 'desk', 'Exists': False},
          {'Code': 'chairs', 'Exists': False},
          {'Code': 'sofa', 'Exists': False},
          {'Code': 'sofa-bed', 'Exists': False},
          {'Code': 'window', 'Exists': True},
          {'Code': 'balcony', 'Exists': True},
          {'Code': 'tv', 'Exists': False},
          {'Code': 'lock', 'Exists': True}]}]}],
     'rules': [{'code': 'overnight-guests-allowed', 'exists': False},
      {'code': 'pets-allowed', 'exists': False},
      {'code': 'smoking-allowed', 'exists': False}],
     'typology': {'area': 0,
      'accommodation_type_code': 'private',
      'type_code': 'apartment',
      'number_of_bedrooms': 4,
      'number_of_bathrooms': 1},
     'location': {'neighborhood_id': 229,
      'geo': {'latitude': 38.7514768, 'longitude': -9.2031683},
      'address': {'postal_code': '1500-109'}},
     'verification': {'verified': True}}}}}}

Thank you for your time in advance! Any help is highly appreciated!

  • I think that you really want to use pd.read_json instead: pandas.pydata.org/pandas-docs/stable/generated/… Commented Nov 6, 2018 at 18:10
  • Hey kevin, I tried using pd.read_json, but I couldn't apply it since my object is a Python dict. I did not find a way to transform the request response to a true JSON. The command resp.json() seems to have transformed it to a Python object. Commented Nov 6, 2018 at 18:30
  • This then returns me a dataframe with 2 columns: 'offerAggregate' in the 'index' column and the whole rest of the dictionary in the 'data' column Commented Nov 6, 2018 at 18:46

1 Answer


json_normalize stops at floors and rules because they contain lists rather than dictionaries. By default, json_normalize only flattens nested dictionaries into columns; a list is left as a single value in its cell.
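
To see this behaviour in isolation, here is a minimal sketch. The `sample` dict is a hypothetical, trimmed-down stand-in for the response in the question, and `pd.json_normalize` assumes pandas 1.0 or later (older versions import `json_normalize` from `pandas.io.json`).

```python
import pandas as pd

# Hypothetical, trimmed-down sample that mirrors the shape of the question's data.
sample = {
    "property": {
        "id": "114087",
        "typology": {"type_code": "apartment", "number_of_bedrooms": 4},
        "rules": [
            {"code": "pets-allowed", "exists": False},
            {"code": "smoking-allowed", "exists": False},
        ],
    }
}

flat = pd.json_normalize(sample)

# Nested dicts are expanded into dotted column names,
# while the list under 'rules' stays in one cell as a Python list.
print(flat.columns.tolist())
print(type(flat.loc[0, "property.rules"]))
```

The dict under 'typology' is spread over several columns, but 'rules' remains one object column, which is what the screenshots in the question appear to show.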

To normalize this JSON, you need to convert those lists into dictionary-like structures. For example, for rules, instead of this structure:

[{'code': 'overnight-guests-allowed', 'exists': False},
  {'code': 'pets-allowed', 'exists': False},
  {'code': 'smoking-allowed', 'exists': False}]

You will want this structure:

{'overnight-guests-allowed': False,
 'pets-allowed': False,
 'smoking-allowed': False}
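
One possible way to do that conversion is sketched below. The `data` dict is a hypothetical, shortened stand-in for `resp.json()`, the variable names are mine, and `pd.json_normalize` again assumes pandas 1.0 or later. For floors, where the same feature code can appear in several units, the sketch does not build a dict; it uses `record_path` as an alternative, starting from the `property` dict rather than from the top of the response.

```python
import pandas as pd

# Hypothetical, shortened stand-in for resp.json() from the question.
data = {"data": {"offerAggregate": {"property_aggregate": {"property": {
    "id": "114087",
    "floors": [{"units": [
        {"features": [{"Code": "fridge", "Exists": True},
                      {"Code": "balcony", "Exists": False}]},
        {"features": [{"Code": "shower", "Exists": True}]},
    ]}],
    "rules": [
        {"code": "pets-allowed", "exists": False},
        {"code": "smoking-allowed", "exists": False},
    ],
}}}}}

prop = data["data"]["offerAggregate"]["property_aggregate"]["property"]

# One row per feature: record_path walks floors -> units -> features,
# so it has to start at the dict that actually holds the 'floors' key.
features = pd.json_normalize(prop, record_path=["floors", "units", "features"])

# Convert the list of {'code': ..., 'exists': ...} records into {code: exists}.
# `or []` keeps this working when 'rules' is missing or null.
prop["rules"] = {r["code"]: r["exists"] for r in (prop.get("rules") or [])}

# With 'rules' now a dict, json_normalize expands it into one column per rule.
df = pd.json_normalize(data)
rules_prefix = "data.offerAggregate.property_aggregate.property.rules."
rule_columns = [c for c in df.columns if c.startswith(rules_prefix)]
print(df[rule_columns])
print(features)
```

The dict comprehension assumes each rule code appears only once; duplicate codes would silently overwrite each other.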

1 Comment

@HannahKorts That's great happy to hear!
