1

I am trying to parse a JSON string, down to its lowest level of nesting, into a pandas DataFrame.

Attempts

First I tried read_json:

jsonData = pd.read_json(apiRequest)

But a large chunk of the data is still nested under networkRank.

Then I tried json_normalize, but this time I am missing the data one level up, such as latitude and longitude.

result = json_normalize(json_data['networkRank'])

I also tried to parse "into" the nested structure and construct the data frame from scratch, but this code results in an error:

result_nested = json_normalize(json_data, 'networkRank', ['longitude', 'latitude', ['networkRank', 'type3G', 'downloadSpeed']])

Goal

To parse the JSON data into a flat table with all fields, meaning that the latitude, longitude and distance values are appended to each row of the json_normalize result (figure 2).

JSON String

{'apiVersion': '2',
 'distance': 10,
 'latitude': '-6.162959',
 'longitude': '35.751607',
 'networkRank': [{'networkId': '6402',
   'networkName': 'Vodacom',
   'type3G': {'averageRssiAsu': '9.5429091136',
    'averageRssiDb': '-69.5664329624972',
    'downloadSpeed': '1508.1304',
    'networkId': '6402',
    'networkName': 'Vodacom',
    'networkType': '3',
    'pingTime': '320.9600',
    'reliability': '0.804236452826138',
    'sampleSizeRSSI': '948',
    'sampleSizeSpeed': '29',
    'uploadSpeed': '893.7692'}},
  {'networkId': '6400',
   'networkName': 'tiGO',
   'type3G': {'averageRssiAsu': '15.3537142857',
    'averageRssiDb': '-61.4563389583101',
    'downloadSpeed': '516.0000',
    'networkId': '6400',
    'networkName': 'tiGO',
    'networkType': '3',
    'pingTime': '259.0000',
    'reliability': '0.911904765537807',
    'sampleSizeRSSI': '935',
    'sampleSizeSpeed': '21',
    'uploadSpeed': '320.4211'}},
  {'networkId': '6403',
   'networkName': 'Airtel',
   'type3G': {'averageRssiAsu': '13.2729999375',
    'averageRssiDb': '-58.1521092977699',
    'downloadSpeed': '1080.2500',
    'networkId': '6403',
    'networkName': 'Airtel',
    'networkType': '3',
    'pingTime': '194.5556',
    'reliability': '0.554680264185345',
    'sampleSizeRSSI': '587',
    'sampleSizeSpeed': '21',
    'uploadSpeed': '572.1579'}}],
 'network_type': None,
 'perMinuteCurrent': 0,
 'perMinuteLimit': 10,
 'perMonthCurrent': 0,
 'perMonthLimit': 2000}

3 Answers

4

This function recursively calls itself to flatten dictionaries and lists.

from collections import OrderedDict

def flatten(json_object, container=None, name=''):
    if container is None:
        container = OrderedDict()
    if isinstance(json_object, dict):
        for key in json_object:
            flatten(json_object[key], container=container, name=name + key + '_')
    elif isinstance(json_object, list):
        for n, item in enumerate(json_object, 1):
            flatten(item, container=container, name=name + str(n) + '_')
    else:
        container[str(name[:-1])] = str(json_object)
    return container

Examples:

flatten([1, 2, 3])
OrderedDict([('1', '1'), ('2', '2'), ('3', '3')])

flatten([1, 2, 3], name='x')
OrderedDict([('x1', '1'), ('x2', '2'), ('x3', '3')])

flatten({'a': [1, 2, 3], 'b': 4, 'c': {'d': [5, 6], 'e': 7}}, name='x')
OrderedDict([('xa_1', '1'),
             ('xa_2', '2'),
             ('xa_3', '3'),
             ('xc_e', '7'),
             ('xc_d_1', '5'),
             ('xc_d_2', '6'),
             ('xb', '4')])

Response:

# j = the parsed JSON (a Python dict), not the raw string
>>> pd.DataFrame(flatten(j), index=[0]).T
                                                      0
perMinuteLimit                                       10
distance                                             10
perMonthCurrent                                       0
longitude                                     35.751607
perMonthLimit                                      2000
latitude                                      -6.162959
perMinuteCurrent                                      0
networkRank_1_networkId                            6402
networkRank_1_type3G_sampleSizeSpeed                 29
networkRank_1_type3G_averageRssiAsu        9.5429091136
networkRank_1_type3G_pingTime                  320.9600
networkRank_1_type3G_networkType                      3
networkRank_1_type3G_averageRssiDb    -69.5664329624972
networkRank_1_type3G_networkName                Vodacom
networkRank_1_type3G_networkId                     6402
networkRank_1_type3G_downloadSpeed            1508.1304
networkRank_1_type3G_uploadSpeed               893.7692
networkRank_1_type3G_reliability      0.804236452826138
networkRank_1_type3G_sampleSizeRSSI                 948
networkRank_1_networkName                       Vodacom
networkRank_2_networkId                            6400
networkRank_2_type3G_sampleSizeSpeed                 21
networkRank_2_type3G_averageRssiAsu       15.3537142857
networkRank_2_type3G_pingTime                  259.0000
networkRank_2_type3G_networkType                      3
networkRank_2_type3G_averageRssiDb    -61.4563389583101
networkRank_2_type3G_networkName                   tiGO
networkRank_2_type3G_networkId                     6400
networkRank_2_type3G_downloadSpeed             516.0000
networkRank_2_type3G_uploadSpeed               320.4211
networkRank_2_type3G_reliability      0.911904765537807
networkRank_2_type3G_sampleSizeRSSI                 935
networkRank_2_networkName                          tiGO
networkRank_3_networkId                            6403
networkRank_3_type3G_sampleSizeSpeed                 21
networkRank_3_type3G_averageRssiAsu       13.2729999375
networkRank_3_type3G_pingTime                  194.5556
networkRank_3_type3G_networkType                      3
networkRank_3_type3G_averageRssiDb    -58.1521092977699
networkRank_3_type3G_networkName                 Airtel
networkRank_3_type3G_networkId                     6403
networkRank_3_type3G_downloadSpeed            1080.2500
networkRank_3_type3G_uploadSpeed               572.1579
networkRank_3_type3G_reliability      0.554680264185345
networkRank_3_type3G_sampleSizeRSSI                 587
networkRank_3_networkName                        Airtel
network_type                                       None
apiVersion                                            2
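
The output above is one long column. If the goal is one row per network, as in the question, the same recursive idea can be applied to each entry of networkRank and merged with the top-level fields. A minimal sketch, assuming the response is already parsed into a dict; the helper name flatten_rows and the trimmed sample data are illustrative only, and values are kept as-is rather than converted to str:

```python
def flatten(obj, container=None, name=''):
    """Recursively flatten nested dicts/lists into one dict with '_'-joined keys."""
    if container is None:
        container = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flatten(value, container, name + str(key) + '_')
    elif isinstance(obj, list):
        for n, item in enumerate(obj, 1):
            flatten(item, container, name + str(n) + '_')
    else:
        # Drop the trailing '_' that was appended on the way down.
        container[name[:-1]] = obj
    return container


def flatten_rows(data, list_key):
    """Hypothetical helper: build one flat dict per item of data[list_key]."""
    # Every top-level field except the list itself is shared by all rows.
    shared = flatten({k: v for k, v in data.items() if k != list_key})
    rows = []
    for item in data[list_key]:
        row = dict(shared)
        row.update(flatten(item))
        rows.append(row)
    return rows


# Trimmed sample in the shape of the question's data.
j = {
    'distance': 10,
    'latitude': '-6.162959',
    'longitude': '35.751607',
    'networkRank': [
        {'networkId': '6402', 'networkName': 'Vodacom',
         'type3G': {'downloadSpeed': '1508.1304', 'pingTime': '320.9600'}},
        {'networkId': '6400', 'networkName': 'tiGO',
         'type3G': {'downloadSpeed': '516.0000', 'pingTime': '259.0000'}},
    ],
}

rows = flatten_rows(j, 'networkRank')
for row in rows:
    print(row)
```

Passing rows to pd.DataFrame(rows) should then give one row per network, with latitude, longitude and distance repeated on each.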

0

1) Parse the JSON string into a Python structure.

2) Iterate over the 'networkRank' list of dictionaries and copy each parent-level key you want into every dictionary:

for data_row in deserialized_json['networkRank']:
    data_row['latitude'] = deserialized_json['latitude']
    # etc

3) Build the DataFrame from that list:

yourdataframe = pd.DataFrame(deserialized_json['networkRank'])
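
Put together, the steps might look like the sketch below. It assumes the API returns real JSON text (the listing in the question is a Python dict repr, which json.loads would not accept); the variable raw and the trimmed sample are illustrative. Note that type3G stays a nested dict in each row with this approach.

```python
import json

# Trimmed stand-in for the raw API response (assumed to be valid JSON text).
raw = '''{"distance": 10, "latitude": "-6.162959", "longitude": "35.751607",
 "networkRank": [
   {"networkId": "6402", "networkName": "Vodacom",
    "type3G": {"downloadSpeed": "1508.1304"}},
   {"networkId": "6400", "networkName": "tiGO",
    "type3G": {"downloadSpeed": "516.0000"}}
 ]}'''

# 1) Parse the JSON string into Python objects.
deserialized_json = json.loads(raw)

# 2) Copy the parent-level fields into every networkRank entry.
for data_row in deserialized_json['networkRank']:
    for key in ('latitude', 'longitude', 'distance'):
        data_row[key] = deserialized_json[key]

# 3) The list of dicts can now be handed to pd.DataFrame(...).
rows = deserialized_json['networkRank']
print(rows[0])
```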

0

Is that what you want?

In [22]: df = json_normalize(json_data['networkRank'])

In [23]: df['distance'] = json_data['distance']

In [24]: df['latitude'] = json_data['latitude']

In [25]: df['longitude'] = json_data['longitude']

In [26]: df
Out[26]:
  networkId networkName type3G.averageRssiAsu type3G.averageRssiDb  \
0      6402     Vodacom          9.5429091136    -69.5664329624972
1      6400        tiGO         15.3537142857    -61.4563389583101
2      6403      Airtel         13.2729999375    -58.1521092977699

  type3G.downloadSpeed type3G.networkId type3G.networkName type3G.networkType  \
0            1508.1304             6402            Vodacom                  3
1             516.0000             6400               tiGO                  3
2            1080.2500             6403             Airtel                  3

  type3G.pingTime type3G.reliability type3G.sampleSizeRSSI  \
0        320.9600  0.804236452826138                   948
1        259.0000  0.911904765537807                   935
2        194.5556  0.554680264185345                   587

  type3G.sampleSizeSpeed type3G.uploadSpeed  distance   latitude  longitude
0                     29           893.7692        10  -6.162959  35.751607
1                     21           320.4211        10  -6.162959  35.751607
2                     21           572.1579        10  -6.162959  35.751607
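
The same table can probably be built in a single call with json_normalize's record_path and meta arguments, which is close to what the question attempted. A sketch, assuming a pandas version that exposes pd.json_normalize (1.0 or later; older versions import it from pandas.io.json) and using a trimmed copy of the data:

```python
import pandas as pd

# Trimmed sample in the shape of the question's data.
json_data = {
    'distance': 10,
    'latitude': '-6.162959',
    'longitude': '35.751607',
    'networkRank': [
        {'networkId': '6402', 'networkName': 'Vodacom',
         'type3G': {'downloadSpeed': '1508.1304', 'pingTime': '320.9600'}},
        {'networkId': '6400', 'networkName': 'tiGO',
         'type3G': {'downloadSpeed': '516.0000', 'pingTime': '259.0000'}},
    ],
}

df = pd.json_normalize(
    json_data,
    record_path='networkRank',                   # one row per list entry
    meta=['distance', 'latitude', 'longitude'],  # top-level fields repeated per row
)
print(df.columns.tolist())
```

The attempt in the question likely failed because its meta path ['networkRank', 'type3G', 'downloadSpeed'] walks through networkRank, which is a list rather than a dict. The fields inside each record are already flattened by record_path, so meta only needs the top-level keys.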
