How do you extract some data from a JSON file using Python?

Question

I have this json file:

print(data)

{'entityId': 'clusterId123', 
'displayName': 'dev_cluster', 
'firstSeenTms': 1584113406351, 
'lastSeenTms': 1627524312116,  
'properties': {'detectedName': 'dev_cluster'}, 
'tags': [], 
'icon': {'primaryIconType': 'hypervisor'}, 
'toRelationships': {
    'isMemberOf': [
        {'id': 'HYPERVISOR_123', 'type': 'HYPERVISOR'}, 
        {'id': 'HYPERVISOR_234', 'type': 'HYPERVISOR'}, 
        {'id': 'HYPERVISOR_345', 'type': 'HYPERVISOR'}
        ]
    }
}

I need to create a data frame that looks like this:

clusterId,  clusterName, hypervisorId
clusterId123 dev_cluster HYPERVISOR_123
clusterId123 dev_cluster HYPERVISOR_234
clusterId123 dev_cluster HYPERVISOR_345

As you can see, clusterId and clusterName repeat, but hypervisorId changes, just like in the data file.

I am doing this:

# create an empty list
cluList = []
# append elements to the list
cluList.append([data['entityId'], data['displayName']])

I don't know how to pull HYPERVISOR_123, HYPERVISOR_234, and HYPERVISOR_345 from this data set. Any guidance is appreciated.

That's not JSON, it's just a Python dictionary

– Barmar, Commented Jul 29, 2021 at 2:55
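
To illustrate the distinction made in this comment, here is a minimal sketch of how JSON text becomes a Python dictionary. The `raw` string is a hypothetical, shortened stand-in for the contents of the file, which the question does not show.

```python
import json

# Hypothetical JSON text standing in for the contents of a file on disk.
raw = (
    '{"entityId": "clusterId123", "displayName": "dev_cluster", '
    '"toRelationships": {"isMemberOf": '
    '[{"id": "HYPERVISOR_123", "type": "HYPERVISOR"}]}}'
)

# json.loads parses a JSON string; json.load(f) does the same for an open file.
data = json.loads(raw)

# After parsing, data is an ordinary dict and is indexed like one.
first_id = data["toRelationships"]["isMemberOf"][0]["id"]
print(type(data).__name__, first_id)
```

Once the text is parsed, everything else in this thread works on the resulting dict, regardless of where it came from.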

jizhihaoSAMA · Accepted Answer · 2021-07-29 03:01:50Z

1

Use a generator expression that yields one dict per hypervisor:

import pandas as pd

data = {'entityId': 'clusterId123', 
'displayName': 'dev_cluster', 
'firstSeenTms': 1584113406351, 
'lastSeenTms': 1627524312116,  
'properties': {'detectedName': 'dev_cluster'}, 
'tags': [], 
'icon': {'primaryIconType': 'hypervisor'}, 
'toRelationships': {
    'isMemberOf': [
        {'id': 'HYPERVISOR_123', 'type': 'HYPERVISOR'}, 
        {'id': 'HYPERVISOR_234', 'type': 'HYPERVISOR'}, 
        {'id': 'HYPERVISOR_345', 'type': 'HYPERVISOR'}
        ]
    }
}

df = pd.DataFrame({
    "clusterId": data["entityId"],
    "clusterName": data["displayName"],
    "hypervisorId": _id["id"]
} for _id in data["toRelationships"]["isMemberOf"])

df

And the result:

    clusterId       clusterName   hypervisorId
0   clusterId123    dev_cluster   HYPERVISOR_123
1   clusterId123    dev_cluster   HYPERVISOR_234
2   clusterId123    dev_cluster   HYPERVISOR_345
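
As a possible alternative, not part of the original answer, `pd.json_normalize` can flatten the nested list directly. This is only a sketch: the dict is trimmed to the keys that matter here, and `flat` and `df_alt` are illustrative names.

```python
import pandas as pd

# Trimmed copy of the dict from the question (only the keys used below).
data = {
    "entityId": "clusterId123",
    "displayName": "dev_cluster",
    "toRelationships": {
        "isMemberOf": [
            {"id": "HYPERVISOR_123", "type": "HYPERVISOR"},
            {"id": "HYPERVISOR_234", "type": "HYPERVISOR"},
            {"id": "HYPERVISOR_345", "type": "HYPERVISOR"},
        ]
    },
}

# record_path points at the nested list; meta repeats top-level fields per row.
flat = pd.json_normalize(
    data,
    record_path=["toRelationships", "isMemberOf"],
    meta=["entityId", "displayName"],
)

# Rename to the column names asked for in the question and drop "type".
df_alt = flat.rename(
    columns={
        "entityId": "clusterId",
        "displayName": "clusterName",
        "id": "hypervisorId",
    }
)[["clusterId", "clusterName", "hypervisorId"]]

print(df_alt)
```

This may be more convenient when the records carry many fields, since no per-field dict has to be written out by hand.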


2 Comments

user1471980 Over a year ago

Thank you for the answer. Quick question: if I loop over a bunch of data files, do I have to append each result to an empty data frame?

jizhihaoSAMA Over a year ago

@user1471980 data files? Could you be more specific?
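
If the follow-up question means one such dict per file, one possible approach is to collect plain rows in a list and build the data frame once at the end, rather than appending to an empty data frame inside the loop. The sketch below assumes this reading. The `clusters` list and the `cluster_rows` helper are hypothetical, and the second cluster is made-up sample data.

```python
import pandas as pd


def cluster_rows(data):
    """Return one row dict per hypervisor found in a single cluster dict."""
    return [
        {
            "clusterId": data["entityId"],
            "clusterName": data["displayName"],
            "hypervisorId": member["id"],
        }
        for member in data["toRelationships"]["isMemberOf"]
    ]


# Hypothetical stand-in for dicts loaded from several files.
clusters = [
    {
        "entityId": "clusterId123",
        "displayName": "dev_cluster",
        "toRelationships": {
            "isMemberOf": [
                {"id": "HYPERVISOR_123", "type": "HYPERVISOR"},
                {"id": "HYPERVISOR_234", "type": "HYPERVISOR"},
            ]
        },
    },
    {
        "entityId": "clusterId456",
        "displayName": "test_cluster",
        "toRelationships": {
            "isMemberOf": [{"id": "HYPERVISOR_999", "type": "HYPERVISOR"}]
        },
    },
]

# Accumulate rows from every cluster, then create the DataFrame a single time.
rows = []
for data in clusters:
    rows.extend(cluster_rows(data))

df_all = pd.DataFrame(rows)
print(df_all)
```

Building the frame once avoids repeatedly copying a growing data frame on each iteration.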

Barmar · Accepted Answer · 2021-07-29 03:00:15Z

1

You can use a list comprehension that loops over the isMemberOf list and takes the id of each entry.

clulist = [(data['entityId'], data['displayName'], hyp['id'])
           for hyp in data['toRelationships']['isMemberOf']]


2 Comments

user1471980 Over a year ago

Would clulist be of type list? Can I convert it to a data frame like this: cluEntitydf = pd.DataFrame(clulist)?

Barmar Over a year ago

Yes, you should be able to do that.
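
A minimal sketch of that conversion follows. It assumes each tuple holds the hypervisor's `id` string, and the column names are taken from the layout requested in the question; the dict is trimmed to the keys used here.

```python
import pandas as pd

# Trimmed copy of the dict from the question (only the keys used below).
data = {
    "entityId": "clusterId123",
    "displayName": "dev_cluster",
    "toRelationships": {
        "isMemberOf": [
            {"id": "HYPERVISOR_123", "type": "HYPERVISOR"},
            {"id": "HYPERVISOR_234", "type": "HYPERVISOR"},
            {"id": "HYPERVISOR_345", "type": "HYPERVISOR"},
        ]
    },
}

# One tuple per hypervisor: (cluster id, cluster name, hypervisor id).
clulist = [
    (data["entityId"], data["displayName"], hyp["id"])
    for hyp in data["toRelationships"]["isMemberOf"]
]

# Without columns=..., pandas would label the columns 0, 1, 2.
cluEntitydf = pd.DataFrame(
    clulist, columns=["clusterId", "clusterName", "hypervisorId"]
)
print(cluEntitydf)
```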
