I have a DataFrame (df1) containing two columns:
id information
00100 {'DriversList': {'ProblematicDrivers': [], 'In...
00200 {'DriversList': {'ProblematicDrivers': [], 'In...
The information column contains a nested JSON object, which needs to be converted into a DataFrame, with each resulting row associated with its id.
JSON in the df1['information'] column --
{
    'DriversList': {
        'ProblematicDrivers': [
        ],
        'InstalledDrivers': [
            {
                'DriverName': 'FaxMachine',
                'DisplayName': 'Fax',
                'Version': '10',
                'Date': '06-21-2006'
            },
            {
                'DriverName': 'FaxMachine',
                'DisplayName': 'Fax',
                'Version': '10',
                'Date': '06-21-2006'
            }
        ]
    }
}
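To make this reproducible, a frame with the same shape can be built as below. This is only a minimal sketch: it assumes each information value is already a Python dict (not a JSON string) and that id is stored as a string so the leading zeros survive. The variable name info is just for illustration, and both ids reuse the sample values shown above.

```python
import pandas as pd

# Assumption: each cell of 'information' is a Python dict, not a JSON string.
info = {
    "DriversList": {
        "ProblematicDrivers": [],
        "InstalledDrivers": [
            {"DriverName": "FaxMachine", "DisplayName": "Fax",
             "Version": "10", "Date": "06-21-2006"},
            {"DriverName": "FaxMachine", "DisplayName": "Fax",
             "Version": "10", "Date": "06-21-2006"},
        ],
    }
}

# ids are kept as strings so the leading zeros are preserved
df1 = pd.DataFrame({"id": ["00100", "00200"], "information": [info, info]})
print(df1)
```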
My code so far:
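import pandas as pd
from pandas import json_normalize

df2 = pd.DataFrame()
data = json_normalize(data=df1['information'])
# Stack each row's InstalledDrivers records; the id is not carried along
for x in data['DriversList.InstalledDrivers']:
    df2 = pd.concat([df2, pd.DataFrame(x)])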
Every record in the information column must stay associated with the id from the original DataFrame (df1).
For example -- the information column in the first row contains 2 records under InstalledDrivers, so the final output should have 2 rows with id 00100.
Expected output --
id     Date        DriverName  DisplayName  Version
00100  06-21-2006  FaxMachine  Fax          10
00100  06-21-2006  FaxMachine  Fax          10
00200  06-21-2006  FaxMachine  Fax          10
00200  06-21-2006  FaxMachine  Fax          10
I'm looking for an approach that can be handled at the DataFrame level only. I've tried json_normalize, but I was unable to load this JSON into a DataFrame, and I also can't associate the id with the converted DataFrame. Is it possible to do this with json_normalize, or is there a more optimized solution?
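For illustration, the mapping described above could be sketched with json_normalize's record_path and meta arguments. This is a sketch under stated assumptions rather than a definitive solution: it assumes pandas 1.0 or later (where pd.json_normalize is available at the top level), that each information value is a Python dict, and that the sample frame rebuilt here matches the real data. The names driver, info, records and result are hypothetical.

```python
import pandas as pd

# Rebuild the sample frame (assumption: 'information' holds Python dicts).
driver = {"DriverName": "FaxMachine", "DisplayName": "Fax",
          "Version": "10", "Date": "06-21-2006"}
info = {"DriversList": {"ProblematicDrivers": [],
                        "InstalledDrivers": [driver, dict(driver)]}}
df1 = pd.DataFrame({"id": ["00100", "00200"], "information": [info, info]})

# Put each id inside its own record so json_normalize can carry it as metadata.
records = [{"id": i, **d} for i, d in zip(df1["id"], df1["information"])]

result = pd.json_normalize(
    records,
    record_path=["DriversList", "InstalledDrivers"],  # one row per installed driver
    meta=["id"],                                      # repeat the id on each of those rows
)

# Reorder the columns to match the expected output.
result = result[["id", "Date", "DriverName", "DisplayName", "Version"]]
print(result)
```

With this approach, a row whose InstalledDrivers list is empty contributes no rows to the result, which may or may not be what is wanted.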