Create a data frame from a complex nested dictionary?

Question

I have a large, deeply nested JSON file saved with a .txt extension. I need to access some specific key-value pairs and create a data frame or another transformed JSON object for further use. Here is a small sample with two records.

[
  {
    "ko_id": [819752],
    "concepts": [
      {
        "id": ["11A71731B880:http://ontology.intranet.com/Taxonomy/116@en"],
        "uri": ["http://ontology.intranet.com/Taxonomy/116"],
        "language": ["en"],
        "prefLabel": ["Client coverage & relationship management"]
      }
    ]
  },
  {
    "ko_id": [819753],
    "concepts": [
      {
        "id": ["11A71731B880:http://ontology.intranet.com/Taxonomy/116@en"],
        "uri": ["http://ontology.intranet.com/Taxonomy/116"],
        "language": ["en"],
        "prefLabel": ["Client coverage & relationship management"]
      }
    ]
  }
]

The following code loads the data as a list, but I probably need to access it as a dictionary. I need the "ko_id", "uri" and "prefLabel" values from each record, put into a pandas data frame or a dictionary for further analysis.

import json as js

with open('sample_data.txt') as data_file:
    json_sample = js.load(data_file)

The following code gives me the exact values for the first element, but I do not know how to put it together and build the final algorithm to create the data frame.

print(sample_dict["ko_id"][0])
print(sample_dict["concepts"][0]["prefLabel"][0])
print(sample_dict["concepts"][0]["uri"][0])

user8834780 · Accepted Answer · 2017-11-27 22:01:17Z

2

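import pandas as pd
frames = []
for record in sample_dict:
    df = pd.DataFrame(record['concepts'])  # one row per concept
    df['ko_id'] = record['ko_id'][0]       # broadcast the scalar id to every row
    frames.append(df)
# final_df was never initialized, and DataFrame.append was removed in pandas 2.0
final_df = pd.concat(frames, ignore_index=True)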
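
Building a frame directly from record['concepts'] copies the JSON values as they are, so every concept field stays wrapped in a one-element list. The following is a minimal sketch, assuming the sample structure from the question, of one way to unwrap those cells and keep only the three requested columns. The inline `sample` list is a shortened stand-in for the loaded file, and the per-record frames are combined with pd.concat because DataFrame.append is not available in pandas 2.0 and later.

```python
import pandas as pd

# Shortened stand-in for the loaded JSON (assumption: same shape as the question's sample)
sample = [
    {"ko_id": [819752],
     "concepts": [{"uri": ["http://ontology.intranet.com/Taxonomy/116"],
                   "prefLabel": ["Client coverage & relationship management"]}]},
    {"ko_id": [819753],
     "concepts": [{"uri": ["http://ontology.intranet.com/Taxonomy/116"],
                   "prefLabel": ["Client coverage & relationship management"]}]},
]

# One frame per record (one row per concept), tagged with the record's ko_id
final_df = pd.concat(
    [pd.DataFrame(r["concepts"]).assign(ko_id=r["ko_id"][0]) for r in sample],
    ignore_index=True,
)

# Each concept field is a one-element list; .str[0] takes the first item
for col in ("uri", "prefLabel"):
    final_df[col] = final_df[col].str[0]

# Keep only the requested columns, in a fixed order
final_df = final_df[["ko_id", "prefLabel", "uri"]]
```

Unlike indexing with [0] on a plain list, .str[0] gives a missing value rather than raising when a list is empty, which may help with incomplete records.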


FJSevilla · Accepted Answer · 2017-11-27 22:33:35Z

2

You can pass the data to pandas.DataFrame using a generator:

import pandas as pd
import json as js

with open('sample_data.txt') as data_file:    
   json_sample = js.load(data_file)

df = pd.DataFrame(data = ((key["ko_id"][0],
                           key["concepts"][0]["prefLabel"][0],
                           key["concepts"][0]["uri"][0]) for key in json_sample),  
                  columns = ("ko_id", "prefLabel", "uri"))

Output:

>>> df

    ko_id                                  prefLabel                                        uri
0  819752  Client coverage & relationship management  http://ontology.intranet.com/Taxonomy/116   
1  819753  Client coverage & relationship management  http://ontology.intranet.com/Taxonomy/116


5 Comments

DataPsycho Over a year ago

@FJ maybe there is some problem with "uri". When I run the following code on the main data, it gives me an error: for key in data_dict: print(key["concepts"][0]["uri"][0]). It shows "list index out of range", so there is probably a missing or empty field in the main data.

FJSevilla Over a year ago

@DataPsycho data_dict is the json (json_sample in my code)? What is the exception?

DataPsycho Over a year ago

@FJ oh sorry, data_dict is the full version of json_sample. It has the same structure but with the full data.

DataPsycho Over a year ago

I load the big file and then run your code, and it gives me the following error. with open('output_json_20171031.json') as data_file: data_dict = js.load(data_file)

FJSevilla Over a year ago

@DataPsycho contents or uri are empty in some fields, you can try list slicing:

df = pd.DataFrame(data = ((*key["ko_id"][0:1],
                           *key["concepts"][0]["prefLabel"][0:1],
                           *key["concepts"][0]["uri"][0:1]) for key in json_sample),
                  columns = ("ko_id", "prefLabel", "uri"))

Would it be possible for you to share the entire file using Google Drive, DropBox, etc? It would be much easier to give you a possible solution.
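
The empty-field problem discussed in these comments can also be handled by substituting an explicit placeholder for each missing value, so every row keeps all three fields in the same position. With slicing, a record whose prefLabel is empty yields a shorter tuple, and the remaining values may shift into the wrong column. The sketch below is one possible approach, not the answerer's method: `first` and `extract_rows` are hypothetical helper names, the sample records are invented to mimic the gaps described, and, like the answer, it only reads the first concept of each record.

```python
def first(values, default=None):
    """Return the first item of a list, or `default` if it is missing or empty."""
    return values[0] if values else default


def extract_rows(records):
    """Yield one dict per record with ko_id, prefLabel and uri (None when absent)."""
    for record in records:
        # `concepts` may be missing or an empty list in the full data
        concept = first(record.get("concepts"), default={})
        yield {
            "ko_id": first(record.get("ko_id")),
            "prefLabel": first(concept.get("prefLabel")),
            "uri": first(concept.get("uri")),
        }


# Hypothetical records mimicking the gaps described in the comments
records = [
    {"ko_id": [819752],
     "concepts": [{"uri": ["http://ontology.intranet.com/Taxonomy/116"],
                   "prefLabel": ["Client coverage & relationship management"]}]},
    {"ko_id": [819753], "concepts": [{"uri": [], "prefLabel": ["No uri"]}]},
    {"ko_id": [819754], "concepts": []},
]

rows = list(extract_rows(records))
```

The resulting list of dictionaries can be used directly, or passed to pd.DataFrame(rows, columns=["ko_id", "prefLabel", "uri"]) to build the data frame with missing values left empty.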
