
I have a nested JSON array that I need to convert to a dataframe. None of the solutions shared for similar questions worked; this JSON looks like a hard nut to crack. Please help. I want to retrieve the title and the dpoin values ("top":82,"left":33,"height":52,"width":675).

[
{
   "ID":"rtbg345h",
   "DataRow ID":"dgdfg45654",
   "Labeled Data":"https://abc.png",
   "Label":{
      "objects":[
         {
            "featureId":"rtbg345h",
            "schemaId":"rtbg345h",
            "title":"iris",
            "value":"flower",
            "color":"#00RGAA",
            "dpoin":{
               "top":82,
               "left":33,
               "height":52,
               "width":675
            },
            "instanceURI":"https://sdfdsf.ab"
         }
      ],
      "classifications":[

      ]
   },
   "Created By":"user",
   "Project Name":"myfirstproject",
   "Created At":"2018-02-02",
   "Updated At":"2018-02-02",
   "Seconds to Label":24.264,
   "External ID":"sds.jpg",
   "Agreement":-1,
   "Benchmark Agreement":-1,
   "Benchmark ID":null,
   "Dataset Name":"mine",
   "Reviews":[

   ],
   "View Label":"https:fdrtdf"
}
]

1 Answer


You can parse the string as JSON using json.loads():

import json

# the object from the question as a string (without the enclosing [ ] of the array)
dat =  '{"ID":"rtbg345h","DataRow ID":"dgdfg45654","Labeled Data":"https://abc.png","Label":{"objects":[{"featureId":"rtbg345h","schemaId":"rtbg345h","title":"iris","value":"flower","color":"#00RGAA","dpoin":{"top":82,"left":33,"height":52,"width":675},"instanceURI":"https://sdfdsf.ab"}],"classifications":[]},"Created By":"user","Project Name":"myfirstproject","Created At":"2018-02-02","Updated At":"2018-02-02","Seconds to Label":24.264,"External ID":"sds.jpg","Agreement":-1,"Benchmark Agreement":-1,"Benchmark ID":null,"Dataset Name":"mine","Reviews":[],"View Label":"https:fdrtdf"}'

# parsing string as JSON
res = json.loads(dat) 

Here's what the parsed JSON looks like:

{'ID': 'rtbg345h',
 'DataRow ID': 'dgdfg45654',
 'Labeled Data': 'https://abc.png',
 'Label': {'objects': [{'featureId': 'rtbg345h',
    'schemaId': 'rtbg345h',
    'title': 'iris',
    'value': 'flower',
    'color': '#00RGAA',
    'dpoin': {'top': 82, 'left': 33, 'height': 52, 'width': 675},
    'instanceURI': 'https://sdfdsf.ab'}],
  'classifications': []},
 'Created By': 'user',
 'Project Name': 'myfirstproject',
 'Created At': '2018-02-02',
 'Updated At': '2018-02-02',
 'Seconds to Label': 24.264,
 'External ID': 'sds.jpg',
 'Agreement': -1,
 'Benchmark Agreement': -1,
 'Benchmark ID': None,
 'Dataset Name': 'mine',
 'Reviews': [],
 'View Label': 'https:fdrtdf'}

Then access the elements by indexing into the parsed structure (note that res['Label']['objects'] is a list containing a single dictionary):

title = res['Label']['objects'][0]['title']
dpoin = res['Label']['objects'][0]['dpoin']
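The indexing above assumes exactly one entry in "objects". As a minimal sketch for the array form shown in the question, the loop below walks every record and every object and skips records whose "objects" list is empty. The helper name extract_rows and the second record (ID "hypothetical-2") are made up for illustration, and the sample is trimmed to the relevant keys.

```python
import json

# Trimmed version of the array from the question, plus a hypothetical
# second record with no objects to show that it is skipped safely.
raw = '''
[
  {"ID": "rtbg345h",
   "Label": {"objects": [{"title": "iris",
                          "dpoin": {"top": 82, "left": 33, "height": 52, "width": 675}}],
             "classifications": []}},
  {"ID": "hypothetical-2",
   "Label": {"objects": [], "classifications": []}}
]
'''


def extract_rows(records):
    """Return one flat dict per labelled object: the dpoin fields plus the title."""
    rows = []
    for record in records:
        label = record.get("Label") or {}
        for obj in label.get("objects", []):
            row = dict(obj.get("dpoin", {}))  # copy so the parsed data is not mutated
            row["title"] = obj.get("title")
            rows.append(row)
    return rows


records = json.loads(raw)  # the top level is a list of dicts
rows = extract_rows(records)
print(rows)
```

Copying dpoin with dict() leaves the parsed JSON untouched, which matters if the same records are reused later.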

If you have a list of JSON objects, you can parse each one individually, storing the parsed data in a list. Once you've parsed all the JSON objects, you can create a dataframe from the list.

Here's an example that keeps track of the data in a list of dictionaries:

import pandas as pd

items = [dat, dat]  # repeating the object you provided to make an example list

rows = []

for item in items:
    res = json.loads(item)
    obj = res['Label']['objects'][0]
    row = obj['dpoin']  # dict with top, left, height, width
    row['title'] = obj['title']
    rows.append(row)

df = pd.DataFrame.from_dict(rows)
df

Resulting in this dataframe:

   top  left  height  width title
0   82    33      52    675  iris
1   82    33      52    675  iris

This example creates a dictionary row for each JSON object, containing all the fields from dpoin as well as the title. Storing the data as a list of dictionaries lets you build the dataframe with pd.DataFrame.from_dict().
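As an alternative to building the rows by hand, pandas can flatten the nesting itself. The sketch below is not part of the approach above; it assumes pandas 1.0 or later (where pd.json_normalize is available at the top level) and uses a trimmed copy of the array from the question. The names flat and df_flat are illustrative.

```python
import json

import pandas as pd

# Trimmed copy of the array from the question (only the relevant keys).
raw = (
    '[{"ID": "rtbg345h", "Label": {"objects": [{"title": "iris", "value": "flower", '
    '"dpoin": {"top": 82, "left": 33, "height": 52, "width": 675}}], '
    '"classifications": []}}]'
)
records = json.loads(raw)

# record_path walks into Label -> objects and yields one row per object;
# nested dicts such as dpoin become dotted columns like "dpoin.top".
# meta copies the record-level ID onto each row.
flat = pd.json_normalize(records, record_path=["Label", "objects"], meta=["ID"])

# Keep only the wanted columns and drop the "dpoin." prefix.
cols = ["title", "dpoin.top", "dpoin.left", "dpoin.height", "dpoin.width", "ID"]
df_flat = flat[cols].rename(columns=lambda c: c.replace("dpoin.", ""))
print(df_flat)
```

With record_path, a record whose "objects" list is empty simply contributes no rows, so no special handling is needed for it.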
