3

I'm new to Python and am trying to extract some nested data.

Here is the JSON for two products. A product can belong to zero or more categories:

 {  
   "Item":[  
      {   
         "ID":"170",
         "InventoryID":"170",
         "Categories":[  
            {  
               "Category":[  
                  {  
                     "CategoryID":"444",
                     "Priority":"0",
                     "CategoryName":"Paper Mache"
                  },
                  {  
                     "CategoryID":"479",
                     "Priority":"0",
                     "CategoryName":"Paper Mache"
                  },
                  {  
                     "CategoryID":"515",
                     "Priority":"0",
                     "CategoryName":"Paper Mache"
                  }
               ]
            }
         ],
         "Description":"Approximately 9cm wide x 4cm deep.",
         "SKU":"111931"
      },
      {  
         "ID":"174",
         "InventoryID":"174",
         "Categories":[
            {
               "Category":{
                  "CategoryID":"888",
                  "Priority":"0",
                  "CategoryName":"Plaster"
               }
            }
         ],
         "Description":"Plaster Mould - Australian Animals",
         "SKU":"110546"
      }
   ],
   "CurrentTime":"2016-08-22 11:52:27",
   "Ack":"Success"
}

I want to work out which Categories a product belongs to.

My extraction code is as follows:

        for x in products: 
            productsInCategory = []
            for y in x['Categories']:
                for z in y['Category']:
                    if z['CategoryID'] == categories[i]['CategoryID']:
                        productsInCategory.append(x)

The issue is that the second item contains only a single Category object, not an array of categories, so this line

    for z in y['Category']:

loops through the properties of a Category and not a Category array and hence causes my code to fail.

How can I protect against this? And can this be written more elegantly with list comprehension syntax?

2 Answers

4

That's a very poor document structure in that case; you shouldn't have to deal with this. If an item can contain multiple values, it should always be a list.

Be that as it may, you can still deal with it in your code by checking if it is a list or not.

    for x in products:
        productsInCategory = []
        for y in x['Categories']:
            category = y['Category']
            if isinstance(category, dict):
                category = [category]
            for z in category:
                ...

(You might want to consider using more descriptive variable names generally; x, y and z are not very helpful for people reading the code.)
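As for the list comprehension part of the question, here is a minimal sketch of how the same normalisation could be folded into a couple of small helpers. The names `as_list`, `category_ids`, `products_in_category` and `sample_products` are made up for illustration, and I'm assuming that a product with no categories either lacks the `Categories` key or has an empty list there; adjust if your feed does something else.

```python
def as_list(value):
    """Return value unchanged if it is a list, [] if it is None, else [value]."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def category_ids(product):
    """Collect the CategoryIDs of one product, whether 'Category' is a dict or a list."""
    return {
        category["CategoryID"]
        for wrapper in as_list(product.get("Categories"))
        if isinstance(wrapper, dict)  # skip anything that is not a {"Category": ...} object
        for category in as_list(wrapper.get("Category"))
    }


def products_in_category(products, category_id):
    """Return the products that belong to the given category ID."""
    return [p for p in products if category_id in category_ids(p)]


# Trimmed-down sample data mirroring the two shapes in the question,
# plus a hypothetical product with no categories at all.
sample_products = [
    {"ID": "170", "Categories": [{"Category": [{"CategoryID": "444"}, {"CategoryID": "479"}]}]},
    {"ID": "174", "Categories": [{"Category": {"CategoryID": "888"}}]},
    {"ID": "200"},
]

matching = products_in_category(sample_products, "888")
```

Keeping the "wrap a lone dict in a list" step in one helper means the comprehension itself never has to care which shape the feed returned.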

1

I've run into this issue frequently before in JSON structures...frequently enough that I wrote a small library for it a few weeks ago...

nested key retriever (nkr)

Try the generator and see if it solves your problem. You should be able to simply do:

    for x in products:
        if product_id_searching_for in list(nkr.find_nested_key_values(x, 'CategoryID')):
            productsInCategory.append(x)
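If you'd rather not pull in a dependency, the general idea can be sketched with a short recursive generator. To be clear, this is not the library's actual code, just an illustration of the approach; `find_key_values` is a hypothetical name.

```python
def find_key_values(node, key):
    """Yield every value stored under `key` anywhere inside nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the value itself may contain further matches.
            yield from find_key_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key_values(item, key)
    # Strings, numbers and None are leaves, so there is nothing to yield for them.


# Works the same whether "Category" holds a single object or a list of them.
single = {"ID": "174", "Categories": [{"Category": {"CategoryID": "888"}}]}
several = {"ID": "170", "Categories": [{"Category": [{"CategoryID": "444"}, {"CategoryID": "479"}]}]}

single_ids = list(find_key_values(single, "CategoryID"))
several_ids = list(find_key_values(several, "CategoryID"))
```

Because it walks the whole structure, it will also pick up a `CategoryID` key that appears anywhere else in the product, which may or may not be what you want.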

4 Comments

An example would be nice. Until then, it's more of a commercial.
Commercials for free products, while fascinating, are probably a waste of time/effort. I've had this issue before and solved it for myself. If that code is useful to the OP, then I'm glad to have helped.
But on that note, I've added a rough example usage for the OP.
That's fine. Thanks a lot. This adds value to the answer.
