2

I have nested JSON, but I don't understand how to work with it.

{
    "return": {
        "status_processing": "3",
        "status": "OK",
        "order": {
            "id": "872102042",
            "number": "123831",
            "date_order": "dd/mm/yyyy",
            "itens": [
                {
                    "item": {
                        "id_product": "684451795",
                        "code": "VPOR",
                        "description": "Product 1",
                        "unit": "Un",
                        "quantity": "1.00",
                        "value": "31.76"
                    }
                },
                {
                    "item": {
                        "id_product": "684451091",
                        "code": "VSAP",
                        "description": "Product 2",
                        "unit": "Un",
                        "quantity": "1.00",
                        "value": "31.76"
                    }
                }
            ]
        }
    }
}

I searched Stack Overflow questions and tried some of the solutions people suggested, but they didn't work for me.

Here is a sample that I used to access the data from the JSON:

df = pd.json_normalize(
    order_list,
    record_path=["return", "order", "itens"],
    meta=[
        ["return", "order", "id"],
        ["return", "order", "date_order"],
        ["return", "order", "number"],
    ],
)

But it doesn't work: the data is duplicated when I load it into a DataFrame.

Can anyone help me?

EDIT

Here is an example that I used:

Convert nested JSON to pandas DataFrame

And what I expected:

[image: expected output]

7
  • If you flatten it, some data may have to be repeated. What other solutions did you try? You could add links to the question (not in comments). What result do you expect? You could show it in the question; it would explain what you really need. Commented Mar 30 at 14:09
  • Maybe it would be simpler to write normal code instead of using json_normalize. Commented Mar 30 at 14:10
  • By normal code, do you mean creating a for loop? Commented Mar 30 at 14:21
  • First you should show the expected result. If you want every item in a new row, you may need to use a for-loop or expand instead of normalize. Commented Mar 30 at 14:22
  • What I want is every item in a new row. Thank you, I'll try using expand. Commented Mar 30 at 14:27

2 Answers

1

Your code is fine. You are getting the data; perhaps you just want to specify which columns to keep (or maybe rename them)?

import json
import pandas as pd



data = '''   { "return": {
        "status_processing": "3",
        "status": "OK",
        "order": {
            "id": "872102042",
            "number": "123831",
            "date_order": "dd/mm/yyyy",
             "itens": [
                {
                    "item": {
                        "id_product": "684451795",
                        "code": "VPOR",
                        "description": "Product 1",
                        "unit": "Un",
                        "quantity": "1.00",
                        "value": "31.76"
                    }
                },
                {
                    "item": {
                        "id_product": "684451091",
                        "code": "VSAP",
                        "description": "Product 2",
                        "unit": "Un",
                        "quantity": "1.00",
                        "value": "31.76"
                    }
                }
            ]
        }
    }
}'''

order_list = json.loads(data)


# Flatten each entry of "itens" into a row and repeat the order-level fields as metadata
df = pd.json_normalize(
    order_list,
    record_path=["return", "order", "itens"],
    meta=[
        ["return", "order", "id"],
        ["return", "order", "number"],
        ["return", "order", "date_order"],
    ],
)

# Keep only the columns of interest
df = df[['return.order.id', 'return.order.number', 'return.order.date_order', 'item.description']]

Output:

print(df)
  return.order.id return.order.number return.order.date_order item.description
0       872102042              123831              dd/mm/yyyy        Product 1
1       872102042              123831              dd/mm/yyyy        Product 2
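
If the long prefixed column names are the problem, they can be shortened after normalizing. Below is a minimal sketch, assuming the same structure as above (with the item payload trimmed to two fields) and assuming that dropping the "return.order." and "item." prefixes gives the names you want. The short names are my choice, not something taken from the question.

```python
import pandas as pd

# Trimmed copy of the order from the question (only two item fields kept)
order_list = {
    "return": {
        "order": {
            "id": "872102042",
            "number": "123831",
            "date_order": "dd/mm/yyyy",
            "itens": [
                {"item": {"id_product": "684451795", "description": "Product 1"}},
                {"item": {"id_product": "684451091", "description": "Product 2"}},
            ],
        }
    }
}

df = pd.json_normalize(
    order_list,
    record_path=["return", "order", "itens"],
    meta=[["return", "order", "id"], ["return", "order", "number"]],
)

# Strip the nesting prefixes so that "return.order.id" becomes "id", and so on
df = df.rename(columns=lambda c: c.replace("return.order.", "").replace("item.", ""))
print(df.columns.tolist())
```

The order-level values still appear once per item row. That repetition is what flattening to one row per item implies.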

0

I don't know exactly what output you expect, but if you want every item in a new row, you could use normal code with a for-loop.

order_list = {
    "return": {
        "status_processing": "3",
        "status": "OK",
        "order": {
            "id": "872102042",
            "number": "123831",
            "date_order": "dd/mm/yyyy",
             "itens": [
                {
                    "item": {
                        "id_product": "684451795",
                        "code": "VPOR",
                        "description": "Product 1",
                        "unit": "Un",
                        "quantity": "1.00",
                        "value": "31.76"
                    }
                },
                {
                    "item": {
                        "id_product": "684451091",
                        "code": "VSAP",
                        "description": "Product 2",
                        "unit": "Un",
                        "quantity": "1.00",
                        "value": "31.76"
                    }
                }
            ]
        }
    }
}

import pandas as pd

data = []

order = order_list['return']['order']

for iten in order['itens']:
    for key, val in iten.items():
        row = {
            #'key': key, 
            'id': order['id'], 
            'date_order': order['date_order'], 
            'number': order['number'], 
            'id_product': val['id_product'],
            #'code': val['code'],
            #'description': val['description'],
            #'quantity': val['quantity'],
            #'value': val['value'],
        }
        data.append(row)

df = pd.DataFrame(data)
print(df)

Result:

          id  date_order  number id_product
0  872102042  dd/mm/yyyy  123831  684451795
1  872102042  dd/mm/yyyy  123831  684451091

If you need other information in the rows, you should show it in the question.
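
If you want every item field rather than a hand-picked subset, one option is to unpack the whole item dict into the row. This is only a sketch. It assumes each entry in 'itens' holds exactly one 'item' key, and that 'quantity' and 'value' should become numbers (the JSON stores them as strings).

```python
import pandas as pd

# Same order as in the question
order_list = {
    "return": {
        "order": {
            "id": "872102042",
            "number": "123831",
            "date_order": "dd/mm/yyyy",
            "itens": [
                {"item": {"id_product": "684451795", "code": "VPOR",
                          "description": "Product 1", "unit": "Un",
                          "quantity": "1.00", "value": "31.76"}},
                {"item": {"id_product": "684451091", "code": "VSAP",
                          "description": "Product 2", "unit": "Un",
                          "quantity": "1.00", "value": "31.76"}},
            ],
        }
    }
}

order = order_list["return"]["order"]

rows = []
for entry in order["itens"]:
    item = entry["item"]  # assumption: every entry has exactly one "item" key
    rows.append({
        "id": order["id"],
        "date_order": order["date_order"],
        "number": order["number"],
        **item,  # copy every field of the item into the row
    })

df = pd.DataFrame(rows)

# Convert the string fields to floats so they can be used in arithmetic
df[["quantity", "value"]] = df[["quantity", "value"]].astype(float)
print(df)
```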

7 Comments

It works, but some ids have two itens, and here, for key, val in order['itens'][0].items():, it only gets the first item. How do I use slicing to get all the data? I tried using [:] to get everything, but it doesn't work.
If you have two items then (first) you should show that in the JSON in the question, and (second) you can use an external for-loop to work with every entry in 'itens' (but this depends on exactly how the JSON data looks). BTW: if you show example JSON, then show the result for exactly this JSON data; it helps to see what is moved to rows.
I updated the JSON in the description @furas
As you said, I added an external for-loop and it works. Thank you for your help @furas
I added code with an external for-loop.
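
The comments above do not show what an external for-loop looks like when there is more than one order. One possible shape is sketched below. It assumes the input is a list of responses (the name responses is hypothetical), each shaped like the JSON in the question. The second order and its values are invented purely for illustration.

```python
import pandas as pd

# Hypothetical input: a list of responses, each shaped like the JSON in the question.
# The second order is made up for illustration only.
responses = [
    {"return": {"order": {
        "id": "872102042", "number": "123831", "date_order": "dd/mm/yyyy",
        "itens": [
            {"item": {"id_product": "684451795"}},
            {"item": {"id_product": "684451091"}},
        ],
    }}},
    {"return": {"order": {
        "id": "0000000001", "number": "000001", "date_order": "dd/mm/yyyy",
        "itens": [
            {"item": {"id_product": "684451795"}},
        ],
    }}},
]

data = []

for response in responses:          # external loop: one pass per order
    order = response["return"]["order"]
    for iten in order["itens"]:     # inner loop: one pass per item of that order
        for key, val in iten.items():
            data.append({
                "id": order["id"],
                "date_order": order["date_order"],
                "number": order["number"],
                "id_product": val["id_product"],
            })

df = pd.DataFrame(data)
print(df)
```

Every item becomes its own row, so an order with two itens contributes two rows and an order with one contributes a single row.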