Convert nested JSON to CSV in Python 3.7

Question

I'm new to Python and am trying to convert my JSON file into a CSV file with a specific layout.

I have tried looking at other examples and questions but still cannot find the answer.

Current Python code

import json
import pandas as pd

df=pd.read_json("unifi.json")
print(df)
df.to_csv('results2.csv')
print("success")

JSON data

[
    {
        "by_app": [
            {
                "app": 5,
                "cat": 3,
                "clients": [
                    {
                        "mac": "ec:1f:72:fa:75:77",
                        "rx_bytes": 372,
                        "rx_packets": 3,
                        "tx_bytes": 1361,
                        "tx_packets": 4
                    },
                    {
                        "mac": "f0:9f:c2:6c:5b:d0",
                        "rx_bytes": 56896,
                        "rx_packets": 191,
                        "tx_bytes": 210622,
                        "tx_packets": 460
                    }
                ],
                "known_clients": 2,
                "rx_bytes": 60837,
                "rx_packets": 203,
                "tx_bytes": 213435,
                "tx_packets": 475
            },
            {
                "app": 94,
                "cat": 19,
                "clients": [
                    {
                        "mac": "30:07:4d:38:ae:e2",
                        "rx_bytes": 64654,
                        "rx_packets": 147,
                        "tx_bytes": 19533,
                        "tx_packets": 138
                    },
                    {
                        "mac": "ec:1f:72:fa:75:77",
                        "rx_bytes": 42416,
                        "rx_packets": 68,
                        "tx_bytes": 12419,
                        "tx_packets": 74
                    }
                ],
                "known_clients": 2,
                "rx_bytes": 5421117,
                "rx_packets": 4779,
                "tx_bytes": 243979,
                "tx_packets": 2377
            },
            {
                "app": 162,
                "cat": 20,
                "rx_bytes": 3295298,
                "rx_packets": 2935,
                "tx_bytes": 171266,
                "tx_packets": 2032
            },
            {
                "app": 209,
                "cat": 13,
                "rx_bytes": 21763,
                "rx_packets": 38,
                "tx_bytes": 4433,
                "tx_packets": 30
            },
            {
                "app": 222,
                "cat": 13,
                "clients": [
                    {
                        "mac": "30:07:4d:38:ae:e2",
                        "rx_bytes": 300,
                        "rx_packets": 3,
                        "tx_bytes": 503,
                        "tx_packets": 4
                    },
                    {
                        "mac": "ec:1f:72:fa:75:77",
                        "rx_bytes": 3452,
                        "rx_packets": 24,
                        "tx_bytes": 4176,
                        "tx_packets": 26
                    },
                    {
                        "mac": "f0:9f:c2:6c:5b:d0",
                        "rx_bytes": 0,
                        "rx_packets": 0,
                        "tx_bytes": 396,
                        "tx_packets": 6
                    }
                ],
                "known_clients": 3,
                "rx_bytes": 4742,
                "rx_packets": 32,
                "tx_bytes": 5787,
                "tx_packets": 42
            },
            {
                "app": 167,
                "cat": 20,
                "clients": [
                    {
                        "mac": "ec:1f:72:fa:75:77",
                        "rx_bytes": 5761,
                        "rx_packets": 14,
                        "tx_bytes": 1574,
                        "tx_packets": 13
                    }
                ],
                "known_clients": 1,
                "rx_bytes": 138686,
                "rx_packets": 237,
                "tx_bytes": 31821,
                "tx_packets": 155
            },
            {
                "app": 112,
                "cat": 4,
                "clients": [
                    {
                        "mac": "ec:1f:72:fa:75:77",
                        "rx_bytes": 135381,
                        "rx_packets": 161,
                        "tx_bytes": 42596,
                        "tx_packets": 140
                    }
                ],
                "known_clients": 1,
                "rx_bytes": 135381,
                "rx_packets": 161,
                "tx_bytes": 42596,
                "tx_packets": 140
            },
            {
                "app": 62,
                "cat": 8,
                "rx_bytes": 7219,
                "rx_packets": 10,
                "tx_bytes": 1153,
                "tx_packets": 9
            },
            {
                "app": 185,
                "cat": 20,
                "rx_bytes": 4733026,
                "rx_packets": 4666,
                "tx_bytes": 728026,
                "tx_packets": 2688
            },
            {
                "app": 130,
                "cat": 4,
                "clients": [
                    {
                        "mac": "30:07:4d:38:ae:e2",
                        "rx_bytes": 113871,
                        "rx_packets": 121,
                        "tx_bytes": 16442,
                        "tx_packets": 116
                    }
                ],
                "known_clients": 1,
                "rx_bytes": 113871,
                "rx_packets": 121,
                "tx_bytes": 16442,
                "tx_packets": 116
            },
            {
                "app": 65535,
                "cat": 255,
                "clients": [
                    {
                        "mac": "30:07:4d:38:ae:e2",
                        "rx_bytes": 8195,
                        "rx_packets": 27,
                        "tx_bytes": 3834,
                        "tx_packets": 25
                    },
                    {
                        "mac": "ec:1f:72:fa:75:77",
                        "rx_bytes": 27338,
                        "rx_packets": 93,
                        "tx_bytes": 11330,
                        "tx_packets": 86
                    },
                    {
                        "mac": "f0:9f:c2:6c:5b:d0",
                        "rx_bytes": 19974,
                        "rx_packets": 181,
                        "tx_bytes": 2447,
                        "tx_packets": 34
                    },
                    {
                        "mac": "f0:9f:c2:c6:63:5a",
                        "rx_bytes": 90,
                        "rx_packets": 1,
                        "tx_bytes": 0,
                        "tx_packets": 0
                    }
                ],
                "known_clients": 4,
                "rx_bytes": 6417768,
                "rx_packets": 10254,
                "tx_bytes": 1193280,
                "tx_packets": 7326
            },
            {
                "app": 190,
                "cat": 13,
                "clients": [
                    {
                        "mac": "ec:1f:72:fa:75:77",
                        "rx_bytes": 25041,
                        "rx_packets": 34,
                        "tx_bytes": 32420,
                        "tx_packets": 49
                    }
                ],
                "known_clients": 1,
                "rx_bytes": 25041,
                "rx_packets": 34,
                "tx_bytes": 32420,
                "tx_packets": 49
            },
            {
                "app": 106,
                "cat": 18,
                "clients": [
                    {
                        "mac": "f0:9f:c2:6c:5b:d0",
                        "rx_bytes": 360,
                        "rx_packets": 4,
                        "tx_bytes": 360,
                        "tx_packets": 4
                    },
                    {
                        "mac": "f0:9f:c2:c6:63:5a",
                        "rx_bytes": 90,
                        "rx_packets": 1,
                        "tx_bytes": 180,
                        "tx_packets": 2
                    }
                ],
                "known_clients": 2,
                "rx_bytes": 450,
                "rx_packets": 5,
                "tx_bytes": 540,
                "tx_packets": 6
            },
            {
                "app": 1,
                "cat": 6,
                "rx_bytes": 23370,
                "rx_packets": 35,
                "tx_bytes": 4388,
                "tx_packets": 26
            },
            {
                "app": 61,
                "cat": 9,
                "rx_bytes": 4825,
                "rx_packets": 24,
                "tx_bytes": 2040,
                "tx_packets": 24
            },
            {
                "app": 32,
                "cat": 17,
                "rx_bytes": 27068,
                "rx_packets": 42,
                "tx_bytes": 6002,
                "tx_packets": 27
            },
            {
                "app": 3,
                "cat": 24,
                "clients": [
                    {
                        "mac": "30:07:4d:38:ae:e2",
                        "rx_bytes": 3791,
                        "rx_packets": 8,
                        "tx_bytes": 1258,
                        "tx_packets": 9
                    },
                    {
                        "mac": "ec:1f:72:fa:75:77",
                        "rx_bytes": 25745,
                        "rx_packets": 109,
                        "tx_bytes": 21603,
                        "tx_packets": 104
                    }
                ],
                "known_clients": 2,
                "rx_bytes": 29536,
                "rx_packets": 117,
                "tx_bytes": 22861,
                "tx_packets": 113
            },
            {
                "app": 63,
                "cat": 18,
                "rx_bytes": 0,
                "rx_packets": 0,
                "tx_bytes": 114,
                "tx_packets": 2
            },
            {
                "app": 21,
                "cat": 3,
                "rx_bytes": 41992,
                "rx_packets": 53,
                "tx_bytes": 9788,
                "tx_packets": 34
            },
            {
                "app": 21,
                "cat": 14,
                "rx_bytes": 31920,
                "rx_packets": 114,
                "tx_bytes": 19203,
                "tx_packets": 82
            }
        ]
    }
]

Current output is

|by_app
|[{'app': 5, 'cat': 3,...

Expected output is

|app |cat |rx_bytes|rx_packets|....
|5   |3   |60837   |203       |....

I'm trying to get the values split into individual columns instead of lumped into one column.

Your JSON object has three keys at the top level. A CSV file requires two-dimensional data, but yours has multiple dimensions: first there are three root-level keys, then every key has its own list of objects. So how exactly should your CSV look with this data?
– Numan Ijaz, Commented May 7, 2019 at 8:54
I'm just trying to split everything into its own column, with matching headers, like a database table.
– User9923456, Commented May 7, 2019 at 8:58
Maybe check out the following link to flatten your JSON first, and then use the pandas DataFrame to_csv method:
– Numan Ijaz, Commented May 7, 2019 at 9:17
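
To illustrate the "multiple dimensions" point from the comments: each by_app entry may carry its own clients list, so one possible flat layout is one row per (app, client) pair. The sketch below assumes that layout is wanted, which the question does not say. The helper name client_rows is hypothetical, and the two entries are copied from the question's data.

```python
# Two entries copied from the question's data: one with clients, one without.
by_app = [
    {
        "app": 106,
        "cat": 18,
        "clients": [
            {"mac": "f0:9f:c2:6c:5b:d0", "rx_bytes": 360, "rx_packets": 4,
             "tx_bytes": 360, "tx_packets": 4},
            {"mac": "f0:9f:c2:c6:63:5a", "rx_bytes": 90, "rx_packets": 1,
             "tx_bytes": 180, "tx_packets": 2},
        ],
        "known_clients": 2,
        "rx_bytes": 450,
        "rx_packets": 5,
        "tx_bytes": 540,
        "tx_packets": 6,
    },
    {"app": 63, "cat": 18, "rx_bytes": 0, "rx_packets": 0,
     "tx_bytes": 114, "tx_packets": 2},
]


def client_rows(entries):
    """Yield one flat dict per (app, client) pair (hypothetical helper).

    Entries without a "clients" key contribute no rows.
    """
    for entry in entries:
        for client in entry.get("clients", []):
            # Copy the parent identifiers next to the per-client counters.
            yield {"app": entry["app"], "cat": entry["cat"], **client}


rows = list(client_rows(by_app))
for row in rows:
    print(row)
```

Each resulting dict has the same keys, so the list is two-dimensional and can be written out as CSV.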

AKX · Accepted Answer · 2019-05-07 10:55:36Z

0

You don't need Pandas for this...

import json
import sys
import csv

# Read into a dict (grab the first entry from the list)
with open("unifi.json", "r") as infp:
    data = json.load(infp)[0]

# The keys we want per-row
keys = [
    "app",
    "cat",
    "known_clients",
    "rx_bytes",
    "rx_packets",
    "tx_bytes",
    "tx_packets",
]

# Read into a list of dicts, substituting None for nonexistent data
rows = [
    {key: datum.get(key) for key in keys}
    for datum in data["by_app"]
]

# (If you want, you can do things with the list-of-dicts here)
# print(rows)

# Create a CSV writer for the standard output
# (or to write to a file, `csv.writer(open('file.csv', 'w', newline=''))`)
cw = csv.writer(sys.stdout)

# Write the header line
cw.writerow(keys)

# Write each line.
for row in rows:
    cw.writerow([row[key] for key in keys])

The output is

app,cat,known_clients,rx_bytes,rx_packets,tx_bytes,tx_packets
5,3,2,60837,203,213435,475
94,19,2,5421117,4779,243979,2377
162,20,,3295298,2935,171266,2032
209,13,,21763,38,4433,30
222,13,3,4742,32,5787,42
167,20,1,138686,237,31821,155
112,4,1,135381,161,42596,140
62,8,,7219,10,1153,9
185,20,,4733026,4666,728026,2688
130,4,1,113871,121,16442,116
65535,255,4,6417768,10254,1193280,7326
190,13,1,25041,34,32420,49
106,18,2,450,5,540,6
1,6,,23370,35,4388,26
61,9,,4825,24,2040,24
32,17,,27068,42,6002,27
3,24,2,29536,117,22861,113
63,18,,0,0,114,2
21,3,,41992,53,9788,34
21,14,,31920,114,19203,82

And if you want to use pandas for something else here, that rows list is easily converted to a pandas DataFrame.
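
A minimal sketch of that conversion, assuming pandas is installed. The two rows are an excerpt in the shape the answer's rows list has, with None where known_clients is missing. The Int64 cast is an optional extra, not part of the answer.

```python
import pandas as pd

keys = [
    "app",
    "cat",
    "known_clients",
    "rx_bytes",
    "rx_packets",
    "tx_bytes",
    "tx_packets",
]

# Two rows in the shape built by the answer (values from the question's data).
rows = [
    {"app": 5, "cat": 3, "known_clients": 2, "rx_bytes": 60837,
     "rx_packets": 203, "tx_bytes": 213435, "tx_packets": 475},
    {"app": 162, "cat": 20, "known_clients": None, "rx_bytes": 3295298,
     "rx_packets": 2935, "tx_bytes": 171266, "tx_packets": 2032},
]

# A list of dicts maps directly onto DataFrame rows; `columns` fixes the order.
df = pd.DataFrame(rows, columns=keys)

# Optional: missing values would otherwise make this column floating point.
# The nullable "Int64" dtype keeps the present values as integers.
df["known_clients"] = df["known_clients"].astype("Int64")

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```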

edited May 7, 2019 at 10:55

answered May 7, 2019 at 8:54

7 Comments

User9923456 Over a year ago

I'm getting an error: line 23, in <module> for datum in data["by_app"] TypeError: list indices must be integers or slices, not str

AKX Over a year ago

@User9923456 Ah, right, your data is actually a list. Amended.

User9923456 Over a year ago

Mind if I ask how to save the final output into a CSV file?

AKX Over a year ago

Well, the easiest is to just pipe the output into a file: python script.py > text.csv, but I can augment the answer to add that too.

User9923456 Over a year ago

Can you add that in? Thanks in advance!
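
The thread ends before the file-writing version was added. As a sketch of one way to do it with the standard library only: the helper name write_by_app_csv and the output name results.csv are hypothetical, and csv.DictWriter is swapped in for the answer's csv.writer so that missing keys are filled automatically. The sample entries are copied from the question's data, with the nested clients list left out for brevity.

```python
import csv

# Columns to export, in output order (the same keys the answer uses).
KEYS = [
    "app",
    "cat",
    "known_clients",
    "rx_bytes",
    "rx_packets",
    "tx_bytes",
    "tx_packets",
]

# Two entries from the question's data; the second has no "known_clients".
SAMPLE_BY_APP = [
    {"app": 5, "cat": 3, "known_clients": 2, "rx_bytes": 60837,
     "rx_packets": 203, "tx_bytes": 213435, "tx_packets": 475},
    {"app": 162, "cat": 20, "rx_bytes": 3295298, "rx_packets": 2935,
     "tx_bytes": 171266, "tx_packets": 2032},
]


def write_by_app_csv(by_app, path, keys=KEYS):
    """Write one CSV row per entry of by_app to path (hypothetical helper)."""
    # newline="" lets the csv module control line endings, which avoids
    # blank lines between rows on Windows.
    with open(path, "w", newline="") as outfp:
        # restval="" fills in keys an entry lacks; extrasaction="ignore"
        # skips keys outside `keys`, such as the nested "clients" list.
        writer = csv.DictWriter(
            outfp, fieldnames=keys, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(by_app)


if __name__ == "__main__":
    write_by_app_csv(SAMPLE_BY_APP, "results.csv")
```

With the real file, the by_app argument would be the list loaded in the answer, that is, data["by_app"].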

Numan Ijaz · Accepted Answer · 2019-05-07 09:18:39Z

0

Maybe check out the following link to flatten your JSON first, and then use the pandas DataFrame to_csv method.

Additionally, have a look at the question already asked on Stack Overflow.

answered May 7, 2019 at 9:18

1 Comment

User9923456 Over a year ago

I can't seem to json_normalize my data using the examples provided by the link. When I try to normalize I get an error of AttributeError: 'str' object has no attribute 'values'
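
The failing code is not shown, so the cause is a guess: this AttributeError typically appears when json_normalize is handed text (a file name or raw JSON string) rather than parsed dicts and lists. A minimal sketch of the parsed-data route follows. It assumes pandas 1.0 or later, where the function is available as pd.json_normalize (older releases expose it as pandas.io.json.json_normalize). The raw text is a trimmed copy of two entries from the question's data.

```python
import json

import pandas as pd

# Trimmed copy of the question's data, as raw JSON text.
raw = """
[
    {
        "by_app": [
            {
                "app": 167,
                "cat": 20,
                "clients": [
                    {
                        "mac": "ec:1f:72:fa:75:77",
                        "rx_bytes": 5761,
                        "rx_packets": 14,
                        "tx_bytes": 1574,
                        "tx_packets": 13
                    }
                ],
                "known_clients": 1,
                "rx_bytes": 138686,
                "rx_packets": 237,
                "tx_bytes": 31821,
                "tx_packets": 155
            },
            {
                "app": 162,
                "cat": 20,
                "rx_bytes": 3295298,
                "rx_packets": 2935,
                "tx_bytes": 171266,
                "tx_packets": 2032
            }
        ]
    }
]
"""

# Parse first: json_normalize works on dicts and lists, not on JSON text.
data = json.loads(raw)

# record_path="by_app" makes each element of the "by_app" list one row.
df = pd.json_normalize(data, record_path="by_app")

# The nested "clients" lists do not fit a flat table, so drop that column.
df = df.drop(columns="clients")

print(df)
```

From here, df.to_csv("results.csv", index=False) would write the table to a file; the file name is only an example.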
