0

I am dealing with a JSON file which has nested fields (arrays). I am trying to convert the same into a Pandas dataframe.

{
    "_id": "2026",
    "dataDate": 1537920000000,
    "dataYear": 2018,
    "groupId": "1378",
    "HourConsumed": 19781.4,
    "HourGenerated": 0,
    "max": 4658.400000000001,
    "maxGen": 0,
    "maxTime": 1538001000000,
    "avg": -206.05625,
    "max": 0,
    "maxGen": 0,
    "maxTime": null,
    "avgTemp": 0,
    "me_Id": "2004506_3166155129",
    "interval": 15,
    "intervalMetaData": [
        "whC",
        "whG",
        "max",
        "maxGen",
        "hC",
        "hG",
        "maxVar",
        "maxGen",
        "avgTemp",
        "eventTime"
    ],
    "intervalData": [
        [
            175.2,
            0,
            700.8,
            0,
            0,
            0,
            0,
            0,
            0,
            1537920900000
        ],
        [
            192,
            0,
            768,
            0,
            0,
            0,
            0,
            0,
            0,
            1537921800000
        ],
        [
            191.39999999999998,
            0,
            765.5999999999999,
            0,
            0,
            0,
            0,
            0,
            0,
            1537922700000
        ]
    ]
}

I need to create separate columns of what's inside intervalMetaData, and then filling those columns with values from intervalData. Is it possible?

2 Answers 2

1

If I understand correctly, you just have to set your columns right by importing your list of lists with pandas:

import pandas as pd

data = {
    "_id": "2026",
    "dataDate": 1537920000000,
    "dataYear": 2018,
    "groupId": "1378",
    "HourConsumed": 19781.4,
    "HourGenerated": 0,
    "max": 4658.400000000001,
    "maxGen": 0,
    "maxTime": 1538001000000,
    "avg": -206.05625,
    "max": 0,
    "maxGen": 0,
    "maxTime": None,
    "avgTemp": 0,
    "me_Id": "2004506_3166155129",
    "interval": 15,
    "intervalMetaData": [
        "whC",
        "whG",
        "max",
        "maxGen",
        "hC",
        "hG",
        "maxVar",
        "maxGen",
        "avgTemp",
        "eventTime"
    ],
    "intervalData": [
        [
            175.2,
            0,
            700.8,
            0,
            0,
            0,
            0,
            0,
            0,
            1537920900000
        ],
        [
            192,
            0,
            768,
            0,
            0,
            0,
            0,
            0,
            0,
            1537921800000
        ],
        [
            191.39999999999998,
            0,
            765.5999999999999,
            0,
            0,
            0,
            0,
            0,
            0,
            1537922700000
        ]
    ]
}


df = pd.DataFrame(data["intervalData"], columns=data["intervalMetaData"])
print(df)

Output:

     whC  whG    max  maxGen  hC  hG  maxVar  maxGen  avgTemp      eventTime
0  175.2    0  700.8       0   0   0       0       0        0  1537920900000
1  192.0    0  768.0       0   0   0       0       0        0  1537921800000
2  191.4    0  765.6       0   0   0       0       0        0  1537922700000

Edit: you can add the other keys as columns with a loop:

for k,v in data.items():
    if k not in ["intervalData", "intervalMetaData"]:
        df[k] = v
Sign up to request clarification or add additional context in comments.

2 Comments

That's great. Would it be possible to get the other fields as well in the same dataframe? It might be repeated I understand but that's not a problem
sure! I added a loop to my answer
1

You bet it's possible! It's as simple as this:

df = pd.DataFrame(j['intervalData'], columns=j['intervalMetaData'])

1 Comment

@shivanshu does this not solve your problem?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.