I have some JSON with multiple blocks in this format (showing only one here to keep it simple, so in this example the DataFrame would have only one row):
{
    "A": 1,
    "B": {
        "C": [
            {
                "D": 2,
                "E": 3
            }
        ],
        "F": {
            "G": 4,
            "H": 5
        }
    }
}
And I want to create a DataFrame like this:
   A  B.C.D  B.C.E  B.F.G  B.F.H
1  1      2      3      4      5
When I try to do
import json
import pandas as pd

with open('some.json') as file:
    data = json.load(file)
df = pd.json_normalize(data)
I get something like this:
   A                B.C  B.F.G  B.F.H
1  1  [{"D":2,"E":3}]      4      5
So I can take the B.C column and break it into B.C.D and B.C.E:
df2 = pd.DataFrame(df['B.C'].tolist())
# Column 0 holds the single dict from each B.C list; selecting it with [0]
# is the only way I got this to work with more than one block in the JSON.
df3 = df2[0].apply(pd.Series)
Then I concatenate the result with the previous DataFrame (after removing the B.C column), but it feels ugly, and since I'm doing this a LOT I was wondering if there's a cleaner/faster way.
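For reference, here is a self-contained sketch of the whole workaround described above, so it can be reproduced without my file. It assumes every B.C list holds exactly one dict. The inline sample data (two made-up blocks) and the names `inner` and `result` are only illustrative.

```python
import pandas as pd

# Two sample blocks in the same shape as the JSON above (values are made up).
data = [
    {"A": 1, "B": {"C": [{"D": 2, "E": 3}], "F": {"G": 4, "H": 5}}},
    {"A": 6, "B": {"C": [{"D": 7, "E": 8}], "F": {"G": 9, "H": 10}}},
]

# Flattens the nested dicts, but leaves the B.C list as a single column.
df = pd.json_normalize(data)

# Take the first (assumed only) dict from each B.C list, expand it into
# columns, and prefix the column names to match the desired output.
inner = pd.json_normalize([cell[0] for cell in df["B.C"]]).add_prefix("B.C.")

# Drop the list column, attach the expanded columns row by row,
# and put the columns in the desired order.
result = pd.concat([df.drop(columns="B.C"), inner], axis=1)
result = result[["A", "B.C.D", "B.C.E", "B.F.G", "B.F.H"]]

print(result)
```

This gives the layout I want, but it still takes three separate steps per file, which is the part I would like to avoid.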
Well, thanks in advance!