Python pandas: json_normalize on a DataFrame

Question

I'm trying to run pd.json_normalize on a dataframe, but the result is an empty dataframe.

initial dataframe (https://i.sstatic.net/jGphv.png)

After applying json_normalize,

df1 = pd.json_normalize(df)
print(df1)

it became an empty dataframe. (https://i.sstatic.net/733Dx.png)

When I defined the data manually as below, I got my expected output:

data = [
    {'birthday': '542217600000', 'first_name': 'Char', 'gender': 'Male', 'last_name': 'Mander', 'nick_name': ''},
    {'birthday': '967046400000', 'first_name': 'ABC', 'gender': 'Male', 'last_name': 'ZXY', 'nick_name': ''},
    {'birthday': '739900800000', 'first_name': 'Test', 'gender': 'Male', 'last_name': 'tickles', 'nick_name': ''}
]

birthday = pd.json_normalize(data, max_level=1)
print(birthday)

(https://i.sstatic.net/IB2th.png)

How can I normalize directly from a dataframe?

mozway · Accepted Answer · 2023-02-07 14:10:29Z

1

You should pass a Series to json_normalize, not a DataFrame:

# your initial DataFrame
df = pd.DataFrame({'properties': data})

# passing the relevant column/Series
birthday = pd.json_normalize(df['properties'], max_level=1)

Output:

       birthday first_name gender last_name nick_name
0  542217600000       Char   Male    Mander          
1  967046400000        ABC   Male       ZXY          
2  739900800000       Test   Male   tickles
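If the other columns of the frame are needed alongside the flattened fields, the normalized result can be joined back. The following is a minimal sketch under assumptions, not part of the original answer: the cardinality column is borrowed from the CSV sample in the comments, the records are shortened, and the names flat and result are illustrative. The column is converted with tolist() and the index is copied explicitly so that the rows line up regardless of how a given pandas version treats the Series index.

```python
import pandas as pd

# Shortened sample: a plain column plus a column of dicts (assumed layout).
df = pd.DataFrame({
    'cardinality': ['382', '400'],
    'properties': [
        {'birthday': '542217600000', 'first_name': 'Char'},
        {'birthday': '967046400000', 'first_name': 'ABC'},
    ],
})

# Flatten only the dict column; a list of dicts is the safest input type.
flat = pd.json_normalize(df['properties'].tolist())

# Align the flattened rows with the original index before joining.
flat.index = df.index

# Replace the dict column with its flattened fields.
result = df.drop(columns='properties').join(flat)
print(result)
```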

5 Comments

chichi Over a year ago

Thank you, but I forgot to mention that the initial dataframe I'm using is loaded from a CSV file. I used this code to load it:

with open(input1, 'r') as data1:
    df1 = pd.read_csv(data1, encoding='utf-8', dtype='object', sep=',')

Running json_normalize then gave me this error:

line 461, in _json_normalize
    if any([isinstance(x, dict) for x in y.values()] for y in data):
File "C:\Users\xxx\pandas\io\json_normalize.py", line 461, in <genexpr>
    if any([isinstance(x, dict) for x in y.values()] for y in data):
AttributeError: 'str' object has no attribute 'values'

mozway Over a year ago

Can you provide the content of the CSV file for debugging?

chichi Over a year ago

cardinality|create_time|properties
382|2022-04-05 16:32:43.000+0000|{'birthday': '542217600000', 'first_name': 'Char', 'gender': 'Male', 'last_name': 'Mander', 'nick_name': ''}
400|2022-02-07 14:20:59.000+0000|{'birthday': '967046400000', 'first_name': 'ABC', 'gender': 'Male', 'last_name': 'ZXY', 'nick_name': ''}
132|2021-08-09 14:01:30.000+0000|{'birthday': '739900800000', 'first_name': 'Test', 'gender': 'Male', 'last_name': 'Tickles', 'nick_name': ''}

chichi Over a year ago

Complete steps I'm doing:
1. with open(input1, 'r') as data1: df1 = pd.read_csv(data1, encoding='utf-8', dtype='object', sep=',')
2. df1 = df1['properties']
3. pd.json_normalize(df1)

mozway Over a year ago

Can you add these details directly in the question?
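The AttributeError quoted in the comments ('str' object has no attribute 'values') suggests that, after loading from CSV, the properties column holds strings rather than dicts. One possible way to handle that, sketched here under assumptions, is to parse each string before normalizing. The sample rows are copied from the comment above; reading them with sep='|' is an assumption based on how that sample is delimited, and csv_text, parsed, and flat are illustrative names. ast.literal_eval is used because the strings are written with single quotes, which json.loads would reject.

```python
import ast
import io

import pandas as pd

# Sample rows copied from the comment thread; pipe-delimited (assumed).
csv_text = """cardinality|create_time|properties
382|2022-04-05 16:32:43.000+0000|{'birthday': '542217600000', 'first_name': 'Char', 'gender': 'Male', 'last_name': 'Mander', 'nick_name': ''}
400|2022-02-07 14:20:59.000+0000|{'birthday': '967046400000', 'first_name': 'ABC', 'gender': 'Male', 'last_name': 'ZXY', 'nick_name': ''}
132|2021-08-09 14:01:30.000+0000|{'birthday': '739900800000', 'first_name': 'Test', 'gender': 'Male', 'last_name': 'Tickles', 'nick_name': ''}
"""

# io.StringIO stands in for the real file so the sketch is self-contained.
df1 = pd.read_csv(io.StringIO(csv_text), sep='|', dtype='object')

# Each cell is a str at this point; turn it into a real dict.
parsed = df1['properties'].apply(ast.literal_eval)

# Normalize the list of dicts into one column per key.
flat = pd.json_normalize(parsed.tolist())
print(flat)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for text read from a file.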

P.Jo · Accepted Answer · 2023-02-07 14:40:39Z

0

Note that calling dict() on a pd.DataFrame does not produce a valid JSON object, so you need a workaround, e.g.:

regular_json = [row.to_dict() for _, row in df.iterrows()]
pd.json_normalize(regular_json)
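As a hedged alternative to the row-by-row conversion above, DataFrame.to_dict('records') builds the same kind of list of row dicts in a single call. The sketch below assumes the properties column already contains dicts (shortened here) and uses the illustrative names records and flat. Because each row dict nests the properties dict, the flattened columns carry a "properties." prefix.

```python
import pandas as pd

# Assumed input: one column whose cells are already dicts (shortened sample).
df = pd.DataFrame({
    'properties': [
        {'birthday': '542217600000', 'first_name': 'Char'},
        {'birthday': '967046400000', 'first_name': 'ABC'},
    ],
})

# One dict per row, e.g. {'properties': {'birthday': ..., 'first_name': ...}}
records = df.to_dict('records')

# Nested keys are flattened and joined with '.' by default.
flat = pd.json_normalize(records)
print(flat.columns.tolist())
```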
