I'm trying to call pd.json_normalize on a DataFrame, but the result is an empty DataFrame.

initial dataframe (https://i.sstatic.net/jGphv.png)

After applying json_normalize,

df1 = pd.json_normalize(df)
print(df1)

it became an empty dataframe. (https://i.sstatic.net/733Dx.png)

When I define the data manually as below, I get my expected output:

data = [
    {'birthday': '542217600000', 'first_name': 'Char', 'gender': 'Male', 'last_name': 'Mander', 'nick_name': ''},
    {'birthday': '967046400000', 'first_name': 'ABC', 'gender': 'Male', 'last_name': 'ZXY', 'nick_name': ''},
    {'birthday': '739900800000', 'first_name': 'Test', 'gender': 'Male', 'last_name': 'tickles', 'nick_name': ''}
]

birthday = pd.json_normalize(data, max_level=1)
print(birthday)

(https://i.sstatic.net/IB2th.png)

How can I normalize directly from a DataFrame?

2 Answers

You should pass a Series to json_normalize, not a DataFrame:

# your initial DataFrame
df = pd.DataFrame({'properties': data})

# passing the relevant column/Series
birthday = pd.json_normalize(df['properties'], max_level=1)

Output:

       birthday first_name gender last_name nick_name
0  542217600000       Char   Male    Mander          
1  967046400000        ABC   Male       ZXY          
2  739900800000       Test   Male   tickles          

5 Comments

Thank you, but I forgot to mention that the initial DataFrame I'm using comes from a CSV file. I used this code to load it:
with open(input1, 'r') as data1: df1 = pd.read_csv(data1, encoding='utf-8', dtype='object', sep=',')
Running json_normalize then gave me this error:
line 461, in _json_normalize
if any([isinstance(x, dict) for x in y.values()] for y in data):
File "C:\Users\xxx\pandas\io\json_normalize.py", line 461, in <genexpr>
if any([isinstance(x, dict) for x in y.values()] for y in data):
AttributeError: 'str' object has no attribute 'values'
Can you provide the content of the CSV file for debugging?
cardinality|create_time|properties
382|2022-04-05 16:32:43.000+0000|{'birthday': '542217600000', 'first_name': 'Char', 'gender': 'Male', 'last_name': 'Mander', 'nick_name': ''}
400|2022-02-07 14:20:59.000+0000|{'birthday': '967046400000', 'first_name': 'ABC', 'gender': 'Male', 'last_name': 'ZXY', 'nick_name': ''}
132|2021-08-09 14:01:30.000+0000|{'birthday': '739900800000', 'first_name': 'Test', 'gender': 'Male', 'last_name': 'Tickles', 'nick_name': ''}
Complete steps I'm doing:
1. with open(input1, 'r') as data1: df1 = pd.read_csv(data1, encoding='utf-8', dtype='object', sep=',')
2. df1 = df1['properties']
3. pd.json_normalize(df1)
Can you add these details directly in the question?
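
For the CSV case described in the comments, the AttributeError suggests that each cell in properties is a str that only looks like a dict, so it has to be parsed before json_normalize can use it. Below is a minimal sketch, not a definitive fix. It assumes the file is pipe-separated, as in the content pasted above, and that the cells are Python-literal dicts (they are single-quoted, so json.loads would likely reject them). The csv_text variable is a hypothetical in-memory stand-in for the real file.

```python
import ast
import io

import pandas as pd

# Hypothetical stand-in for the CSV file pasted in the comments.
csv_text = """cardinality|create_time|properties
382|2022-04-05 16:32:43.000+0000|{'birthday': '542217600000', 'first_name': 'Char', 'gender': 'Male', 'last_name': 'Mander', 'nick_name': ''}
400|2022-02-07 14:20:59.000+0000|{'birthday': '967046400000', 'first_name': 'ABC', 'gender': 'Male', 'last_name': 'ZXY', 'nick_name': ''}
132|2021-08-09 14:01:30.000+0000|{'birthday': '739900800000', 'first_name': 'Test', 'gender': 'Male', 'last_name': 'Tickles', 'nick_name': ''}
"""

# Assumption: '|' is the real separator, as the pasted content suggests.
df1 = pd.read_csv(io.StringIO(csv_text), sep="|", dtype="object")

# Each cell is a str; literal_eval safely turns it into a real dict.
parsed = df1["properties"].apply(ast.literal_eval)

# json_normalize accepts a list of dicts.
props = pd.json_normalize(parsed.tolist())

# Optionally re-attach the flattened columns to the other CSV columns.
result = df1.drop(columns="properties").join(props)
print(result)
```

ast.literal_eval is used here instead of eval because it only evaluates literals, which is safer for data read from a file.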

Note that a pd.DataFrame is not itself a JSON-like object (a dict or a list of dicts), and calling dict() on it does not produce one either, so you need to convert it first, e.g.:

# build one dict per row, then normalize the list of dicts
regular_json = [row.to_dict() for _, row in df.iterrows()]
pd.json_normalize(regular_json)
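
To illustrate what this workaround yields, here is a small sketch with a hypothetical two-row frame shortened from the question's data. It uses df.to_dict(orient="records") as an alternative way to build the list of row dicts; that substitution is my assumption, not part of the answer above. Nested keys end up as dotted column names.

```python
import pandas as pd

# Hypothetical sample data, shortened from the question.
data = [
    {"birthday": "542217600000", "first_name": "Char"},
    {"birthday": "967046400000", "first_name": "ABC"},
]
df = pd.DataFrame({"properties": data})

# One dict per row; same result as looping over df.iterrows().
records = df.to_dict(orient="records")

# Nested dicts are flattened into "parent.child" columns.
flat = pd.json_normalize(records)
print(flat.columns.tolist())
```

If only the inner keys are wanted as column names, passing the column itself (df['properties']) avoids the "properties." prefix.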
