Best way to transform JSON data within a Pandas dataframe into a dataframe itself

Question

I have a Pandas dataframe where one column contains a non-nested JSON object, stored as a string, in each row.

                             js
0  {"k1":"1","k2":"A","k3":"X"}
1  {"k1":"2","k2":"B","k3":"X"}
2  {"k1":"3","k2":"A","k3":"Y"}
3  {"k1":"4","k2":"D","k4":"M"}

Created like this:

import pandas as pd
L0 = ['{"k1":"1","k2":"A","k3":"X"}',
      '{"k1":"2","k2":"B","k3":"X"}',
      '{"k1":"3","k2":"A","k3":"Y"}',
      '{"k1":"4","k2":"D","k4":"M"}']
df = pd.DataFrame({'js':L0})

I want to turn the JSON objects into their own dataframe:

  k1 k2   k3   k4
0  1  A    X  NaN
1  2  B    X  NaN
2  3  A    Y  NaN
3  4  D  NaN    M

Right now the only way I know is by using the json module and df.iterrows():

import json
all_json = []
for _,row in df.iterrows():
    all_json.append(json.loads(row["js"]))
df2 = pd.DataFrame.from_dict(all_json)

Is there a better way to do this, ideally without iterating?

EDIT 1:

Thanks for the answers.

I timed the three suggested approaches that use ast.literal_eval on my real-world data, where my own approach takes 158 ms ± 4.01 ms:

df = df.apply(lambda x: ast.literal_eval(x[0]), 1).apply(pd.Series) takes 640 ms ± 7.8 ms

df['js'].apply(ast.literal_eval).apply(pd.Series) takes 636 ms ± 19 ms

pd.DataFrame(df.js.apply(ast.literal_eval).tolist()) takes 180 ms ± 5.11 ms

As suggested, the third approach is the fastest of the three, but sadly all of them are slower than the iterrows approach, and my whole intention was to get rid of iterrows to make it faster.

EDIT 2: pd.DataFrame(df["js"].apply(json.loads).tolist()) takes 25.2 ms ± 512 µs so we have a winner I guess.
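For anyone who wants to repeat the comparison, here is a minimal timing sketch. It assumes synthetic data (the four sample rows repeated 250 times) rather than the real-world data measured above, and the function names are made up for illustration, so the absolute numbers will differ from the ones quoted in the edits.

```python
import ast
import json
import timeit

import pandas as pd

# Synthetic data (an assumption): the four sample rows repeated 250 times.
rows = ['{"k1":"1","k2":"A","k3":"X"}',
        '{"k1":"2","k2":"B","k3":"X"}',
        '{"k1":"3","k2":"A","k3":"Y"}',
        '{"k1":"4","k2":"D","k4":"M"}']
df = pd.DataFrame({'js': rows * 250})


def via_iterrows():
    # The original approach from the question.
    all_json = []
    for _, row in df.iterrows():
        all_json.append(json.loads(row["js"]))
    return pd.DataFrame(all_json)


def via_literal_eval_tolist():
    # Parse with ast.literal_eval, then build the frame from a list of dicts.
    return pd.DataFrame(df["js"].apply(ast.literal_eval).tolist())


def via_json_loads_tolist():
    # Parse with json.loads, then build the frame from a list of dicts.
    return pd.DataFrame(df["js"].apply(json.loads).tolist())


candidates = {
    "iterrows + json.loads": via_iterrows,
    "literal_eval + tolist": via_literal_eval_tolist,
    "json.loads + tolist": via_json_loads_tolist,
}

timings = {}
for name, func in candidates.items():
    # Best of three single runs, in seconds.
    timings[name] = min(timeit.repeat(func, number=1, repeat=3))
    print(f"{name}: {timings[name] * 1000:.1f} ms")
```

Timings depend heavily on the machine, the pandas version, and the size and shape of the JSON strings, so treat the output as a rough guide only.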

Space Impact · Accepted Answer · 2019-05-13 07:31:32Z

3

Use ast.literal_eval and apply pd.Series as:

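import ast
# Parse each row's string into a dict, then expand the dicts into columns.
# Select the column by label; positional x[0] is deprecated in recent pandas.
df = df.apply(lambda x: ast.literal_eval(x["js"]), axis=1).apply(pd.Series)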

print(df)
  k1 k2   k3   k4
0  1  A    X  NaN
1  2  B    X  NaN
2  3  A    Y  NaN
3  4  D  NaN    M

OR:

df = pd.DataFrame([ast.literal_eval(i) for i in df['js']])

OR:

import json
df = pd.DataFrame([json.loads(i) for i in df['js']])
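Besides speed, the two parsers differ in what they accept. ast.literal_eval parses Python literals, so it only works here because the sample strings happen to be valid Python dict literals as well as valid JSON. A small sketch, using a made-up sample string that contains the JSON-only tokens true and null:

```python
import ast
import json

# Hypothetical sample: valid JSON, but not a valid Python literal.
sample = '{"k1": 1, "flag": true, "missing": null}'

# json.loads maps true -> True and null -> None.
parsed = json.loads(sample)

# ast.literal_eval rejects the bare names true and null.
try:
    ast.literal_eval(sample)
    literal_eval_ok = True
except ValueError:
    literal_eval_ok = False

print(parsed)
print("literal_eval succeeded:", literal_eval_ok)
```

If the column really holds JSON, json.loads is the safer choice for that reason alone.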

edited May 13, 2019 at 7:31

answered May 13, 2019 at 6:34


3 Comments

Khris Over a year ago

Thanks, but this approach is much slower than mine, see Edit.

Space Impact Over a year ago

@Khris Check my new approach.

Khris Over a year ago

Second one is faster, third one is as fast as the to_list-approach by @anky_91.

anky · Accepted Answer · 2019-05-13 07:33:42Z

2

I would call the dataframe constructor after converting each string to a dict (I think this would be faster):

import ast
pd.DataFrame(df.js.apply(ast.literal_eval).tolist())

Or:

import json
pd.DataFrame(df["js"].apply(json.loads).tolist())

  k1 k2   k3   k4
0  1  A    X  NaN
1  2  B    X  NaN
2  3  A    Y  NaN
3  4  D  NaN    M
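One thing to watch for: building the frame from a plain list gives it a fresh 0..n-1 index. If the original dataframe has a different index and the parsed columns need to line up with it, the index can be passed along explicitly. A minimal sketch, assuming a made-up index of 10, 20, 30, 40 on the sample data:

```python
import json

import pandas as pd

# Sample data from the question, with a hypothetical non-default index.
df = pd.DataFrame(
    {'js': ['{"k1":"1","k2":"A","k3":"X"}',
            '{"k1":"2","k2":"B","k3":"X"}',
            '{"k1":"3","k2":"A","k3":"Y"}',
            '{"k1":"4","k2":"D","k4":"M"}']},
    index=[10, 20, 30, 40],
)

# Reuse the original index so the rows stay aligned with df.
expanded = pd.DataFrame(df["js"].apply(json.loads).tolist(), index=df.index)

# Optionally attach the parsed columns to the original frame.
combined = df.join(expanded)
print(combined)
```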

edited May 13, 2019 at 7:33

answered May 13, 2019 at 6:55


3 Comments

Khris Over a year ago

Thanks, but this approach is much slower than mine, see Edit.

anky Over a year ago

@Khris how about pd.DataFrame(df["js"].apply(json.loads).tolist()) ?

Khris Over a year ago

25.2 ms ± 512 µs, that's what I was looking for.

U13-Forward · Accepted Answer · 2019-05-13 06:36:43Z

1

You can use apply(pd.Series):

import ast
print(df['js'].apply(ast.literal_eval).apply(pd.Series))

Output:

  k1 k2   k3   k4
0  1  A    X  NaN
1  2  B    X  NaN
2  3  A    Y  NaN
3  4  D  NaN    M
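Another option, not suggested or timed anywhere in this thread, is pd.json_normalize, which takes a list of dicts and returns a dataframe. For flat objects like these it should give the same columns as the plain constructor; it mainly becomes useful if the JSON turns out to be nested. A minimal sketch on the sample data:

```python
import json

import pandas as pd

df = pd.DataFrame({'js': ['{"k1":"1","k2":"A","k3":"X"}',
                          '{"k1":"2","k2":"B","k3":"X"}',
                          '{"k1":"3","k2":"A","k3":"Y"}',
                          '{"k1":"4","k2":"D","k4":"M"}']})

# Parse every string once, then let pandas build the frame from the dicts.
records = [json.loads(s) for s in df["js"]]
result = pd.json_normalize(records)
print(result)
```

Whether this is any faster than pd.DataFrame(records) was not measured here, so it is worth timing on real data before switching.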

answered May 13, 2019 at 6:36


1 Comment

Khris Over a year ago

Thanks, this approach is almost as fast as mine, but sadly still slower.
