I have the following dataframe:

df = pd.DataFrame({'scene':[{"living":"0.515","kitchen":"0.297"}, {"kitchen":"0.401","study":"0.005"}, {"study":"0.913"}, {}, {"others":"0"}], 'id':[1, 2, 3 ,4, 5]}) 

id        scene
01      {"living":"0.515","kitchen":"0.297"}
02      {"kitchen":"0.401","study":"0.005"}
03      {"study":"0.913"}
04      {}
05      {"others":"0"}

I want to create a new dataframe as shown below. Can someone help me do this with pandas?

id      living     kitchen     study     others
01      0.515       0.297        0         0 
02        0         0.401      0.005       0
03        0           0        0.913       0
04        0           0          0         0 
05        0           0          0         0

4 Answers

A simple solution is to convert your scene column to a list of dictionaries and build a new dataframe with the default constructor:

pd.DataFrame(df.scene.tolist()).fillna(0)

Result:

  kitchen living others  study
0   0.297  0.515      0      0
1   0.401      0      0  0.005
2       0      0      0  0.913
3       0      0      0      0
4       0      0      0      0

One of the "default" ways to create a DataFrame is from a list of dictionaries. In this case each dictionary in the list becomes a separate row, and each dictionary key becomes a column heading. Keys missing from a given dictionary are filled with NaN, which is why fillna(0) is needed.
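
To get closer to the output asked for in the question, the same idea can be extended to keep the id column and turn the string scores into numbers. This is only a sketch on the sample data from the question; the names scene and result are mine, and it assumes every value in the dictionaries is a numeric string.

```python
import pandas as pd

df = pd.DataFrame({
    "scene": [
        {"living": "0.515", "kitchen": "0.297"},
        {"kitchen": "0.401", "study": "0.005"},
        {"study": "0.913"},
        {},
        {"others": "0"},
    ],
    "id": [1, 2, 3, 4, 5],
})

# One row per dictionary, one column per key; missing keys become NaN.
# Reusing df.index keeps the rows aligned with the original frame.
scene = pd.DataFrame(df["scene"].tolist(), index=df.index)

# The values are strings, so convert each column to numbers before filling.
scene = scene.apply(pd.to_numeric).fillna(0)

# Put the id column back in front of the expanded scene columns.
result = pd.concat([df[["id"]], scene], axis=1)
print(result)
```

Converting before fillna(0) avoids ending up with columns that mix strings and the integer 0.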

1 Comment

Do you mean the method below?

On your data,

df = pd.DataFrame({'scene':[{"living":"0.515","kitchen":"0.297"}, {"kitchen":"0.401","study":"0.005"}, 
                        {"study":"0.913"}, {}, {"others":"0"}], 
               'id':[1, 2, 3 ,4,5], 's': ['a','b','c','d','e']})

df:
    id  s   scene
0   1   a   {'kitchen': '0.297', 'living': '0.515'}
1   2   b   {'kitchen': '0.401', 'study': '0.005'}
2   3   c   {'study': '0.913'}
3   4   d   {}
4   5   e   {'others': '0'}

There are two ways to do this:

  1. In a single line, where you pass every column name except 'scene' to set_index:

    df = df.set_index(['id', 's'])['scene'].apply(pd.Series).fillna(0).reset_index()
    

    which will output:

       id   s   kitchen living  study   others
    0   1   a   0.297   0.515   0       0
    1   2   b   0.401   0       0.005   0
    2   3   c   0       0       0.913   0
    3   4   d   0       0       0       0
    4   5   e   0       0       0       0
    
  2. In two lines, where you create the expected result and concat it to the original dataframe:

    df1 = df.scene.apply(pd.Series).fillna(0)
    df = pd.concat([df, df1], axis=1)
    

    which gives,

       id   s                                    scene  kitchen living  study others
    0   1   a   {'kitchen': '0.297', 'living': '0.515'} 0.297   0.515   0     0
    1   2   b    {'kitchen': '0.401', 'study': '0.005'} 0.401   0    0.005    0
    2   3   c                        {'study': '0.913'} 0       0   0.913     0
    3   4   d                                        {} 0       0      0      0
    4   5   e                           {'others': '0'} 0       0      0      0
    

21 Comments

Thanks, but what I want is not just fillna; I also want to transform the JSON-style dataframe into the one you mentioned above.
Can you clarify your question? Your original dataframe is the first one and you want to transform it into the second one, right?
Yes, exactly. I want to transform the first one into the second one.
And my answer doesn't transform the first into the second?
Sorry, no. I need to import json and use a list of dictionaries.

Updated. This one works for me. Suggestions for making it more concise are welcome.

import json
import pandas as pd

df = pd.DataFrame({'scene': [{"living": "0.515", "kitchen": "0.297"}, {"kitchen": "0.401", "study": "0.005"}, {"study": "0.913"}, {}, {"others": "0"}], 'id': [1, 2, 3, 4, 5], 's': ['a', 'b', 'c', 'd', 'e']})

def get_score(scene, key):
    # Parse JSON text if needed; the sample column above already holds dicts
    if isinstance(scene, str):
        scene = json.loads(scene)
    # Keys that are missing from a row default to 0
    return scene.get(key, 0)

cols = ['living', 'kitchen', 'study', 'others']
for col in cols:
    # The column is named 'scene' (lowercase), not 'Scene'
    df[col] = df['scene'].map(lambda scene, col=col: get_score(scene, col))

# The scores are stored as strings, so convert them to numbers
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

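Here is a one-line solution, thanks for all the help. It assumes the scene column holds JSON strings; if it already holds dicts, as in the sample above, drop the json.loads step: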

df.join(df['scene'].apply(json.loads).apply(pd.Series))
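
For anyone whose scene column really does contain JSON text, here is a small sketch of the same idea. The sample strings are made up to mirror the question's data. I swapped apply(pd.Series) for the list-of-dictionaries constructor from the first answer, which in my experience tends to be faster on larger frames, and added a numeric conversion.

```python
import json
import pandas as pd

# Assumption: "scene" holds JSON strings here, not dicts.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "scene": [
        '{"living":"0.515","kitchen":"0.297"}',
        '{"kitchen":"0.401","study":"0.005"}',
        '{"study":"0.913"}',
        '{}',
        '{"others":"0"}',
    ],
})

# Parse each JSON string into a dict.
parsed = df["scene"].apply(json.loads)

# Expand the dicts into columns, convert the string scores, fill the gaps.
expanded = pd.DataFrame(parsed.tolist(), index=df.index)
expanded = expanded.apply(pd.to_numeric).fillna(0)

# Drop the raw column and attach the expanded ones by index.
result = df.drop(columns="scene").join(expanded)
print(result)
```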
