Pandas - Splitting data from one column into multiple columns

Question

I have a DataFrame in the format below:

id, data
101, [{"tree":[
               {"Group":"1001","sub-group":3,"Child":"100267","Child_1":"8 cm"},
               {"Group":"1002","sub-group":1,"Child":"102280","Child_1":"4 cm"},
               {"Group":"1003","sub-group":0,"Child":"102579","Child_1":"0.1 cm"}]}]
102, [{"tree":[
               {"Group":"2001","sub-group":3,"Child":"200267","Child_1":"6 cm"},
               {"Group":"2002","sub-group":1,"Child":"202280","Child_1":"4 cm"}]}]
103,

I am trying to split the data in this one column into multiple columns.

Expected output:

id, Group, sub-group, Child, Child_1, Group, sub-group, Child, Child_1, Group, sub-group, Child, Child_1
101, 1001, 3, 100267, 8 cm, 1002, 1, 102280, 4 cm, 1003, 0, 102579, 0.1 cm
102, 2001, 3, 200267, 6 cm, 2002, 1, 202280, 4 cm
103

Output of df.loc[:15, ['id','data']].to_dict()

{'id': {1: '101',
        4: '102',
        11: '103',
        15: '104',
        16: '105'},
        'data': {1: '[{"tree":[{"Group":"","sub-group":"3","Child":"100267","Child_1":"8 cm"}]}]',
        4: '[{"tree":[{"sub-group":"0.01","Child_1":"4 cm"}]}]',
        11: '[{"tree":[{"sub-group":null,"Child_1":null}]}]',
        15: '[{"tree":[{"Group":"1003","sub-group":15,"Child":"child_","Child_1":"41 cm"}]}]',
        16: '[{"tree":[{"sub-group":"0.00","Child_1":"0"}]}]'}}

Ben.T · Accepted Answer · 2020-05-06 18:23:11Z

You can use explode on the column data, build a DataFrame from the result, add a cumcount column, reshape with set_index, stack, unstack and droplevel to fit your expected output, and finally join the result back to the column id:

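import pandas as pd

# one row per record in each 'tree' list; rows without data are dropped here
s = df['data'].dropna().str['tree'].explode()
df_f = df[['id']].join(pd.DataFrame(s.tolist(), s.index)
                         # cc numbers the records of each original row: 1, 2, 3, ...
                         .assign(cc=lambda x: x.groupby(level=0).cumcount()+1)
                         .set_index('cc', append=True)
                         # long format, then pivot (cc, field) into the columns
                         .stack()
                         .unstack(level=[-2,-1])
                         # drop the cc level, leaving repeated field names
                         .droplevel(0, axis=1),
                       how='left')
print(df_f)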
    id Group sub-group   Child Child_1 Group sub-group   Child Child_1 Group  \
0  101  1001         3  100267    8 cm  1002         1  102280    4 cm  1003   
1  102  2001         3  200267    6 cm  2002         1  202280    4 cm   NaN   
2  103   NaN       NaN     NaN     NaN   NaN       NaN     NaN     NaN   NaN   

  sub-group   Child Child_1  
0         0  102579  0.1 cm  
1       NaN     NaN     NaN  
2       NaN     NaN     NaN

Note: while this fits your expected output, repeating the same column name several times is not good practice. I would rather remove the droplevel call and flatten the MultiIndex columns instead.
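
As a rough sketch of that suggestion, the same reshape can stop before droplevel and join the two column levels into unique names. The sample frame is a shortened version of the question's data, the cells are assumed to be already-parsed lists (not strings), and the rec{n}_ prefix is a hypothetical naming choice, not something from the question.

```python
import pandas as pd

# Shortened sample modeled on the question; cells are assumed to be parsed
# Python objects of the form [{"tree": [...]}], and id 103 has no data.
df = pd.DataFrame({
    "id": ["101", "102", "103"],
    "data": [
        [{"tree": [
            {"Group": "1001", "sub-group": 3, "Child": "100267", "Child_1": "8 cm"},
            {"Group": "1002", "sub-group": 1, "Child": "102280", "Child_1": "4 cm"},
        ]}],
        [{"tree": [
            {"Group": "2001", "sub-group": 3, "Child": "200267", "Child_1": "6 cm"},
        ]}],
        None,
    ],
})

# One row per record, keeping the original row label as the index.
s = df["data"].dropna().map(lambda cell: cell[0]["tree"]).explode()
records = pd.DataFrame(s.tolist(), index=s.index)

# Number the records within each original row: 1, 2, 3, ...
records["cc"] = records.groupby(level=0).cumcount() + 1

# Same reshape as in the answer, but without droplevel, so the columns stay
# a two-level (cc, field) MultiIndex.
wide = records.set_index("cc", append=True).stack().unstack(level=[-2, -1])

# Flatten the two levels into unique names (hypothetical rec{n}_ prefix).
wide.columns = [f"rec{cc}_{field}" for cc, field in wide.columns]

df_flat = df[["id"]].join(wide, how="left")
print(df_flat)
```

Each column name is now unique, so a single value can be selected with, for example, df_flat["rec2_Child"] instead of picking among several columns all called Child.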

Edit: following the comments, here is one way to go through the whole column when its cells are strings in an inconsistent format:

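import ast

def f(x):
    # parse the string cell; JSON null is not a Python literal, so it is
    # replaced with the string 'nan' before evaluation
    try:
        return ast.literal_eval(x.replace('null', "'nan'"))[0]['tree']
    except Exception:
        # missing or malformed cell: fall back to a single empty record
        return [{}]

# then create s with
s = df['data'].apply(f).explode()
# then create df_f like above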
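
Since the strings shown in the question look like JSON, a possible alternative (a swapped-in technique, not part of the original answer) is to parse them with json.loads, which understands null directly. This is a minimal sketch on three rows adapted from the question's to_dict output; parse_tree is a hypothetical helper name, and it assumes every valid cell has the shape [{"tree": [...]}].

```python
import json

import pandas as pd


def parse_tree(cell):
    """Return the list of records under 'tree', or [{}] if the cell is unusable."""
    try:
        return json.loads(cell)[0]["tree"]
    except (TypeError, ValueError, KeyError, IndexError):
        # TypeError: cell is not a string (e.g. None/NaN) or has another shape
        # ValueError: the string is not valid JSON (JSONDecodeError subclasses it)
        # KeyError / IndexError: the expected [ {"tree": ...} ] layout is absent
        return [{}]


# Rows adapted from the question's sample, plus one missing cell.
df = pd.DataFrame({
    "id": ["101", "102", "103"],
    "data": [
        '[{"tree":[{"Group":"","sub-group":"3","Child":"100267","Child_1":"8 cm"}]}]',
        '[{"tree":[{"sub-group":null,"Child_1":null}]}]',
        None,
    ],
})

# One row per record, ready for pd.DataFrame(s.tolist(), s.index) as above.
s = df["data"].apply(parse_tree).explode()
print(s.tolist())
```

One difference from the ast version: null values come back as None rather than the string 'nan', so pandas treats them as missing once the records are turned into a DataFrame.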

edited May 6, 2020 at 18:23

answered May 6, 2020 at 16:46

Ben.T


9 Comments

Umar.H Over a year ago

nice one, I couldn't get the indices to align in mine

Kevin Nash Over a year ago

@Ben.T thanks for the reply. I however just get the id column returned back when I try the above code.. using pandas version 1.0.1

Kevin Nash Over a year ago

@Ben.T type(df['data'].iloc[0]) returns str

Kevin Nash Over a year ago

@Ben.T, after importing ast module, it gave me this message ValueError: malformed node or string: <_ast.Name object at 0x12208b8d0>. Have edited my post with the output of df.loc[:3, ['id','data']].to_dict()

Ben.T Over a year ago

@KevinNash see my edit for creating s. Without all the data, and without seeing every possible exception in the format, it is the only thing I can think of.
