Split pandas column python

Question

I'm working with Python 3.4 and I have a pandas DataFrame column containing:

0    [0.3785766661167145, -0.449486643075943, -0.15...]
1    [0.204025000333786, -0.3685399889945984, 0.231...]
2    [0.684576690196991, -0.5823000073432922, 0.269...]
3    [-0.02300500124692917, -0.22056499123573303, 0...]
Name: comments, dtype: object

I would like to split it into multiple columns:

   column1               column2              ...columnx
0  0.3785766661167145    -0.449486643075943     last element in the first list
1  0.204025000333786     -0.3685399889945984    last element in the 2nd list
2  0.684576690196991     -0.5823000073432922    last element in the 3rd list
3  -0.02300500124692917  -0.22056499123573303   last element in the 4th list

Could you please help me? Thanks in advance.

jezrael · Accepted Answer · 2017-05-29 10:15:21Z

1

If the column holds lists, pass them to the DataFrame constructor: convert the comments column to a NumPy array with values, then to a list of lists with tolist:

print (type(df.loc[0, 'comments']))
<class 'list'>

df1 = pd.DataFrame(df['comments'].values.tolist())
#rename columns if necessary
df1 = df1.rename(columns = lambda x: 'column' + str(x + 1))
print (df1)
    column1   column2  column3
0  0.378577 -0.449487   -0.150
1  0.204025 -0.368540    0.231
2  0.684577 -0.582300    0.269
3 -0.023005 -0.220565    0.000
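A minimal sketch of how the expanded columns could be attached back to the original frame. The sample values and the `other` column are made up for illustration. Passing `index=df.index` to the constructor is assumed to matter only when the original index is not the default 0..n-1, since the join below aligns on index labels.

```python
import pandas as pd

# Hypothetical sample data: a column of equal-length lists plus one other column
df = pd.DataFrame(
    {
        "comments": [[0.1, -0.2, 0.3], [0.4, -0.5, 0.6]],
        "other": ["a", "b"],
    },
    index=[10, 20],  # non-default index to show why index= is passed below
)

# Expand each list into its own row of columns, reusing the original index
expanded = pd.DataFrame(df["comments"].tolist(), index=df.index)
expanded = expanded.rename(columns=lambda i: "column" + str(i + 1))

# Drop the list column and attach the new columns side by side
result = df.drop(columns="comments").join(expanded)
print(result)
```

If the lists have different lengths, the constructor pads the shorter rows with missing values.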

edited May 29, 2017 at 10:15

answered May 29, 2017 at 10:08

Danny · Accepted Answer · 2017-05-29 10:16:57Z

0

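Given a DataFrame

import numpy as np
import pandas as pd

# pd.DataFrame (not pd.Series) so that 'comments' becomes a column of lists
df = pd.DataFrame(
    {'comments': [list(np.random.randn(3).round()) for i in range(4)]
    }
)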

where df=

comments
0    [1.0, -2.0, 0.0]
1   [1.0, -3.0, -0.0]
2  [-0.0, -0.0, -1.0]
3  [-2.0, -2.0, -2.0]

Calling

df2 = pd.DataFrame(list(df['comments']))

you obtain

     0    1    2
0  1.0 -2.0  0.0
1  1.0 -3.0 -0.0
2 -0.0 -0.0 -1.0
3 -2.0 -2.0 -2.0

answered May 29, 2017 at 10:16

Shubham R · Accepted Answer · 2017-05-29 09:55:29Z

0

Test Case:

import pandas as pd
df = pd.DataFrame({
               'var1':['20, -20, -50','30, 20, -50','40','30'],
               'var2':['10','50','60','70']
              })
print(df)

    var1           var2
0   20, -20, -50    10
1   30, 20, -50     50
2   40              60
3   30              70

pd.concat([df[['var2']], df['var1'].str.split(',', expand=True)], axis=1)
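Note that this approach applies when the column holds comma-separated strings rather than lists. The split pieces are still strings, some with a leading space. A minimal sketch of one way to turn them into numbers, reusing the test case above; the `var1_` column names are hypothetical:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "var1": ["20, -20, -50", "30, 20, -50", "40", "30"],
        "var2": ["10", "50", "60", "70"],
    }
)

# Split on the comma; rows with fewer items are padded with missing values
parts = df["var1"].str.split(",", expand=True)

# Strip surrounding whitespace, then convert each column of strings to numbers
parts = parts.apply(lambda col: pd.to_numeric(col.str.strip()))
parts.columns = ["var1_" + str(i + 1) for i in parts.columns]  # hypothetical names

result = pd.concat([df[["var2"]], parts], axis=1)
print(result)
```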

answered May 29, 2017 at 9:55

Allen Qin · Accepted Answer · 2017-05-29 12:02:48Z

0

Using @Danny's example DataFrame,

df = pd.DataFrame(
    {'comments': [list(np.random.randn(3).round()) for i in range(4)]
    })

You can use apply to transform the column containing lists into a DataFrame.

 df.comments.apply(pd.Series)
Out[127]: 
     0    1    2
0 -2.0 -3.0 -1.0
1  1.0  0.0  1.0
2 -1.0 -1.0 -0.0
3  1.0  1.0  0.0
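As a rough check, the sketch below compares `apply(pd.Series)` with the constructor approach from the other answers on the same data. The fixed seed and the variable names are assumptions added for illustration. Because `apply(pd.Series)` builds one Series per row, it tends to be noticeably slower on large frames, so the constructor route may be preferable there.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)  # fixed seed so the frame is reproducible
df = pd.DataFrame({"comments": [list(rng.randn(3).round()) for _ in range(4)]})

# Route 1: build a Series from every list, one row at a time
via_apply = df["comments"].apply(pd.Series)

# Route 2: hand the whole list of lists to the constructor in one call
via_constructor = pd.DataFrame(df["comments"].tolist(), index=df.index)

# Compare the underlying values of the two results
print(np.array_equal(via_apply.values, via_constructor.values))
```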

answered May 29, 2017 at 12:02
