1

I'm working with Python 3.4 and I have a pandas DataFrame column containing:

0    [0.3785766661167145, -0.449486643075943, -0.15...]
1    [0.204025000333786, -0.3685399889945984, 0.231...]
2    [0.684576690196991, -0.5823000073432922, 0.269...]
3    [-0.02300500124692917, -0.22056499123573303, 0...]
Name: comments, dtype: object

and I would like to split it into multiple columns, one per list element:

   column1               column2              ...columnx
0  0.3785766661167145    -0.449486643075943     last element in the first list
1  0.204025000333786     -0.3685399889945984    last element in the 2nd list
2  0.684576690196991     -0.5823000073432922    last element in the 3rd list
3  -0.02300500124692917  -0.22056499123573303   last element in the 4th list

Could you please help me? Thanks in advance.

0

4 Answers

1

If the column holds lists, pass it to the DataFrame constructor after converting the comments column to a list of lists with values + tolist:

print (type(df.loc[0, 'comments']))
<class 'list'>

df1 = pd.DataFrame(df['comments'].values.tolist())
#rename columns if necessary
df1 = df1.rename(columns = lambda x: 'column' + str(x + 1))
print (df1)
    column1   column2  column3
0  0.378577 -0.449487   -0.150
1  0.204025 -0.368540    0.231
2  0.684577 -0.582300    0.269
3 -0.023005 -0.220565    0.000
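If the new columns need to stay attached to the rest of the frame, one option is to reuse the original index when building the wide frame and then join it back. This is a minimal sketch; the `id` column, the sample values, and the non-default index are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's column
df = pd.DataFrame({
    'id': ['a', 'b', 'c'],
    'comments': [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6], [0.7, -0.8, 0.9]],
}, index=[10, 11, 12])

# Build the wide frame, reusing the original index so rows stay aligned
wide = pd.DataFrame(df['comments'].tolist(), index=df.index)
wide.columns = ['column' + str(i + 1) for i in wide.columns]

# Attach the new columns and drop the original list column
result = df.drop('comments', axis=1).join(wide)
print(result)
```

Without `index=df.index` the new frame gets a fresh 0..n-1 index, which would misalign the join whenever the original index is not the default one.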

0

Given a DataFrame

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'comments': [list(np.random.randn(3).round()) for i in range(4)]
    }
)

where df=

comments
0    [1.0, -2.0, 0.0]
1   [1.0, -3.0, -0.0]
2  [-0.0, -0.0, -1.0]
3  [-2.0, -2.0, -2.0]

Calling

df2 = pd.DataFrame(list(df['comments']))

you obtain

     0    1    2
0  1.0 -2.0  0.0
1  1.0 -3.0 -0.0
2 -0.0 -0.0 -1.0
3 -2.0 -2.0 -2.0


0

Test Case:

import pandas as pd
df = pd.DataFrame({
               'var1':['20, -20, -50','30, 20, -50','40','30'],
               'var2':['10','50','60','70']
              })
print(df)

    var1           var2
0   20, -20, -50    10
1   30, 20, -50     50
2   40              60
3   30              70

pd.concat([df[['var2']], df['var1'].str.split(',', expand=True)], axis=1)
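Note that `var1` in this test case holds comma-separated strings rather than lists, so this approach applies only to that situation. Rows with fewer items are padded with missing values, and the pieces remain strings unless converted. A minimal sketch of one way to get numeric columns, assuming a regex split on a comma plus optional whitespace is acceptable for the data:

```python
import pandas as pd

df = pd.DataFrame({
    'var1': ['20, -20, -50', '30, 20, -50', '40', '30'],
    'var2': ['10', '50', '60', '70'],
})

# Split on a comma followed by optional whitespace;
# rows with fewer items are padded with missing values
parts = df['var1'].str.split(r',\s*', expand=True)

# Convert each new column to numbers; missing pieces become NaN
parts = parts.apply(pd.to_numeric)

out = pd.concat([df[['var2']], parts], axis=1)
print(out)
```

The split columns are labelled 0, 1, 2 by default and can be renamed afterwards if needed.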


0

Using @dDanny's example DataFrame,

df = pd.DataFrame(
    {'comments': [list(np.random.randn(3).round()) for i in range(4)]
    })

You can use apply with pd.Series to expand the column of lists into a DataFrame.

 df.comments.apply(pd.Series)
Out[127]: 
     0    1    2
0 -2.0 -3.0 -1.0
1  1.0  0.0  1.0
2 -1.0 -1.0 -0.0
3  1.0  1.0  0.0
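Because `apply(pd.Series)` builds a separate Series for every row, it tends to be slower than the constructor approach on large frames, although both should produce the same values. A small sketch for checking that on sample data (the fixed seed and variable names are only for illustration):

```python
import numpy as np
import pandas as pd

# Seeded generator so the sample data is reproducible
rng = np.random.RandomState(0)
df = pd.DataFrame({'comments': [list(rng.randn(3).round()) for _ in range(4)]})

# Row-by-row expansion
via_apply = df['comments'].apply(pd.Series)

# Single constructor call on the list of lists
via_ctor = pd.DataFrame(df['comments'].tolist(), index=df.index)

# Compare the underlying values of the two results
print(np.allclose(via_apply.values, via_ctor.values))
```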

