pandas DataFrame: create new columns and fill them using the values of the first column

Question

I have a pandas DataFrame df with a single column, col, whose values are lists. I want to iterate over the values of col and add new columns filled with the elements of each list. For example, the first row holds the 3-element list ['text1','text2','text3'], so I want to add 3 columns filled with 'text1', 'text2' and 'text3'.

import pandas as pd

df=pd.DataFrame({'col':[['text1','text2','text3'],['mext1','mext2'],['cext1']]})
df

    col
0   [text1, text2, text3]
1   [mext1, mext2]
2   [cext1]

I want output like this:

    col                     col_1     col_2     col_3
0   [text1, text2, text3]   text1     text2     text3
1   [mext1, mext2]          mext1     mext2     NaN
2   [cext1]                 cext1     NaN       NaN

Your help will be appreciated.

jezrael · Accepted Answer · 2017-01-03 10:18:12Z

3

Another solution uses the DataFrame constructor; the default column labels then need to be shifted to start at 1 with rename and prefixed with add_prefix:

print (pd.DataFrame(df.col.values.tolist(), index=df.col)
         .rename(columns = lambda x: x+1)
         .add_prefix('col_')
         .reset_index())

                     col  col_1  col_2  col_3
0  [text1, text2, text3]  text1  text2  text3
1         [mext1, mext2]  mext1  mext2   None
2                [cext1]  cext1   None   None

A solution that first finds the maximum list length in column col with str.len:

import numpy as np

cols = df.col.str.len().max() + 1
print (cols)
4
print (pd.DataFrame(df.col.values.tolist(), index=df.col,columns = np.arange(1, cols))
         .add_prefix('col_')
         .reset_index())
                     col  col_1  col_2  col_3
0  [text1, text2, text3]  text1  text2  text3
1         [mext1, mext2]  mext1  mext2   None
2                [cext1]  cext1   None   None
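
The same idea can be sketched without using the list column as the index. This is a variation, not the code from the answer: it keeps the default index and joins the expanded columns back onto df. The names width, names and expanded are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'col': [['text1', 'text2', 'text3'], ['mext1', 'mext2'], ['cext1']]})

# Length of the longest list in col; str.len also works on list values.
width = int(df['col'].str.len().max())

# Build the 1-based column names up front: col_1, col_2, ...
names = [f'col_{i}' for i in range(1, width + 1)]

# Expand each list into a row; shorter lists are padded with missing values.
expanded = pd.DataFrame(df['col'].tolist(), index=df.index, columns=names)

# Attach the new columns to the original frame, aligned on the index.
result = df.join(expanded)
print(result)
```

Building the names from the longest list ties the number of columns to the data rather than to the number of rows.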

edited Jan 3, 2017 at 10:18

answered Jan 3, 2017 at 10:13

jezrael


Nickil Maveli · Accepted Answer · 2017-01-03 10:30:34Z

3

You could construct a new DataFrame by converting the values in the single column to a list of lists. The elements of each list then become separate columns.

These could then be concatenated with the original DF columnwise (axis=1).

df_expand = pd.DataFrame(df['col'].tolist(), df.index)
df_expand.columns = df_expand.columns + 1
pd.concat([df['col'], df_expand.add_prefix('col_')], axis=1)

To have None represented as NaN, you could append .replace({None: np.nan}) to the last expression (this requires import numpy as np).
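
Put together as a self-contained script, the approach might look like the sketch below. One small deviation, labelled as such: the replace call is applied to the expanded columns before concatenation rather than to the final expression, so the list-valued column is left untouched. The name result is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [['text1', 'text2', 'text3'], ['mext1', 'mext2'], ['cext1']]})

# Expand each list into its own row of columns, keeping the original index.
df_expand = pd.DataFrame(df['col'].tolist(), index=df.index)

# Shift the default 0-based column labels to 1-based.
df_expand.columns = df_expand.columns + 1

# Represent missing entries as NaN instead of None.
df_expand = df_expand.replace({None: np.nan})

# Place the prefixed columns next to the original column.
result = pd.concat([df['col'], df_expand.add_prefix('col_')], axis=1)
print(result)
```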

edited Jan 3, 2017 at 10:30

answered Jan 3, 2017 at 9:44

Nickil Maveli


1 Comment

Mike Müller Over a year ago

Does not work for df = pd.DataFrame({'col':[['text1','text2','text3'],['mext1','mext2'],['cext1'],['cext2']]}). Problem: np.arange(1, df.shape[0] + 1).
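
The answer's code as shown above takes its labels from df_expand.columns and does not contain np.arange(1, df.shape[0] + 1), so the comment presumably refers to an earlier revision (an assumption; the revision history is not part of this page). A quick check of the four-row input against the code as shown:

```python
import pandas as pd

# Four rows, but the longest list still has three elements.
df = pd.DataFrame({'col': [['text1', 'text2', 'text3'], ['mext1', 'mext2'], ['cext1'], ['cext2']]})

df_expand = pd.DataFrame(df['col'].tolist(), index=df.index)
# Labels come from the expanded frame itself, not from the row count.
df_expand.columns = df_expand.columns + 1
result = pd.concat([df['col'], df_expand.add_prefix('col_')], axis=1)
print(result)
```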
