Adding two columns in Python

Question

I am trying to combine two columns into a new one. The new column should be the first column in the dataframe (or in the output CSV file).

column_1 column_2
84       test
65       test

Output should be

column         column_1 column_2
trial_84_test   84      test
trial_65_test   65      test

I tried the methods below, but they did not work:

sum = str(data['column_1']) + data['column_2']

data['column']=data.apply(lambda x:'%s_%s_%s' % ('trial' + data['column_1'] + data['column_2']),axis=1)

Help is surely appreciated.

Alexander · Accepted Answer · 2018-03-22 02:47:29Z

Create sample data:

import pandas as pd
df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

Method 1: Use assign to create new column, and then reorder.

>>> df.assign(column=['trial_{}_{}'.format(*cols) for cols in df.values])[['column'] + df.columns.tolist()]
          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test

Method 2: Create a new series and then concatenate.

s = pd.Series(['trial_{}_{}'.format(*cols) for cols in df.values], index=df.index, name='column')
>>> pd.concat([s, df], axis=1)
          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test

Method 3: Insert the new values at the first index of the dataframe (i.e. column 0).

df.insert(0, 'column', ['trial_{}_{}'.format(*cols) for cols in df.values])
>>> df
          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test

Method 3 (alternative way to create values for new column):

df.insert(0, 'column', df.astype(str).apply(lambda row: 'trial_' + '_'.join(row), axis=1))
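
Both ways of building the values above format every column of each row in positional order, so they rely on df containing exactly column_1 and column_2. A minimal sketch that picks the two source columns by name instead; the extra column named other is hypothetical and only there to show that it is left alone:

```python
import pandas as pd

# 'other' is a hypothetical extra column, not part of the question's data.
df = pd.DataFrame({
    'column_1': [84, 65],
    'column_2': ['test', 'test'],
    'other': [1.5, 2.5],
})

# Pair the two source columns by name, so additional columns do not matter.
labels = ['trial_{}_{}'.format(a, b)
          for a, b in zip(df['column_1'], df['column_2'])]

# Insert the new values as the first column (position 0).
df.insert(0, 'column', labels)
print(df)
```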

By the way, sum is a built-in function, so you do not want to use it as a variable name: doing so shadows the built-in.
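
To see what goes wrong when sum is reused as a variable name, here is a small sketch. Strictly speaking, sum is a built-in function rather than a reserved word, so the assignment is legal, but it hides the built-in for the rest of that scope. The helper shadowed_total is hypothetical and exists only for illustration:

```python
import builtins
import keyword

# `sum` is not a reserved word, so Python lets you assign to it...
print(keyword.iskeyword("sum"))   # False
# ...but it is the name of a built-in function.
print(hasattr(builtins, "sum"))   # True


def shadowed_total(values):
    """Hypothetical helper: reuses the name `sum` for a string."""
    sum = "trial_84_test"  # this local name now hides the built-in sum()
    try:
        return sum(values)  # fails: a str object is not callable
    except TypeError:
        return None


print(shadowed_total([84, 65]))   # None
```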

jpp · Accepted Answer · 2018-03-22 01:34:43Z

Do not use lambda for this, as it is just a thinly veiled loop. Here is a vectorised solution. Care needs to be taken to convert non-string values to str type.

df['column'] = 'trial_' + df['column_1'].astype(str) + '_' + df['column_2']

df = df.reindex(sorted(df.columns), axis=1)  # sort columns alphabetically (reindex_axis was removed in pandas 1.0)

Result:

          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test
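
Sorting the column names works here because 'column' happens to sort before 'column_1' and 'column_2'. A minimal sketch of the same vectorised approach that moves the new column to the front explicitly, so the result does not depend on the names (the list-based reordering is swapped in for the alphabetical sort and is only one of several ways to do it):

```python
import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

# Vectorised string concatenation; the integer column must be cast to str first.
df['column'] = 'trial_' + df['column_1'].astype(str) + '_' + df['column_2']

# Put 'column' first and keep the remaining columns in their original order.
remaining = [name for name in df.columns if name != 'column']
df = df[['column'] + remaining]
print(df)
```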

1 Comment

Peter Barrett Bryan Over a year ago

Exactly the answer I would give! Deserves the upvote and accept

BENY · Accepted Answer · 2018-03-22 01:54:36Z

You can use insert:

df.insert(0, column='Columns', value='trial_' + df['column_1'].astype(str) + '_' + df['column_2'].astype(str))
df
Out[658]: 
         Columns  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test
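
Since the question also mentions an output CSV file: the column order of the dataframe carries over to the file. A small sketch, writing to an in-memory buffer (an assumption for illustration; pass a file path instead to write to disk) and using index=False so the row index is not written as an extra leading column:

```python
import io

import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})
df.insert(0, column='Columns',
          value='trial_' + df['column_1'].astype(str) + '_' + df['column_2'].astype(str))

# Write CSV text to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False: do not write the 0, 1 row index

csv_lines = buffer.getvalue().splitlines()
print(csv_lines[0])  # Columns,column_1,column_2
```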
