This may be a very basic question, but I would like to concatenate three columns of a pandas DataFrame.
Specifically, I would like to concatenate col1, col2 and col3 into a new column, col4. I know that in R this can be done quite easily with the paste function.

import pandas as pd

df = pd.DataFrame({'col1': [2012, 2013, 2014], 'col2': 'q', 'col3': range(1, 4)})

Edit: Code for clarity - I would like to generate col4 automatically:

x=pd.DataFrame()
x['col1'] = [2012, 2013, 2014]
x['col2'] = ['q', 'q', 'q']
x['col3'] = [1, 2, 3]
x['col4'] = ['2012q1', '2013q2', '2014q3']
3 Answers

4

Use pd.DataFrame.sum with axis=1 after converting the columns to strings; summing strings concatenates them.
Here I use pd.DataFrame.assign to create a copy with the new column:

df.assign(col4=df[['col1', 'col2', 'col3']].astype(str).sum(1))

   col1 col2  col3    col4
0  2012    q     1  2012q1
1  2013    q     2  2013q2
2  2014    q     3  2014q3

Or you can add the column in place:

df['col4'] = df[['col1', 'col2', 'col3']].astype(str).sum(1)
df

   col1 col2  col3    col4
0  2012    q     1  2012q1
1  2013    q     2  2013q2
2  2014    q     3  2014q3

If df has only these three columns, you can reduce the code to

df.assign(col4=df.astype(str).sum(1))

If df has more than three columns but the three you want to concatenate are the first three:

df.assign(col4=df.iloc[:, :3].astype(str).sum(1))
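
For comparison with R's paste, here is a minimal sketch of a helper that also accepts a separator. The name paste_columns and its parameters are hypothetical, not part of pandas. It joins each row with str.join through DataFrame.agg instead of summing, so treat it as an alternative to the sum approach above rather than the same technique.

```python
import pandas as pd


def paste_columns(frame, columns, sep=""):
    """Hypothetical helper: join the given columns row by row, like R's paste."""
    # Convert every selected column to strings, then join the values of each row.
    return frame[list(columns)].astype(str).agg(sep.join, axis=1)


df = pd.DataFrame({"col1": [2012, 2013, 2014], "col2": "q", "col3": [1, 2, 3]})

# No separator: same result as the examples above.
df["col4"] = paste_columns(df, ["col1", "col2", "col3"])

# With a separator (col5 is a made-up column name for this illustration).
df["col5"] = paste_columns(df, ["col1", "col3"], sep="-")
```

Because the columns are named explicitly, the helper does not depend on their position in the frame.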

3 Comments

sum on strings:)
This solution worked on the code that was provided, but on my actual data set I received a 'Wrong number of dimensions' error.
That means you misrepresented your data. Also, I have no idea what your error means. You should post the entire error to provide more context.
2

To concatenate across all columns, it may be more convenient to write df.apply(..., axis=1), as in:

df['col4'] = df.apply(lambda x: "".join(x.astype(str)), axis=1)
df

#   col1 col2  col3    col4
#0  2012    q     1  2012q1
#1  2013    q     2  2013q2
#2  2014    q     3  2014q3

especially if you have a lot of columns and don't want to write them all out (as required by Kyle's answer).
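
One thing to watch with this approach, sketched here under the assumption that df is the three-column frame from the question: once col4 exists, applying across all columns a second time would fold col4 itself into the result. Recording the source column names first avoids that (source_cols is just a name chosen for this illustration).

```python
import pandas as pd

df = pd.DataFrame({"col1": [2012, 2013, 2014], "col2": "q", "col3": [1, 2, 3]})

# Remember which columns to concatenate before the new column is added.
source_cols = list(df.columns)

# Join the string form of each row's values across the source columns only.
df["col4"] = df[source_cols].apply(lambda row: "".join(row.astype(str)), axis=1)

# Running the same line again is now safe: col4 is not one of source_cols.
df["col4"] = df[source_cols].apply(lambda row: "".join(row.astype(str)), axis=1)
```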

1
df['col4'] = df.col1.astype(str) + df.col2 + df.col3.astype(str)
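
A self-contained version of the line above, as a rough sketch, plus a variant using Series.str.cat, which accepts a sep argument. The column name col4_sep is made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"col1": [2012, 2013, 2014], "col2": "q", "col3": [1, 2, 3]})

# Numeric columns must be converted to strings before they can be added to col2.
df["col4"] = df["col1"].astype(str) + df["col2"] + df["col3"].astype(str)

# Same idea with str.cat, here with a separator between the parts.
df["col4_sep"] = df["col1"].astype(str).str.cat(
    [df["col2"], df["col3"].astype(str)], sep="-"
)
```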
