2

I have a DataFrame like this:

A    B
----------
c    d
e    f

I'd like to introduce a third column, made up of a concatenation of A, B and the index, so that the DataFrame becomes:

A    B    C
---------------
c    d    cd0
e    f    ef1

I'd like to do that like so:

df['C'] = df['A'] + df['B'] + # and here I don't know how to reference the row index. 

How can I do this?

3 Answers

7

Option 1
To scale to any number of columns without naming each one, use assign + agg (every column must already hold strings):

df['C'] = df.assign(index=df.index.astype(str)).agg(''.join, axis=1)
df

   A  B    C
0  c  d  cd0
1  e  f  ef1

Or, using np.add.reduce in a similar fashion:

import numpy as np

df['C'] = np.add.reduce(df.assign(index=df.index.astype(str)), axis=1)
df

   A  B    C
0  c  d  cd0
1  e  f  ef1

Option 2
A less scalable option, since each column has to be named explicitly, using vectorised string concatenation:

df['C'] = df['A'] + df['B'] + df.index.astype(str)
df

   A  B    C
0  c  d  cd0
1  e  f  ef1
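
For anyone who wants to try both options side by side, here is a minimal, self-contained sketch. It assumes the two-row frame from the question with a default integer index; the names opt1 and opt2 are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question (default index 0, 1).
df = pd.DataFrame({'A': ['c', 'e'], 'B': ['d', 'f']})

# Option 1: add the stringified index as an extra column, then join each row.
opt1 = df.assign(index=df.index.astype(str)).agg(''.join, axis=1)

# Option 2: add the columns and the stringified index element-wise.
opt2 = df['A'] + df['B'] + df.index.astype(str)

print(opt1.tolist())
print(opt2.tolist())
```

Both should give the same column; either result can be assigned to df['C'].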

1 Comment

Beautiful. Thanks a lot!
4

With pd.DataFrame.itertuples
Requires Python 3.6+ (f-strings)

df.assign(C=[f'{a}{b}{i}' for i, a, b in df.itertuples()])

   A  B    C
0  c  d  cd0
1  e  f  ef1
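
The unpacking order i, a, b works because itertuples yields the index first, followed by the column values. A small sketch to make that visible, assuming the question's frame (rows and out are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'A': ['c', 'e'], 'B': ['d', 'f']})

# With name=None, itertuples yields plain tuples: (index, A, B).
rows = list(df.itertuples(name=None))
print(rows)

# The same unpacking order drives the list comprehension.
out = df.assign(C=[f'{a}{b}{i}' for i, a, b in rows])
print(out['C'].tolist())
```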

With pd.Series.str.cat

df.assign(C=df.A.str.cat(df.B).str.cat(df.index.astype(str)))

   A  B    C
0  c  d  cd0
1  e  f  ef1
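
str.cat also accepts a list of others, so the chained calls can be collapsed into one, and a separator can be added with sep. This is only a sketch, assuming the question's frame; idx_str, plain and dashed are names made up for the example, and the index is wrapped in a Series so that everything lines up on df's index.

```python
import pandas as pd

df = pd.DataFrame({'A': ['c', 'e'], 'B': ['d', 'f']})

# Wrap the stringified index in a Series that shares df's index,
# so every piece lines up row by row.
idx_str = pd.Series(df.index.astype(str), index=df.index)

# One call concatenates A, B and the index; sep is optional.
plain = df['A'].str.cat([df['B'], idx_str])
dashed = df['A'].str.cat([df['B'], idx_str], sep='-')

print(plain.tolist())
print(dashed.tolist())
```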

Mish/Mash

from operator import add
from functools import reduce
from itertools import chain

df.assign(C=reduce(add, chain((df[c] for c in df), [df.index.astype(str)])))

   A  B    C
0  c  d  cd0
1  e  f  ef1

Summation

df.assign(C=df.sum(1) + df.index.astype(str))

   A  B    C
0  c  d  cd0
1  e  f  ef1

0
df['C'] = df['A'].astype(str) + df['B'].astype(str) + np.array(list(map(str, df.index.values)))

You access the DataFrame's index with df.index, and .values turns it into a NumPy array. To convert each label to a string (so it can be added to the other columns, which are strings), map str over the values. On Python 3, map returns an iterator, so wrap it in list() before passing it to np.array.

Edit: added .astype(str) to columns A and B, to convert them to strings. If they are already strings, this isn't necessary.
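
Here is a short sketch of the index conversion on its own, assuming the question's frame and Python 3. The names labels, via_map and via_astype are just for illustration; the astype route is the one suggested in the comments.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['c', 'e'], 'B': ['d', 'f']})

# df.index.values is a NumPy array of the index labels (integers here).
labels = df.index.values

# On Python 3, map() is lazy, so collect it into a list before building the array.
via_map = np.array(list(map(str, labels)))

# astype(str) performs the same conversion without leaving pandas.
via_astype = df.index.astype(str)

df['C'] = df['A'].astype(str) + df['B'].astype(str) + via_map
print(df['C'].tolist())
```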

3 Comments

doesn't work for me, unfortunately: I get a TypeError: must be str, not map
Always prefer astype when performing type conversions in numpy/pandas. Also, your code works only in Python 2, unless you collect the output of map into a list.
@Zubo Your columns A and B must not already be string types then. I edited my post with .astype(str) calls to convert them to strings. Sorry for the confusion. Thanks for the tip COLDSPEED.
