Combine text in dataframe python

Question

Suppose I have this DataFrame:

import pandas as pd

df = pd.DataFrame({'col1': ['AC1', 'AC2', 'AC3', 'AC4', 'AC5'],
                   'col2': ['A', 'B', 'B', 'A', 'C'],
                   'col3': ['ABC', 'DEF', 'FGH', 'IJK', 'LMN']})

I want to combine the text in 'col3' for rows that share the same value in 'col2'. The result should look like this:

    col1  col2       col3
0   AC1    A      ABC, IJK
1   AC2    B      DEF, FGH
2   AC3    B      DEF, FGH
3   AC4    A      ABC, IJK
4   AC5    C      LMN

I started by finding the duplicated values in this DataFrame:

col2 = df['col2']
df1 = df[col2.isin(col2[col2.duplicated()])]

Any suggestion what I should do next?

moys · Accepted Answer · 2019-12-11 02:13:42Z

3

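You can use groupby with apply to build one joined string per 'col2' value, then map those strings back onto the rows: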

a = df.groupby('col2').apply(lambda group: ','.join(group['col3']))
df['col3'] = df['col2'].map(a)

Output

print(df)
  col1 col2     col3
0  AC1    A  ABC,IJK
1  AC2    B  DEF,FGH
2  AC3    B  DEF,FGH
3  AC4    A  ABC,IJK
4  AC5    C      LMN
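
The output above joins with ',' while the question asks for ', '. As a possible variant (a sketch, not the answer's own code), the two steps can be collapsed into one by selecting 'col3' first and using groupby(...).transform, which broadcasts each group's joined string back to the original rows. The ', ' separator is chosen here to match the question's expected output.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['AC1', 'AC2', 'AC3', 'AC4', 'AC5'],
                   'col2': ['A', 'B', 'B', 'A', 'C'],
                   'col3': ['ABC', 'DEF', 'FGH', 'IJK', 'LMN']})

# Join col3 within each col2 group; transform returns a result aligned
# with the original index, so it can be assigned straight back.
df['col3'] = df.groupby('col2')['col3'].transform(', '.join)

print(df)
```

Because transform keeps the original index, no separate map step is needed.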


Paul Lo · Accepted Answer · 2019-12-11 02:13:54Z

3

You might want to leverage the groupby and apply functions in pandas:

df.groupby('col2').apply(lambda group: ','.join(group['col3']))
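
On its own, the expression above gives one joined string per distinct 'col2' value, indexed by 'col2', rather than one per original row. A rough sketch of getting back to the original shape follows. It swaps in agg on the selected column in place of apply (a substitution, not this answer's code), and the names joined and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['AC1', 'AC2', 'AC3', 'AC4', 'AC5'],
                   'col2': ['A', 'B', 'B', 'A', 'C'],
                   'col3': ['ABC', 'DEF', 'FGH', 'IJK', 'LMN']})

# One entry per group, indexed by col2 (three entries here: A, B, C).
joined = df.groupby('col2')['col3'].agg(', '.join)

# Turn the col2 index back into a column, then left-merge onto the
# original rows so every row picks up its group's joined string.
result = df.drop(columns='col3').merge(joined.reset_index(),
                                       on='col2', how='left')

print(joined)
print(result)
```

The left merge keeps the row order of the original frame, so result has the same five rows as df.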

edited Dec 11, 2019 at 2:13

answered Dec 11, 2019 at 2:02


5 Comments

adhg Over a year ago

don't forget to close your parenthesis at the end. You may want to add a comma in group: ', '.join... to answer the OP in full according to col3

adhg Over a year ago

great - check your answer, it's a bit off from the OP's request (see the number of rows in your result vs. his/her result)

Paul Lo Over a year ago

k it's comma join

moys Over a year ago

I think map was missing. I have added it in my answer. Up-voted your answer as well.

Paul Lo Over a year ago

@moys Voted yours, mine requires more steps to clean up the index : p
