Suppose I have this DataFrame:
df = pd.DataFrame({'col1': ['AC1', 'AC2', 'AC3', 'AC4', 'AC5'],
'col2': ['A', 'B', 'B', 'A', 'C'],
'col3': ['ABC', 'DEF', 'FGH', 'IJK', 'LMN']})
I want to comnbine text of 'col3' if values in 'col2' are duplicated. The result should be like this:
col1 col2 col3
0 AC1 A ABC, IJK
1 AC2 B DEF, FGH
2 AC3 B DEF, FGH
3 AC4 A ABC, IJK
4 AC5 C LMN
I start this excercise by finding duplicated values in this dataframe:
col2 = df['col2']
df1 = df[col2.isin(col2[col2.duplicated()])]
Any suggestion what I should do next?