How to add string values of columns with a specific condition in a new column

Question

I have a dataframe with a few columns and many rows.

I want to create a new column (C) that concatenates the string values of another column (A) for all rows that share the same value in a third column (B).

In the end, each group of rows with an identical B value should have its own string in C, different from the strings of the other groups.

Something like this pseudo code:

for x in df[B]:
    if (x "is identical to" x "of another row"):
        df[C] = df[C].cat(df[A])

How do I code an algorithm that can do this?

mozway · Accepted Answer · 2022-08-11 13:58:48Z

1

You can use:

df['C'] = df.groupby('B')['A'].transform('_'.join)

Or, if you want to keep only unique values:

df['C'] = df.groupby('B')['A'].transform(lambda x: '_'.join(x.unique()))

output:

        A   B              C
0   First   1    First_Third
1  Second  22  Second_Fourth
2   Third   1    First_Third
3  Fourth  22  Second_Fourth
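For anyone who wants to try this end to end, here is a minimal sketch. The question did not include sample data, so the `df` below is reconstructed from the output table above, and the second frame (`dup`, with the column names `all` and `unique`) is purely hypothetical, made up to show how the two variants differ when A contains repeats.

```python
import pandas as pd

# Sample data reconstructed from the output table (an assumption,
# the question did not include the original frame)
df = pd.DataFrame({
    "A": ["First", "Second", "Third", "Fourth"],
    "B": [1, 22, 1, 22],
})

# transform calls '_'.join once per group and broadcasts the resulting
# string back to every row of that group
df["C"] = df.groupby("B")["A"].transform("_".join)
print(df)

# Hypothetical frame with repeated values in A
dup = pd.DataFrame({"A": ["x", "x", "y", "z"], "B": [1, 1, 1, 2]})
dup["all"] = dup.groupby("B")["A"].transform("_".join)
dup["unique"] = dup.groupby("B")["A"].transform(lambda s: "_".join(s.unique()))
print(dup)
```

`transform` is used rather than `agg` because it returns a result aligned to the original index, so it can be assigned straight back as a column.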



gtomer · Accepted Answer · 2022-08-11 13:58:37Z

0

Try this:

df['C'] = df.groupby('B')['A'].transform(lambda x: '_'.join(x))
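The lambda here does the same thing as passing `'_'.join` directly. Where the lambda form can be handy is when column A is not purely strings: `str.join` raises a `TypeError` on missing values or numbers. A possible sketch, assuming you want to skip missing values and cast everything else to `str` (the sample data is hypothetical):

```python
import pandas as pd

# Hypothetical data: A contains a missing value and a non-string entry
df = pd.DataFrame({
    "A": ["First", None, "Third", 4],
    "B": [1, 22, 1, 22],
})

# Drop missing values and cast the rest to str before joining,
# so that '_'.join only ever sees strings
df["C"] = df.groupby("B")["A"].transform(
    lambda s: "_".join(s.dropna().astype(str))
)
print(df)
```

A group whose A values are all missing ends up with an empty string in C.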

