I have a dataframe with a few columns and many rows.

I want to create a new column (C) that joins the string values of another column (A) for all rows that share the same value in a third column (B).

Every row in a group (rows with an identical B) should end up with the same string in C, and that string should differ from the strings of the other groups.

A       B   New Column C
First   1   First_Third
Second  22  Second_Fourth
Third   1   First_Third
Fourth  22  Second_Fourth

Something like this pseudocode:

for x in df['B']:
    if x "is identical to" x "of another row":
        df['C'] = df['C'].cat(df['A'])

How do I code an algorithm that can do this?

2 Answers


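You can group by B and use transform, which joins the A values of each group and writes the result back to every row of that group: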

df['C'] = df.groupby('B')['A'].transform('_'.join)

Or, if you want each value to appear only once per group:

df['C'] = df.groupby('B')['A'].transform(lambda x: '_'.join(x.unique()))

Output:

        A   B              C
0   First   1    First_Third
1  Second  22  Second_Fourth
2   Third   1    First_Third
3  Fourth  22  Second_Fourth
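
For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies both variants. The column name C_unique is only an illustrative name for the second variant, not something from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": ["First", "Second", "Third", "Fourth"],
    "B": [1, 22, 1, 22],
})

# Join all A values within each B group; transform broadcasts the
# joined string back to every row of that group
df["C"] = df.groupby("B")["A"].transform("_".join)

# Variant: keep each A value only once per group before joining
# (C_unique is a hypothetical column name used for illustration)
df["C_unique"] = df.groupby("B")["A"].transform(lambda s: "_".join(s.unique()))

print(df)
```

With this sample data both columns come out the same, because no value of A repeats within a group.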

Try this:

df['C'] = df.groupby('B')['A'].transform(lambda x: '_'.join(x))
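
One case where the lambda form is handy: str.join raises a TypeError if A contains anything that is not a string (numbers, missing values). The sketch below assumes you would want to skip missing values and convert everything else to text. The sample data is made up for illustration and is not from the question.

```python
import pandas as pd

# Hypothetical data: A mixes strings, a number and a missing value
df = pd.DataFrame({
    "A": ["First", 2, None, "Fourth"],
    "B": [1, 22, 1, 22],
})

# Drop missing values and cast the rest to str before joining,
# so str.join only ever sees strings
df["C"] = df.groupby("B")["A"].transform(
    lambda s: "_".join(s.dropna().astype(str))
)

print(df)
```

Whether dropping missing values is the right behavior depends on your data. Replacing them with a placeholder via fillna would be another option.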
