I have the following pandas DataFrame:

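import pandas as pd

data = {'ID_1': [1, 1, 1, 2, 2, 3, 4, 4, 4, 4],
        'ID_2': ['a', 'b', 'c', 'f', 'g', 'd', 'v', 'x', 'y', 'z']}
df = pd.DataFrame(data)
display(df)  # display() is available in Jupyter/IPython; use print(df) elsewhere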

ID_1    ID_2
1   a
1   b
1   c
2   f
2   g
3   d
4   v
4   x
4   y
4   z

For each ID_1, I need to find every pairwise combination of its ID_2 values (order doesn't matter). For example:

When ID_1 = 1, the combinations are ab, ac, bc. When ID_1 = 2, the combination is fg.

Note that if an ID_1 value occurs fewer than twice, it has no combinations (see ID_1 = 3, for example).

Finally, I need to store the combination results in df2 as follows:

[Image: expected df2]

  • Why isn't b to c in df2? Commented Mar 21, 2022 at 4:18

1 Answer

One way is to use itertools.combinations:

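from itertools import combinations

def comb_df(ser):
    # All unordered pairs of the group's ID_2 values; empty when the group has one row.
    return pd.DataFrame(list(combinations(ser, 2)), columns=["from", "to"])

# Build the pairs per ID_1 group, then flatten the resulting index.
new_df = df.groupby("ID_1")["ID_2"].apply(comb_df).reset_index(drop=True)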

Output:

  from to
0    a  b
1    a  c
2    b  c
3    f  g
4    v  x
5    v  y
6    v  z
7    x  y
8    x  z
9    y  z
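
If you also want to keep track of which ID_1 each pair came from, a variation along these lines may help. It is only a sketch: it rebuilds the sample data from the question, and the extra ID_1 column and the name df2 are assumptions about the desired layout, since the expected output was only posted as an image.

```python
from itertools import combinations

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID_1": [1, 1, 1, 2, 2, 3, 4, 4, 4, 4],
    "ID_2": ["a", "b", "c", "f", "g", "d", "v", "x", "y", "z"],
})

# One row per unordered pair within each ID_1 group.
# combinations(values, 2) yields nothing for a group with a single row,
# so ID_1 = 3 contributes no rows.
rows = [
    (key, left, right)
    for key, values in df.groupby("ID_1")["ID_2"]
    for left, right in combinations(values, 2)
]

# Assumed layout: the group key plus the "from"/"to" columns used above.
df2 = pd.DataFrame(rows, columns=["ID_1", "from", "to"])
print(df2)
```

Building a plain list of tuples and constructing the frame once avoids creating a small DataFrame per group, which may matter when there are many groups.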