0

I have a dataset

Name    Subset    Type    System
A00     IU00-A    OP      A
A00     IT00      PP      A
B01     IT-01A    PP      B
B01     IU        OP      B
B03     IM-09-B   LP      A
B03     IM03A     OP      B
B03     IT-09     OP      A
D09     IT        OP      B
D09     IM        LP      B
D09     IM        OP      A

So here I need to group the Name column such that Subset and Type are similar. We have to only consider the first alphabetical part of the subset column and ignore rest. for eg IM-09-B, IM03A can be considered as IM.

Output needed

Subset Cluster    Type Cluster    Name        System
IU,IT             OP,PP           A00,B01     A,A,B,B
IM,IM,IT          LP, OP, OP      B03, D09    A,B,A,B,B,A

Here the first cluster instance is formed coz IU is OP and IT is PP in both cases, similar for the second instance.

1 Answer 1

1

You could use groupby + agg(join) twice; once to make groups out of "Names"; then again to make groups out of "Subset"-"Type" pairs. Since we're joining strings, we first use sort_values so that when we join later, each group will have the same order of elements.

out = (df.assign(Subset=df['Subset'].str[:2])
       .sort_values(by=df.columns.tolist())
       .groupby('Name').agg(', '.join)
       .add_suffix(' Cluster').reset_index()
       .set_index(['Subset Cluster', 'Type Cluster'])
       .groupby(level=[0,1]).agg(', '.join).reset_index())

Output:

  Subset Cluster Type Cluster      Name
0     IM, IM, IT   LP, OP, OP  B03, D09
1         IT, IU       OP, PP  A00, B01
Sign up to request clarification or add additional context in comments.

4 Comments

Please check the edited question. Another Column 'System' is added.
Yes, thank you very much. stackoverflow.com/q/71719509/17778275 Pl find the question here.
Please check the edited question. Added a 'System' Column, which for each 'Name' are added are mentioned.
@spd this code works on the new sample as well since we're agnostic about "System" here. Please test it and if you have an example where it breaks let me know.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.