I have a dataset:
Name  Subset   Type  System
A00   IU00-A   OP    A
A00   IT00     PP    A
B01   IT-01A   PP    B
B01   IU       OP    B
B03   IM-09-B  LP    A
B03   IM03A    OP    B
B03   IT-09    OP    A
D09   IT       OP    B
D09   IM       LP    B
D09   IM       OP    A
I need to group the values in the Name column so that names with the same Subset and Type combinations end up in one cluster. Only the leading alphabetic part of the Subset column should be considered, and the rest ignored; e.g. IM-09-B and IM03A both count as IM.
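The prefix rule can be sketched with a regular expression. This is a minimal sketch in Python, assuming the prefix is always the leading run of ASCII letters; `subset_prefix` is a hypothetical helper name introduced only for illustration.

```python
import re

def subset_prefix(subset: str) -> str:
    """Return the leading run of letters in a Subset value.

    Assumption: values with no leading letters are returned unchanged.
    """
    match = re.match(r"[A-Za-z]+", subset)
    return match.group(0) if match else subset

# Subset values taken from the dataset above.
for raw in ["IM-09-B", "IM03A", "IT-01A", "IU00-A", "IT"]:
    print(raw, "->", subset_prefix(raw))
```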
Output needed:
Subset Cluster  Type Cluster  Name     System
IU,IT           OP,PP         A00,B01  A,A,B,B
IM,IM,IT        LP,OP,OP      B03,D09  A,B,A,B,B,A
Here the first cluster is formed because IU is OP and IT is PP for both A00 and B01. Similarly, the second cluster is formed because B03 and D09 both have IM as LP, IM as OP, and IT as OP.
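One possible approach, sketched in plain Python with no third-party libraries: build a "signature" for each Name from its (Subset prefix, Type) pairs, then put names with identical signatures in the same cluster. This assumes "similar" means the pairs match exactly, duplicates included. The names `subset_prefix`, `rows_by_name`, `clusters`, and `result` are hypothetical. Because each signature is sorted, the order inside a cluster (e.g. IT,IU rather than IU,IT) may differ from the table above.

```python
import re
from collections import defaultdict

# The dataset from the question as (Name, Subset, Type, System) tuples.
rows = [
    ("A00", "IU00-A", "OP", "A"), ("A00", "IT00", "PP", "A"),
    ("B01", "IT-01A", "PP", "B"), ("B01", "IU", "OP", "B"),
    ("B03", "IM-09-B", "LP", "A"), ("B03", "IM03A", "OP", "B"),
    ("B03", "IT-09", "OP", "A"),
    ("D09", "IT", "OP", "B"), ("D09", "IM", "LP", "B"),
    ("D09", "IM", "OP", "A"),
]

def subset_prefix(subset):
    """Leading run of letters, e.g. 'IM-09-B' -> 'IM' (assumed rule)."""
    match = re.match(r"[A-Za-z]+", subset)
    return match.group(0) if match else subset

# Step 1: collect (prefix, type, system) per Name, keeping row order.
rows_by_name = defaultdict(list)
for name, subset, type_, system in rows:
    rows_by_name[name].append((subset_prefix(subset), type_, system))

# Step 2: names with the same sorted (prefix, type) pairs share a cluster.
clusters = defaultdict(list)
for name, items in rows_by_name.items():
    signature = tuple(sorted((prefix, type_) for prefix, type_, _ in items))
    clusters[signature].append(name)

# Step 3: flatten each cluster into one output row.
result = []
for signature, names in clusters.items():
    result.append({
        "Subset Cluster": ",".join(prefix for prefix, _ in signature),
        "Type Cluster": ",".join(type_ for _, type_ in signature),
        "Name": ",".join(names),
        "System": ",".join(
            system for name in names for _, _, system in rows_by_name[name]
        ),
    })

for row in result:
    print(row)
```

The System column here simply lists each member's systems in original row order, which is how the expected output appears to be built.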