Consider that I have a 2D numpy array where each row represents a unique item and each column within the row represents a label assigned to this item. For example, a 10 x 25 array in this instance would represent 10 items, each of which have up to 25 labels each.
What would be most efficient way to convert this to a dict (or another appropriate datatype, bonus points if it can be sorted by length) that maps labels to the rows indices in which that label occurs? For example, dict[1] would return a list of the row indices that contain 1 as a label.
For example,
Given:
[1, 2, 3]
[1, 0, 0]
[1, 3, 0]
Result:
1: 0, 1, 2 # 1 occurs in rows 0, 1, 2
3: 0, 2 # 3 occurs in rows 0, 2
0: 1, 2 # 0 occurs in rows 1, 2 (0 is padding for lack of labels)
2: 0 # 2 occurs in row 0 only
2: 0because 2 only occurs in the 0th row?