How can I binarize a dataset according to the index? E.g.
        A  B  C
idUser
3       1  1  1
2       0  1  0
4       1  0  0
I have tried pd.get_dummies, but the result is only almost what I need: each row is encoded separately, so idUser 3 appears three times instead of once.
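import pandas as pd

dictio = {'idUser': [3, 3, 3, 2, 4], 'artist': ['A', 'B', 'C', 'B', 'A']}
df = pd.DataFrame(dictio)
df = df.set_index('idUser')
# prefix='' and prefix_sep='' keep the bare artist names as column labels; dtype=int gives 0/1
df_binary = pd.get_dummies(df, columns=['artist'], prefix='', prefix_sep='', dtype=int)
print(df_binary)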
        A  B  C
idUser
3       1  0  0
3       0  1  0
3       0  0  1
2       0  1  0
4       1  0  0
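Does df_binary.groupby('idUser', sort=False).max() solve your question? It groups the rows that share an idUser and takes the column-wise maximum, so a 1 in any of a user's rows survives in the merged row.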
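Put together, the suggestion might look like the sketch below. It reuses the sample data from the question. The name `result` and the arguments `prefix=''`, `prefix_sep=''` and `dtype=int` are my own additions, assumed here so that the columns come out as plain A, B, C with 0/1 values. They are not part of the original snippet.

```python
import pandas as pd

# Sample data from the question, indexed by idUser
df = pd.DataFrame({'idUser': [3, 3, 3, 2, 4],
                   'artist': ['A', 'B', 'C', 'B', 'A']}).set_index('idUser')

# One indicator column per artist.
# Assumption: prefix='' / prefix_sep='' drop the "artist_" prefix,
# and dtype=int yields 0/1 rather than booleans.
df_binary = pd.get_dummies(df, columns=['artist'],
                           prefix='', prefix_sep='', dtype=int)

# Merge rows sharing an idUser: max() is 1 if any of the user's rows has a 1.
# sort=False keeps the users in order of first appearance.
result = df_binary.groupby('idUser', sort=False).max()
print(result)
```

Because the indicators are only 0 or 1, max() acts as a logical OR across each user's rows. Using sum() instead would count how many times each artist appears for the user.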