I have the following DataFrame and list of lists:

import pandas as pd

data = [['tom', 10], ['nick', 15], ['juli', 14],
        ['test', 14], ['test1', 12], ['test1', 14]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
>>> df
    Name  Age
0    tom   10
1   nick   15
2   juli   14
3   test   14
4  test1   12
5  test1   14
index_list = [['test1', 'juli'], ['nick'], ['tom', 'test']]
>>> index_list
[['test1', 'juli'], ['nick'], ['tom', 'test']]

I would like to add a cluster_id column to the DataFrame, where each value is the index of the sublist in index_list that contains the row's Name. The expected output is:

>>> df
    Name  Age cluster_id
0    tom   10          2
1   nick   15          1
2   juli   14          0
3   test   14          2
4  test1   12          0
5  test1   14          0

1 Answer

You can use a dict comprehension to convert index_list into a dictionary that maps each name to the index of its sublist (its cluster id), then map that dictionary onto the "Name" column:

index_dic = {name: i for i, sublist in enumerate(index_list) for name in sublist}
df['cluster_id'] = df['Name'].map(index_dic)

Output:

    Name  Age  cluster_id
0    tom   10           2
1   nick   15           1
2   juli   14           0
3   test   14           2
4  test1   12           0
5  test1   14           0
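
One edge case worth noting: Series.map returns NaN for any name that is not a key in the dictionary, which also turns the column into floats. The sketch below is one possible way to handle that, not part of the original answer. The extra 'alice' row and the -1 sentinel are assumptions made up for illustration.

```python
import pandas as pd

# Same data as in the question, plus one hypothetical row ('alice')
# whose name does not appear anywhere in index_list.
data = [['tom', 10], ['nick', 15], ['juli', 14],
        ['test', 14], ['test1', 12], ['test1', 14],
        ['alice', 11]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
index_list = [['test1', 'juli'], ['nick'], ['tom', 'test']]

# Map each name to the index of the sublist that contains it.
index_dic = {name: i for i, sublist in enumerate(index_list) for name in sublist}

# Names missing from index_dic map to NaN. Here they are replaced with -1,
# an arbitrary sentinel chosen for this sketch, so the column stays integer.
df['cluster_id'] = df['Name'].map(index_dic).fillna(-1).astype(int)

cluster_ids = df['cluster_id'].tolist()
print(df)
```

If a name appeared in more than one sublist, the comprehension would keep the index of the last sublist containing it, since later keys overwrite earlier ones.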