I would like to add a new column to a dataframe.
The dataframe looks like the example below. Each id is the 'parent' of several sub_id values: the group for an id contains the id itself (the row where sub_id equals id) plus some items that belong to it.
An id has no name of its own, but every sub_id has a corresponding name.
I would like to look up the name on the row where sub_id equals id and use it as the name for that id.
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                   'sub_id': [12, 1, 13, 23, 2],
                   'name': ['pear', 'fruit', 'orange', 'cat', 'animal']})
   id  sub_id    name
0   1      12    pear
1   1       1   fruit
2   1      13  orange
3   2      23     cat
4   2       2  animal
I would like to add a column id_name so that the result looks like this:
   id  sub_id    name  id_name
0   1      12    pear    fruit
1   1       1   fruit    fruit
2   1      13  orange    fruit
3   2      23     cat   animal
4   2       2  animal   animal
I am not sure how to do this efficiently. The only approach I can think of is merging the dataframe with itself twice, but I suspect there is a better way.
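
For reference, a merge-based version might look roughly like the sketch below. It assumes every id has exactly one row where sub_id equals id. The names parents and result are just placeholders I made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                   'sub_id': [12, 1, 13, 23, 2],
                   'name': ['pear', 'fruit', 'orange', 'cat', 'animal']})

# Rows where sub_id equals id carry the parent's own name.
# Assumption: there is exactly one such row per id.
parents = df.loc[df['sub_id'] == df['id'], ['id', 'name']]
parents = parents.rename(columns={'name': 'id_name'})

# Left-merge the small lookup table back onto the original frame by id,
# so every row picks up the name of its parent.
result = df.merge(parents, on='id', how='left')
print(result)
```

This does give the id_name column I want on the sample data, but it builds an intermediate frame and goes through a merge, which is the part I would like to avoid if there is a more direct way.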