I have a DataFrame (FinalDF) that looks like this:
id | Movie            | Cast
------------------------------------------
0  | The Dark Knight  | Christopher Nolan
1  | The Dark Knight  | Christian Bale
2  | Pulp Fiction     | Quentin Tarantino
3  | Pulp Fiction     | John Travolta
4  | Schindler’s List | Steven Spielberg
5  | Schindler’s List | Liam Neeson
Movie and cast names are mapped to UUIDs in movie_cast_DF like this:
id | name | uuid
-------------------------
1 | The Dark Knight | m1
2 | Pulp Fiction | m2
3 | Schindler’s List | m3
4 | Christopher Nolan | d1
5 | Christian Bale | a1
6 | Quentin Tarantino | d2
7 | John Travolta | a2
8 | Steven Spielberg | d3
9 | Liam Neeson | a3
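For reproducibility, the two frames above can be built as in the sketch below. It assumes that the id shown for FinalDF is just the default index and that id is an ordinary column in movie_cast_DF; the real data is larger.

```python
import pandas as pd

# Sample rows copied from the tables above.
# Assumption: FinalDF's "id" is the default integer index, not a column.
FinalDF = pd.DataFrame({
    "Movie": ["The Dark Knight", "The Dark Knight", "Pulp Fiction",
              "Pulp Fiction", "Schindler’s List", "Schindler’s List"],
    "Cast": ["Christopher Nolan", "Christian Bale", "Quentin Tarantino",
             "John Travolta", "Steven Spielberg", "Liam Neeson"],
})

# Assumption: "id" is a regular column here; movies and people share one table.
movie_cast_DF = pd.DataFrame({
    "id": list(range(1, 10)),
    "name": ["The Dark Knight", "Pulp Fiction", "Schindler’s List",
             "Christopher Nolan", "Christian Bale", "Quentin Tarantino",
             "John Travolta", "Steven Spielberg", "Liam Neeson"],
    "uuid": ["m1", "m2", "m3", "d1", "a1", "d2", "a2", "d3", "a3"],
})
```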
I need to add the matching UUIDs to FinalDF as two new columns, mid (for Movie) and cid (for Cast), like this:
id | Movie            | Cast              | mid | cid
------------------------------------------------------------------
0  | The Dark Knight  | Christopher Nolan | m1  | d1
1  | The Dark Knight  | Christian Bale    | m1  | a1
2  | Pulp Fiction     | Quentin Tarantino | m2  | d2
3  | Pulp Fiction     | John Travolta     | m2  | a2
4  | Schindler’s List | Steven Spielberg  | m3  | d3
5  | Schindler’s List | Liam Neeson       | m3  | a3
I tried the following approach:
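def getID(x):
    # Return the uuid of the first row whose name contains x
    # (case-insensitive match), or None if nothing matches.
    try:
        return movie_cast_DF[movie_cast_DF['name'].str.contains(x.lower(), case=False)]['uuid'].values[0]
    except Exception:
        # No match (IndexError) or a non-string value in the column
        return None

# Each apply() call scans the whole of movie_cast_DF once per row
FinalDF['mid'] = FinalDF['Movie'].apply(getID)
FinalDF['cid'] = FinalDF['Cast'].apply(getID)
FinalDF.head()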
Is there a more efficient, faster way to do this mapping?
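For comparison, one direction that might work is an exact-match lookup with Series.map, which avoids scanning movie_cast_DF once per row. This is only a sketch under an assumption that may not hold for the real data: names must match exactly apart from letter case, whereas the code above does substring matching. The name lookup is a helper introduced here for illustration.

```python
import pandas as pd

# Sample data from the question (the real frames are larger).
movie_cast_DF = pd.DataFrame({
    "name": ["The Dark Knight", "Pulp Fiction", "Schindler’s List",
             "Christopher Nolan", "Christian Bale", "Quentin Tarantino",
             "John Travolta", "Steven Spielberg", "Liam Neeson"],
    "uuid": ["m1", "m2", "m3", "d1", "a1", "d2", "a2", "d3", "a3"],
})
FinalDF = pd.DataFrame({
    "Movie": ["The Dark Knight", "The Dark Knight", "Pulp Fiction",
              "Pulp Fiction", "Schindler’s List", "Schindler’s List"],
    "Cast": ["Christopher Nolan", "Christian Bale", "Quentin Tarantino",
             "John Travolta", "Steven Spielberg", "Liam Neeson"],
})

# Hypothetical helper: a Series of uuids indexed by lower-cased name.
# drop_duplicates keeps the first uuid per name, mirroring .values[0] above.
lookup = (
    movie_cast_DF.assign(key=movie_cast_DF["name"].str.lower())
    .drop_duplicates("key")
    .set_index("key")["uuid"]
)

# Vectorised lookups; names with no exact match come back as NaN.
FinalDF["mid"] = FinalDF["Movie"].str.lower().map(lookup)
FinalDF["cid"] = FinalDF["Cast"].str.lower().map(lookup)
```

Unlike getID, this returns NaN rather than None for unmatched names and will not match partial names.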