I need to run a script inside a function that uses two DataFrames.
The script works when run on its own, but I can't work out how to write the function when two DataFrames are involved.
Any suggestions?
import pandas as pd
from fuzzywuzzy import fuzz

# Read both sheets from the same workbook
df1 = pd.read_excel('input.xlsx', sheet_name='sheet1')
df2 = pd.read_excel('input.xlsx', sheet_name='sheet2')
# Pair every id_number with every identity_no, then score each pair
cross = df1[['id_number']].merge(df2[['identity_no']], how='cross')
cross['match'] = cross.apply(lambda x: fuzz.ratio(x.id_number, x.identity_no), axis=1)
# Keep the best score per id_number
df1['match_acc'] = df1.id_number.map(cross.groupby('id_number').match.max())
I need to run the script above inside a function.
I tried the code below, but I don't understand how to write a function that works with two DataFrames.
def word(x,y):
    try:
        cross = x[['id_number']].merge(y[['identity_no']], how='cross')
        cross['match'] = cross.apply(lambda x: fuzz.ratio(x.id_number, x.identity_no), axis=1)
        x['match_acc'] = x.id_number.map(cross.groupby('id_number').match.max())
    return ValueError:
        x['status'] = ValueError
    return x
df = df.apply(word, axis=1)
Please suggest.
try / return does not exist; replace it with try / except.
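Beyond the syntax fix, the function probably should not go through df.apply(word, axis=1) at all: apply with axis=1 hands the function one row at a time, while the function needs both whole DataFrames. Here is a rough sketch of how it could look when called directly. The name add_match_acc, the sample data, the str() casts, and the difflib fallback (used only if fuzzywuzzy is not installed) are my own assumptions, not part of the original post. It also assumes pandas 1.2 or later for how='cross'.

```python
import difflib

import pandas as pd

try:
    from fuzzywuzzy import fuzz
    ratio = fuzz.ratio
except ImportError:
    # Assumption: stand-in scorer on a 0-100 scale when fuzzywuzzy is
    # missing. Its scores are not guaranteed to equal fuzz.ratio's.
    def ratio(a, b):
        return round(100 * difflib.SequenceMatcher(None, a, b).ratio())


def add_match_acc(left, right):
    """Return a copy of `left` with the best fuzzy score against `right`.

    Hypothetical helper: takes both DataFrames as ordinary arguments.
    """
    out = left.copy()  # leave the caller's DataFrame untouched
    try:
        # Pair every id_number with every identity_no
        cross = left[['id_number']].merge(right[['identity_no']], how='cross')
        # Score each pair; cast to str in case the ids were read as numbers
        cross['match'] = cross.apply(
            lambda row: ratio(str(row.id_number), str(row.identity_no)),
            axis=1,
        )
        # Best score per id_number, mapped back onto the rows of `out`
        best = cross.groupby('id_number')['match'].max()
        out['match_acc'] = out['id_number'].map(best)
    except (KeyError, ValueError, TypeError) as exc:
        # Record what went wrong instead of storing the exception class
        out['status'] = repr(exc)
    return out


# Made-up sample data standing in for the two Excel sheets
df1 = pd.DataFrame({'id_number': ['A123', 'B456']})
df2 = pd.DataFrame({'identity_no': ['A123', 'B457', 'ZZZZ']})

result = add_match_acc(df1, df2)  # call it directly, no df.apply
print(result)
```

With the real data you would pass the two frames read from input.xlsx in the same way: df1 = add_match_acc(df1, df2).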