Why does difflib.get_close_matches throw a "list index out of range" error when no matches are found in the following example?
from pandas import DataFrame
import difflib
df1 = DataFrame([[1, '034567', 'Foo'],
                 [2, '1cd2346', 'Bar']],
                columns=['ID', 'Unit', 'Name'])
df2 = DataFrame([['SellTEST', '0ab1234567'],
                 ['superVAR', '1ab2345']],
                columns=['Seller', 'Unit'])
df2['Unit'] = df2['Unit'].apply(lambda x: difflib.get_close_matches(x, df1['Unit'])[0])
df1.merge(df2)
I get that the value in df1 is way off, but I wouldn't expect this to error like it does; I would expect it to simply not match.
difflib is returning no close matches, which is an empty list. You then blindly dereference it with [0], assuming there is a match, and there isn't one; that indexing is what raises the "list index out of range" error, not get_close_matches itself. Instead of just indexing with [0], your lambda needs to check the length first. What do you want to be there when there are no matches?
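One way to do that check, as a rough sketch: wrap the lookup in a small helper that falls back to a default when the list is empty. The name closest_match and the choice of None as the fallback are my own assumptions, not anything difflib provides; pick whatever placeholder makes sense for your data.

```python
import difflib

def closest_match(value, candidates, default=None):
    """Return the best close match for value, or default when there is none.

    closest_match is a hypothetical helper, not part of difflib.
    """
    matches = difflib.get_close_matches(value, candidates)
    # get_close_matches returns a (possibly empty) list, best match first,
    # so only index into it when it is non-empty.
    return matches[0] if matches else default

units = ['034567', '1cd2346']  # the df1['Unit'] values from the question

first = closest_match('0ab1234567', units)   # close enough to '034567'
second = closest_match('1ab2345', units)     # nothing above the default cutoff
print(first, second)
```

In your code the lambda would then become lambda x: closest_match(x, df1['Unit']). With None as the fallback, the unmatched row should simply drop out of the default inner merge, since no Unit in df1 is None. If you need looser matching instead, get_close_matches also accepts a cutoff argument (default 0.6) that you can lower.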