I have two lists, actual and predicted. I need to compare them and count the number of fuzzy matches. I say fuzzy because the matching strings will not be exactly identical. I am using SequenceMatcher from the difflib library:
from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()
I can assume that two strings with a similarity ratio above 0.8 (80%) are the same. Example lists:
actual = ["Appl", "Orange", "Ornge", "Peace"]
predicted = ["Red", "Apple", "Green", "Peace", "Orange"]
I need a way to determine that Apple, Peace and Orange in the predicted list have been found in the actual list, so only 3 matches are made, not 5. How do I do this efficiently?
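To make the expected result concrete, here is a minimal sketch of one possible approach, not a definitive answer. The function name fuzzy_matches and its threshold parameter are hypothetical. It assumes a predicted item counts as matched if it scores above the threshold against at least one actual item. It reuses a single SequenceMatcher and checks the cheaper upper bounds real_quick_ratio() and quick_ratio() before calling ratio(), which may help on longer lists:

```python
from difflib import SequenceMatcher

def fuzzy_matches(actual, predicted, threshold=0.8):
    """Return the predicted items that fuzzily match at least one actual item.

    Hypothetical helper: the name and the default threshold are assumptions.
    """
    matcher = SequenceMatcher(None)
    found = []
    for p in predicted:
        # SequenceMatcher caches details about the second sequence,
        # so set it once per predicted item and vary the first sequence.
        matcher.set_seq2(p)
        for a in actual:
            matcher.set_seq1(a)
            # real_quick_ratio() and quick_ratio() are upper bounds on ratio(),
            # so a pair that fails them cannot pass the full comparison.
            if (matcher.real_quick_ratio() > threshold
                    and matcher.quick_ratio() > threshold
                    and matcher.ratio() > threshold):
                found.append(p)
                break  # one match is enough; count each predicted item once
    return found

actual = ["Appl", "Orange", "Ornge", "Peace"]
predicted = ["Red", "Apple", "Green", "Peace", "Orange"]

matches = fuzzy_matches(actual, predicted)
print(matches)       # ['Apple', 'Peace', 'Orange']
print(len(matches))  # 3
```

Because the loop stops at the first hit, "Orange" is counted once even though both "Orange" and "Ornge" in the actual list score above 0.8 against it. The comparison is case-sensitive; lowercasing both lists first would be a separate design choice.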