I'm doing the below in Python
I have a problem that involves checking a large list(size n) of input strings to see if the substrings contain any words from a large dictionary(size m).
I've looked around for efficient algorithms for this problem and found these: https://github.com/laurentluce/python-algorithms/blob/master/algorithms/string_matching.py
the Rabin-Karp and KMP matching algorithms.(note that I've replaced the ord() function for the Rabin-Karp with a dictionary for efficiency)
However, these actually perform slower than using an 'in' operation in Python which uses the Boyer–Moore–Horspool algorithm. I suppose that this is because the contains() method invoked by 'in' was is implemented in C.
How can I override this method with the Rabin-Karp for the string class in Python in C?