I have a regular expression in C# whose pattern is an alternation of 8000+ words (or groups of words), each wrapped in word boundaries, e.g.:
"\\bword1\\b|\\bword2 word3 word4\\b|.......etc"
I am trying to match a word (or group of words) in a string against any word (or group of words) in this expression. It all works fine, except that the operation takes 37 ms on average to complete.
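For reference, here is a stripped-down sketch of roughly what I am doing. The class name, method name, the two placeholder phrases and the sample input are made up for illustration; the real list has 8000+ entries. I use Regex.Escape here as a precaution, which also escapes the spaces inside a phrase but does not change what is matched.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

public static class PhraseMatchRepro
{
    // Placeholder phrases (hypothetical); the real list has 8000+ entries.
    public static readonly string[] Phrases = { "word1", "word2 word3 word4" };

    // Builds a pattern of the form \bword1\b|\bword2 word3 word4\b|...
    public static Regex BuildRegex(string[] phrases)
    {
        string pattern = string.Join(
            "|",
            phrases.Select(p => @"\b" + Regex.Escape(p) + @"\b"));
        return new Regex(pattern);
    }

    public static void Main()
    {
        Regex regex = BuildRegex(Phrases);
        string input = "some text containing word2 word3 word4 somewhere";

        // Time a single match attempt against the input string.
        Stopwatch timer = Stopwatch.StartNew();
        Match match = regex.Match(input);
        timer.Stop();

        Console.WriteLine(match.Success ? match.Value : "(no match)");
        Console.WriteLine($"Elapsed: {timer.Elapsed.TotalMilliseconds} ms");
    }
}
```

With only two phrases this is of course fast; the 37 ms figure is with the full list.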
Interestingly, if I do the same thing using String.IndexOf and some convoluted methods, it runs substantially quicker (though still far too slow), which I find odd.
I am aware of other regular expression engines, in particular Google's RE2, but I am really keen to use C#'s built-in functionality where possible.
If anyone has advice it would be appreciated.