I have a regex that works great (about 500 nanoseconds) when a match is found, but takes a long time (over 3 seconds) when there is no match. I suspect this is because of backtracking. I tried a few options, such as converting .* to (.*)? based on some documentation, but it didn't help.
Input: a very long string, 5k chars in some cases.
Regex to match: .*substring1.*substring2.*
I am pre-compiling the pattern and re-using the matcher. What else can I try?
Here's my code snippet. I will be calling this method with millions of different input strings, but only a handful of regex patterns.
private static HashMap<String, Pattern> patternMap = new HashMap<String, Pattern>();
private static HashMap<String, Matcher> matcherMap = new HashMap<String, Matcher>();
Here's my method:
public static Boolean regex_match(String line, String regex) {
    if (regex == null || line == null) {
        return null;
    }
    if (!patternMap.containsKey(regex)) {
        patternMap.put(regex, Pattern.compile(regex));
        matcherMap.put(regex, patternMap.get(regex).matcher(""));
    }
    return matcherMap.get(regex).reset(line).find(0);
}
substring1[^s]*(?:s(?!ubstring2)[^s]*)*substring2
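To make the pattern on the previous line concrete, here is a minimal sketch of how it might be used with find(), which searches anywhere in the input and so needs no leading or trailing .*. The class name OrderedSubstringMatch and both method names are hypothetical, chosen only for illustration. The containsInOrder helper is a regex-free cross-check I added under the assumption that the two substrings are plain literals; it is not something the original code does. Note that [^s] and indexOf both look across line breaks, whereas . in the original regex does not by default.

```java
import java.util.regex.Pattern;

public class OrderedSubstringMatch {
    // The unrolled pattern, compiled once. Each pass through the group must
    // consume an 's' that does not start "substring2", so the loop cannot
    // match the same text in more than one way.
    private static final Pattern UNROLLED =
            Pattern.compile("substring1[^s]*(?:s(?!ubstring2)[^s]*)*substring2");

    // find() scans for the pattern anywhere in the line.
    public static boolean matchesUnrolled(String line) {
        return UNROLLED.matcher(line).find();
    }

    // Hypothetical regex-free check: first occurrence of `first`,
    // then any occurrence of `second` after it ends.
    public static boolean containsInOrder(String line, String first, String second) {
        int i = line.indexOf(first);
        return i >= 0 && line.indexOf(second, i + first.length()) >= 0;
    }

    public static void main(String[] args) {
        String hit = "aaa substring1 some substring text substring2 zzz";
        String miss = "aaa substring1 some substring text only";
        System.out.println(matchesUnrolled(hit));   // true
        System.out.println(matchesUnrolled(miss));  // false
        System.out.println(containsInOrder(hit, "substring1", "substring2"));  // true
        System.out.println(containsInOrder(miss, "substring1", "substring2")); // false
    }
}
```

If the patterns really are just two literals in order, the indexOf version may well be the simplest option; the regex form is mainly useful when the pieces around it are not fixed strings.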