I would like to match a couple of words in a text. I have the following:
if ( Pattern.matches(".*\\b" + placeSub.toLowerCase() + "\\b" + placeSup.toLowerCase() + "\\b.*", sourceText.toLowerCase()) ) {
    System.out.println( String.format("Matched %s on %s", placeSub, placeSup) );
}
The variables placeSub, placeSup & sourceText are dynamic (runtime).
The code above doesn't work (no match). However, the following matches:
if ( Pattern.matches(".*\\b" + placeSub.toLowerCase() + "\\s" + placeSup.toLowerCase() + "\\b.*", sourceText.toLowerCase()) ) {
    System.out.println( String.format("Matched %s on %s", placeSub, placeSup) );
}
Why is the text able to match \\s and not \\b?
Example input:
placeSub: South
placeSup: Sudan
sourceText: tens of thousands of people have fled fierce fighting in south sudan's northern unity state
With placeSub=South and placeSup=Sudan there can't be just a \b between the two. In the example there is, however, a space, which is why \s matches.

Why is .* there?

South Sudan is just an example. It could be South-Sudan or something else.

/word\b.*\bword/

.* matches anything before and anything after. I could have used ^ & $ but preferred to leave the expression open.
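
To make the point about \b concrete, here is a minimal sketch. It assumes the goal is "the two words next to each other, separated by any non-word characters". \b is a zero-width assertion: it matches a position and consumes nothing, so "\\b" + "sudan" still requires "sudan" to start immediately after "south". The class and method names are made up for illustration, and the \W+ separator is one possible choice, not the only one (the /word\b.*\bword/ form suggested above allows arbitrary text in between). The sketch also swaps Pattern.matches for Matcher.find(), which makes the surrounding .* unnecessary, and uses Pattern.quote so that runtime values containing regex metacharacters are treated literally.

```java
import java.util.regex.Pattern;

public class WordBoundaryDemo {

    // Original approach: only \b between the two words.
    // \b consumes no characters, so the space in "south sudan" is never matched.
    static boolean boundaryOnly(String sub, String sup, String text) {
        String regex = "\\b" + Pattern.quote(sub) + "\\b" + Pattern.quote(sup) + "\\b";
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text).find();
    }

    // Assumed fix: require one or more non-word characters (space, hyphen, ...)
    // between the two words.
    static boolean withSeparator(String sub, String sup, String text) {
        String regex = "\\b" + Pattern.quote(sub) + "\\W+" + Pattern.quote(sup) + "\\b";
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "tens of thousands of people have fled fierce fighting "
                    + "in south sudan's northern unity state";
        System.out.println(boundaryOnly("South", "Sudan", text));   // false
        System.out.println(withSeparator("South", "Sudan", text));  // true
        System.out.println(withSeparator("South", "Sudan", "fighting in South-Sudan today")); // true
    }
}
```

Pattern.CASE_INSENSITIVE replaces the repeated toLowerCase() calls. If the separator should be restricted to whitespace only, \s+ can be used in place of \W+.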