I have some text like:
ab1ab2ab3ab4cd
Can one create a Java regular expression to obtain all substrings that start with "ab" and end with "cd"? E.g.:
ab1ab2ab3ab4cd
ab2ab3ab4cd
ab3ab4cd
ab4cd
Thanks
The regex (?=(ab.*cd)) captures each such match in group 1, as you can see:
import java.util.regex.*;

public class Main {
    public static void main(String[] args) throws Exception {
        Matcher m = Pattern.compile("(?=(ab.*cd))").matcher("ab1ab2ab3ab4cd");
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}
which produces:
ab1ab2ab3ab4cd
ab2ab3ab4cd
ab3ab4cd
ab4cd
You need the lookahead, (?= ... ), otherwise you'll get just one match. Note that this regex will fail to produce the desired results if there is more than one "cd" in your string, because the greedy .* always runs to the last one. In that case, you'll have to resort to some manual string algorithm.
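For the case with several "cd"s, a manual approach could look something like the sketch below. It is only one possible way to do it, not part of the regex solution above: the class name AllSubstrings and the helper between are made up for illustration, and it assumes you want every (start, end) pairing, including ones where nothing sits between "ab" and "cd".

```java
import java.util.ArrayList;
import java.util.List;

class AllSubstrings {
    // Hypothetical helper: collects every substring of s that starts with
    // prefix and ends with suffix, using plain indexOf scans.
    static List<String> between(String s, String prefix, String suffix) {
        List<String> result = new ArrayList<>();
        // Try every occurrence of the prefix as a starting point.
        for (int i = s.indexOf(prefix); i >= 0; i = s.indexOf(prefix, i + 1)) {
            // Pair it with every suffix occurrence that begins at or after the end of the prefix.
            int from = i + prefix.length();
            for (int j = s.indexOf(suffix, from); j >= 0; j = s.indexOf(suffix, j + 1)) {
                result.add(s.substring(i, j + suffix.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two "cd"s: the lookahead regex with a greedy .* would miss "ab1cd".
        for (String match : between("ab1cd2ab3cd", "ab", "cd")) {
            System.out.println(match);
        }
    }
}
```

For "ab1cd2ab3cd" this prints ab1cd, ab1cd2ab3cd and ab3cd, one per line. The nested scan is quadratic in the number of occurrences, which should be fine for short inputs.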
Looks like you want either ab\w+?cd or \bab\w+?cd\b
Alternatively, call find with an offset just beyond the start of the last match, but Bart's solution is cleaner.
/^ab[a-z0-9]+cd$/gm
If only a, b, c and the digits 0-9 can appear in the middle, as in the examples:
/^ab[a-c\d]+cd$/gm
See it in action: http://regexr.com?2tpdu
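Since the question asks about Java, here is a rough sketch of how the /.../gm form above might be written there. The class name WholeLineMatches and the method matchingLines are invented for the example. Note that, being anchored with ^ and $, this pattern matches whole lines only, so it yields one match per line rather than the overlapping substrings from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeLineMatches {
    // Java counterpart of /^ab[a-c\d]+cd$/gm:
    // Pattern.MULTILINE makes ^ and $ match at line boundaries (the m flag).
    static final Pattern LINE = Pattern.compile("^ab[a-c\\d]+cd$", Pattern.MULTILINE);

    // Hypothetical helper: returns every line of text that matches as a whole.
    static List<String> matchingLines(String text) {
        List<String> lines = new ArrayList<>();
        Matcher m = LINE.matcher(text);
        while (m.find()) { // calling find() repeatedly plays the role of the g flag
            lines.add(m.group());
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "ab1ab2ab3ab4cd\nxy1cd\nab4cd";
        System.out.println(matchingLines(text));
    }
}
```

With the sample text this prints [ab1ab2ab3ab4cd, ab4cd]; the middle line is skipped because it does not start with "ab".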