
I need to extract every group in a string that matches a given pattern.

An example string: <XmlLrvs>FIRST</XmlLrvs><XmlLrvs>SECOND</XmlLrvs><XmlLrvs>Third</XmlLrvs>

Each group should begin with <XmlLrvs> and end with </XmlLrvs>. Here is a snippet of my code:

String patternStr = "(<XmlLrvs>.+?</XmlLrvs>)+";

// Compile and use regular expression
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(text);
matcher.matches();

// Get all groups for this match
for (int i = 1; i<=matcher.groupCount(); i++) {
   System.out.println(matcher.group(i));
}

The output is <XmlLrvs>Third</XmlLrvs>. I expected the first and second groups as well, but they aren't being captured. Can anyone assist?

2 Answers


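You are iterating over the groups when you should be iterating over the matches. groupCount() returns the number of capturing groups in the pattern (one here), not the number of times a group matched, and a repeated group keeps only its last repetition. The matches() method tries to match the entire input against the pattern once. What you want is the find() method, which locates each successive match.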

Change

matcher.matches();

for (int i = 1; i<=matcher.groupCount(); i++) {
    System.out.println(matcher.group(i));
}

to

while (matcher.find()) {
    System.out.println(matcher.group(1));
}
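
To make the groups-versus-matches distinction concrete, here is a minimal sketch using the sample string from the question. The class and method names (GroupCountDemo, capturingGroups, lastRepetition) are invented for illustration. It shows that groupCount() reports the number of capturing groups in the pattern, and that a repeated group retains only its final repetition once matches() succeeds.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the original post.
class GroupCountDemo {
    static final String TEXT =
        "<XmlLrvs>FIRST</XmlLrvs><XmlLrvs>SECOND</XmlLrvs><XmlLrvs>Third</XmlLrvs>";
    static final Pattern REPEATED = Pattern.compile("(<XmlLrvs>.+?</XmlLrvs>)+");

    // Number of capturing groups in the pattern, independent of the input.
    static int capturingGroups() {
        return REPEATED.matcher(TEXT).groupCount();
    }

    // After matches() succeeds, group 1 holds only the last repetition.
    static String lastRepetition() {
        Matcher m = REPEATED.matcher(TEXT);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(capturingGroups()); // 1
        System.out.println(lastRepetition());  // <XmlLrvs>Third</XmlLrvs>
    }
}
```

This is why the loop in the question prints a single line: there is only one group to iterate over, and it has been overwritten twice by the time the match completes.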

4 Comments

Note that the + in the regex needs to be removed, or everything will be matched at once, and not in three iterations.
I don't agree with that; the .+? is a non-greedy quantifier. But I haven't tested it.
Removing the + at the tail of the expression and using the while loop suggested did the job. Thanks.
@molf: Right you are, didn't see that!
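
The effect of the trailing + discussed in these comments can be checked with a short sketch. The helper findAll and the class name TrailingPlusDemo are hypothetical. The lazy .+? only limits what a single repetition consumes; the outer + is still greedy, so with it in place the first find() call swallows all three elements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the pattern with and without the trailing +.
class TrailingPlusDemo {
    static final String TEXT =
        "<XmlLrvs>FIRST</XmlLrvs><XmlLrvs>SECOND</XmlLrvs><XmlLrvs>Third</XmlLrvs>";

    // Collects group 1 of every successive match that find() locates.
    static List<String> findAll(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group(1));
        }
        return hits;
    }

    public static void main(String[] args) {
        // With the +: one match spanning the whole input, group 1 is the last element.
        System.out.println(findAll("(<XmlLrvs>.+?</XmlLrvs>)+", TEXT).size()); // 1
        // Without the +: one match per element.
        System.out.println(findAll("(<XmlLrvs>.+?</XmlLrvs>)", TEXT).size());  // 3
    }
}
```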

Try this:
String patternStr = "<XmlLrvs>(.*?)</XmlLrvs>";
String text = "<XmlLrvs>FIRST</XmlLrvs><XmlLrvs>SECOND</XmlLrvs><XmlLrvs>Third</XmlLrvs>";
Pattern pattern = Pattern.compile(patternStr);

Matcher matcher = pattern.matcher(text);

while (matcher.find()) {
    System.out.println(matcher.group(1));
}

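The output is FIRST, SECOND and Third, each on its own line. Here group(1) is only the text between the tags, because the parentheses no longer enclose the tags themselves.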
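
One difference between this pattern and the one in the question is .*? versus .+?. The sketch below illustrates it on a made-up input containing an empty element; the input string, the class name EmptyElementDemo and the helper innerTexts are assumptions for illustration, not part of the original question. Because .+? must consume at least one character, it runs past the empty element's closing tag, whereas .*? yields an empty string for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the sample input is invented to show the edge case.
class EmptyElementDemo {
    static final String WITH_EMPTY = "<XmlLrvs></XmlLrvs><XmlLrvs>A</XmlLrvs>";

    // Returns the text captured by group 1 for every match of the given regex.
    static List<String> innerTexts(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // .*? allows an empty body: two matches, "" and "A".
        System.out.println(innerTexts("<XmlLrvs>(.*?)</XmlLrvs>", WITH_EMPTY).size());
        // .+? needs at least one character: a single match that spans both elements.
        System.out.println(innerTexts("<XmlLrvs>(.+?)</XmlLrvs>", WITH_EMPTY).size());
    }
}
```

If empty elements cannot occur in the input, the two quantifiers behave the same for this task.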
