@Hesham Attia's answer is simple enough to resolve your problem; here is a bit more explanation of how it works differently from your original pattern.
Let's add the group index i to the output of the code:
public static void main(String[] args) {
    String word = " Some random mobile numbers 0546 105 610, 451 518 9675, 54 67892 541";
    // Strip all whitespace so each number becomes one contiguous run of 10 digits.
    word = word.replaceAll("\\s+", "");
    Pattern pat = Pattern.compile("(\\d{10})");
    Matcher mat = pat.matcher(word);
    while (mat.find()) {
        // Print every capturing group of the current match together with its index.
        for (int i = 1; i <= mat.groupCount(); i++) {
            System.out.println("Group-" + i + ": " + mat.group(i));
        }
    }
}
and you'll get the result:
Group-1: 0546105610
Group-1: 4515189675
Group-1: 5467892541
And the result of your pattern is:
Group-1: 0546105610
Group-2: 4515189675
Group-3: 5467892541
In fact, the code above with the new pattern "(\\d{10})" is equivalent to the following:
public static void main(String[] args) {
    String word = " Some random mobile numbers 0546 105 610, 451 518 9675, 54 67892 541";
    word = word.replaceAll("\\s+", "");
    // No capturing group here: the whole match (group 0) is the number.
    Pattern pat = Pattern.compile("\\d{10}");
    Matcher mat = pat.matcher(word);
    while (mat.find()) {
        System.out.println(mat.group());
    }
}
If you refer to the Javadoc of Matcher.find(), Matcher.group(), and Matcher.groupCount(), you'll see that Matcher.find() tries to find the next substring that matches the pattern, Matcher.group() returns the substring matched by the previous match, and Matcher.groupCount() counts only the capturing groups specified in your pattern, not the entire match (which is group 0).
Simply put, the regex engine walks through your pattern against the input and, in greedy mode, tries to match as much as possible. Now let's look at the differences between the patterns:
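To make the relationship between find(), group(), and groupCount() concrete, here is a minimal sketch. The class name GroupDemo, the helper describe, and the shortened input string are made up for illustration; only the java.util.regex calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Hypothetical helper: returns one line per successful find(),
    // showing the group count, group 0, and group 1 when it exists.
    static List<String> describe(String regex, String input) {
        Matcher mat = Pattern.compile(regex).matcher(input);
        List<String> lines = new ArrayList<>();
        while (mat.find()) {
            StringBuilder sb = new StringBuilder();
            sb.append("groupCount=").append(mat.groupCount());
            sb.append(" group()=").append(mat.group());
            sb.append(" group(0)=").append(mat.group(0));
            if (mat.groupCount() >= 1) {
                sb.append(" group(1)=").append(mat.group(1));
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String input = "0546105610,4515189675"; // shortened sample input
        // With a capturing group: groupCount() is 1, and group(1) equals group(0).
        describe("(\\d{10})", input).forEach(System.out::println);
        // Without a capturing group: groupCount() is 0, only group(0) exists.
        describe("\\d{10}", input).forEach(System.out::println);
    }
}
```

Note that group() and group(0) always return the same text, and groupCount() stays the same for every match because it depends only on the pattern.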
Your original pattern: ".*(\\d{10}+).*"+".*(\\d{10}+).*"+".*(\\d{10}+).*" and why you need to repeat it three times
If only ".*(\\d{10}+).*" is given, the pattern matches the whole string. Because the leading .* is greedy, it takes as much as it can, so the matching parts are:
- "Somerandommobilenumbers0546105610,4515189675," matches the leading .*
- "5467892541" matches \\d{10}+ and goes to group 1
- the empty string matches the trailing .*
The first attempt has already consumed the entire string, so there is nothing left for the pattern to match again and no way to extract the other two numbers. That is why you need to repeat the pattern, putting each number into a separate group.
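A quick way to check this behaviour is to count how often find() succeeds and to look at the captured groups. The following is only a sketch: the class GreedyDemo and the helpers countMatches and groupsOfFirstMatch are hypothetical names, and INPUT is the question's string with whitespace already removed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // The sample string after word.replaceAll("\\s+", "").
    static final String INPUT = "Somerandommobilenumbers0546105610,4515189675,5467892541";

    // Counts how many times find() succeeds for the given regex on INPUT.
    static int countMatches(String regex) {
        Matcher mat = Pattern.compile(regex).matcher(INPUT);
        int n = 0;
        while (mat.find()) {
            n++;
        }
        return n;
    }

    // Returns all capturing groups (1..groupCount) of the first match, or an empty list.
    static List<String> groupsOfFirstMatch(String regex) {
        Matcher mat = Pattern.compile(regex).matcher(INPUT);
        List<String> groups = new ArrayList<>();
        if (mat.find()) {
            for (int i = 1; i <= mat.groupCount(); i++) {
                groups.add(mat.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String single = ".*(\\d{10}+).*";
        String tripled = single + single + single;
        System.out.println(countMatches(single));         // 1
        System.out.println(groupsOfFirstMatch(single));   // [5467892541]
        System.out.println(countMatches(tripled));        // 1
        System.out.println(groupsOfFirstMatch(tripled));  // [0546105610, 4515189675, 5467892541]
        System.out.println(countMatches("(\\d{10})"));    // 3
    }
}
```

The single pattern matches once and captures only one number (the last one, since the leading .* is greedy), while the tripled pattern still matches once but fills three groups. The short pattern "(\\d{10})" instead matches three separate times.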
Pattern "(\\d{10})":
It matches one number sequence each time you call mat.find(), puts it into group 1, and returns. You can then extract the result from group 1, which is why the group index is always 1.
Pattern "\\d{10}":
The same as the pattern "(\\d{10})", except that it does not put the match into group 1. You get the result directly from mat.group(), which is actually group 0.