1

I am trying to apply the below pattern:

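import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile(".*?");
Matcher m = p.matcher("RAJ");
StringBuffer sb = new StringBuffer();
while (m.find()) {
    // Replace each match with "L"; text between matches is copied unchanged.
    m.appendReplacement(sb, "L");
}
m.appendTail(sb);
System.out.println(sb);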

Expected output: LLL
Actual output: LRLALJL

Does the dot (.) in the above regex match the positions between the characters? If not, why do I get the above output?

4 Answers

5

The .*? matches any number of characters, but as few as are needed for the whole regex to match (the ? makes the * reluctant, also known as lazy). Since nothing follows it in the regex, it always matches the empty string (that is, a position between characters).

If you want at least a single character to be matched try .+?. Note that this is the same as just . if there's nothing else after it in the regex.
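To see the empty matches directly, here is a minimal sketch that records where each match starts and ends. The class name EmptyMatchDemo and the helper matchSpans are made up for illustration; only the standard java.util.regex API is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // Hypothetical helper: returns "start-end" for every match of regex in input.
    static List<String> matchSpans(String regex, String input) {
        List<String> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            spans.add(m.start() + "-" + m.end());
        }
        return spans;
    }

    public static void main(String[] args) {
        // Reluctant .*? finds four zero-length matches, one at each position.
        System.out.println(matchSpans(".*?", "RAJ")); // [0-0, 1-1, 2-2, 3-3]
        // A plain dot consumes exactly one character per match.
        System.out.println(matchSpans(".", "RAJ"));   // [0-1, 1-2, 2-3]
    }
}
```

Each zero-length match gets an "L" inserted at its position while the original characters are copied through untouched, which is where LRLALJL comes from.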


1 Comment

+1, but I would emphasize that reluctance vs. greediness is not the source of the problem--it just makes the symptoms more interesting. :P If you were to change the reluctant .*? to a greedy .* you'd get LL as a result, and a greedy .+ would give you simply L. As you pointed out, .+? yields the correct output, but only because it acts like .. The real solution is to get rid of the quantifier, as @rmunoz demonstrated.
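A quick way to check the results mentioned in this comment is to run the question's loop with each pattern. This is only a sketch; the class QuantifierComparison and the method replaceEachMatch are hypothetical names wrapping the same appendReplacement/appendTail loop as in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierComparison {
    // Same loop as in the question, parameterized by the regex.
    static String replaceEachMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, "L");
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceEachMatch(".*?", "RAJ")); // LRLALJL
        System.out.println(replaceEachMatch(".*", "RAJ"));  // LL
        System.out.println(replaceEachMatch(".+", "RAJ"));  // L
        System.out.println(replaceEachMatch(".+?", "RAJ")); // LLL
        System.out.println(replaceEachMatch(".", "RAJ"));   // LLL
    }
}
```

The greedy .* yields LL because it first matches all of "RAJ" and then matches the empty string at the end of the input.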
3

You can get it doing this:

String s = "RAJ";
s = s.replaceAll(".", "L");
System.out.println(s);

You can also do it with a Matcher and its find() method, but String.replaceAll() accepts a regex directly, so it is simpler here.
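As a rough illustration that the fix lies in the pattern rather than in the API, the sketch below runs both String.replaceAll and Matcher.replaceAll with the plain dot and with the question's .*?. The class name ReplaceAllDemo and its two helper methods are invented for this example.

```java
import java.util.regex.Pattern;

class ReplaceAllDemo {
    // Replacement through String.replaceAll.
    static String viaString(String regex, String input) {
        return input.replaceAll(regex, "L");
    }

    // Replacement through Matcher.replaceAll.
    static String viaMatcher(String regex, String input) {
        return Pattern.compile(regex).matcher(input).replaceAll("L");
    }

    public static void main(String[] args) {
        System.out.println(viaString(".", "RAJ"));    // LLL
        System.out.println(viaMatcher(".", "RAJ"));   // LLL
        // The original pattern still produces the unwanted output.
        System.out.println(viaString(".*?", "RAJ"));  // LRLALJL
        System.out.println(viaMatcher(".*?", "RAJ")); // LRLALJL
    }
}
```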

2 Comments

+1 for the only (so far) correct solution (although @Joachim did point out that .+? is effectively the same as .) and yes, replaceAll() (found in the String class as well as in Matcher) does the same thing as the code in the question.
Thanks, it was a simpler approach.
2

It is not that . matches between the characters, but that * means 0 or more and the ? means as few as possible.
So "Zero or more things, and as few of them as possible" will always match Zero things, as that is the fewest possible, if it's not followed by something else the expression is looking for.

.{1} would result in an output of LLL as it matches anything once.

1 Comment

The . alone will match once--or more precisely, it will consume exactly one character. The {1} pseudo-quantifier has no effect.
1

The * in your regex .*? means zero or more repetitions. If you want to match at least a single character, use the regex .+? instead.
