
I have a simple regexp question. I have the following multiline string:

description: line1\r\nline2\r\n...

I am trying to capture all the lines that come after description:. I used the following regexp (and a few others):

description: ((.*\r\n){1,})

...without any success. I then found that Sun's regex implementation has a known 'Regexp StackOverflow' bug (closed as won't fix), see Bug #5050507. Can anyone suggest a pattern that works around this bug? Please note that the total length of the lines must exceed 818 bytes!

    Try description:\s((?:[^\r\n]*+\r\n)++) Commented Sep 19, 2010 at 17:47

2 Answers


Since you want to match everything that follows the text description:, you can simply allow the dot to match newlines with Pattern.DOTALL:

description:\s(.*)

So, in Java:

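// Needs java.util.regex.Pattern and java.util.regex.Matcher; subjectString is the input text
Pattern regex = Pattern.compile("description:\\s(.*)", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(subjectString);
String resultString = null; // stays null if there is no match
if (regexMatcher.find()) {
    resultString = regexMatcher.group(1);
}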

The only semantic difference from your regex (apart from the fact that it won't blow your stack) is that it also matches when whatever follows description: contains no newline. Also, your regex will not match the last line of the file unless it ends in a newline; mine will. Which behaviour is preferable is your decision.
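To make that difference concrete, here is a small sketch that runs both patterns on short inputs. The class name DotallComparison, the helper firstGroup, and the sample strings are made up for illustration; the inputs are kept tiny on purpose so the original pattern stays well away from the stack problem.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to compare the two patterns.
final class DotallComparison {
    // Pattern from the question: one or more CRLF-terminated lines.
    static final Pattern ORIGINAL = Pattern.compile("description: ((.*\r\n){1,})");
    // DOTALL variant: everything after "description:" plus one whitespace character.
    static final Pattern DOTALL = Pattern.compile("description:\\s(.*)", Pattern.DOTALL);

    // Returns group 1 of the first match, or null if the pattern does not match.
    static String firstGroup(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String unterminatedLastLine = "description: line1\r\nline2";
        String noNewlineAtAll = "description: line1";

        // The original pattern stops after the last CRLF; the DOTALL one keeps "line2" too.
        System.out.println(firstGroup(ORIGINAL, unterminatedLastLine).length());
        System.out.println(firstGroup(DOTALL, unterminatedLastLine).length());

        // Without any CRLF the original pattern finds nothing; the DOTALL one still matches.
        System.out.println(firstGroup(ORIGINAL, noNewlineAtAll));
        System.out.println(firstGroup(DOTALL, noNewlineAtAll));
    }
}
```

If dropping an unterminated final line is what you need, the DOTALL version is not a drop-in replacement.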

Of course, your functionality could be emulated like this (still with Pattern.DOTALL):

description:\s(.*\r\n)

but I doubt that that's really what you want. Or is it?
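As a rough check of that emulation, the following sketch applies the pattern with Pattern.DOTALL to a short sample. The class name EmulatedPattern, the method captured, and the sample input are assumptions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class wrapping the emulating pattern.
final class EmulatedPattern {
    // With DOTALL, the greedy .* runs to the end of the input and then
    // backs off to the last \r\n, so the capture ends at the final CRLF.
    static final Pattern EMULATED = Pattern.compile("description:\\s(.*\\r\\n)", Pattern.DOTALL);

    // Returns the captured text, or null if there is no match.
    static String captured(String input) {
        Matcher m = EMULATED.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // "line3" has no trailing CRLF, so it is left out of the capture.
        String result = captured("description: line1\r\nline2\r\nline3");
        System.out.println(result.endsWith("line2\r\n"));
        // No CRLF at all: the pattern does not match.
        System.out.println(captured("description: line1"));
    }
}
```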


2 Comments

I think the OP took that "818" number from the bug report he cited--i.e., he's just saying the strings he's working with will always be long enough to trigger this behavior.
@Alan Moore: Oh, OK. That wasn't clear at all from his question, though :)

I can reproduce the error:

// Build a long input: 1000 short lines, each terminated by \r\n
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 1000; ++i)
{
    sb.append("j\r\n");
}
String s = "description: " + sb.toString();
// Pattern from the question (the one that triggered the error here)
Pattern pattern = Pattern.compile("description: ((.*\r\n){1,})");
// Possessive alternative:
//Pattern pattern = Pattern.compile("description: ((?:.*\r\n)++)");

Matcher matcher = pattern.matcher(s);
boolean b = matcher.find();
if (b) {
    System.out.println(matcher.group(1));
}

The quantifier {1,} is the same as +, so you should use + instead, but this still fails. To fix it you can (as Bat K. points out) change the + to ++, making it possessive, which disables backtracking and prevents the stack overflow.
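For completeness, here is a self-contained sketch of the possessive version run against a much larger input. The class name PossessiveDemo, the methods buildInput and extract, and the line count of 100000 are arbitrary choices for illustration, not something taken from the bug report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the possessive pattern.
final class PossessiveDemo {
    // Non-capturing inner group with a possessive ++ quantifier.
    static final Pattern POSSESSIVE = Pattern.compile("description: ((?:.*\r\n)++)");

    // Builds "description: " followed by lineCount lines of the form "j\r\n".
    static String buildInput(int lineCount) {
        StringBuilder sb = new StringBuilder("description: ");
        for (int i = 0; i < lineCount; ++i) {
            sb.append("j\r\n");
        }
        return sb.toString();
    }

    // Returns everything captured after "description: ", or null if there is no match.
    static String extract(String input) {
        Matcher m = POSSESSIVE.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String captured = extract(buildInput(100000));
        // Each line contributes 3 characters: 'j', '\r', '\n'.
        System.out.println(captured.length());
    }
}
```

Note that the inner group is non-capturing here, so only group 1 (the whole block of lines) is available afterwards.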
