
Suppose I have a block of text, say HTML (though it doesn't have to be):

</TD> 
<TD CLASS='statusEven'><TABLE BORDER=0 WIDTH='100%' CELLSPACING=0 CELLPADDING=0><TR><TD         ALIGN=LEFT><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0> 
<TR> 
<TD ALIGN=LEFT valign=center CLASS='statusEven'><A HREF='extinfo.cgi?    type=2&host=localhost&service=Current+Load'>Current Load</A></TD></TR> 
</TABLE> 
</TD> 
<TD ALIGN=RIGHT CLASS='statusEven'> 
<TABLE BORDER=0 cellspacing=0 cellpadding=0> 
<TR> 
</TR> 
</TABLE> 
</TD> 
</TR></TABLE></TD> 
<TD CLASS='statusOK'>OK</TD> 
<TD CLASS='statusEven' nowrap>08-04-2011 22:07:00</TD> 
<TD CLASS='statusEven' nowrap>28d 13h 18m 11s</TD> 
<TD CLASS='statusEven'>1/1</TD> 
<TD CLASS='statusEven' valign='center'>OK &#45; load average&#58; 0&#46;01&#44; 0&#46;04&#44; 0&#46;05&nbsp;</TD> 

and I want to grab everything between two markers, where the result will probably span multiple lines. How would I do that?

Here's what I have so far:

    Pattern p = Pattern.compile("extinfo(.*)load average");
    Matcher m = p.matcher(this.resultHTML);

    if(m.find())
    {
         return m.group(1);
    }

1 Answer

Use the (?s) switch:

    Pattern p = Pattern.compile("(?s)extinfo(.*?)load average");

This switch turns on "dot matches newline" (DOTALL) mode for the remainder of the regular expression, which means the whole input is essentially treated as one line: line terminators are just another character that . can match.

Without this switch, . does not match line terminators, so a pattern like .* can't match across a newline boundary.
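To make the difference concrete, here is a minimal sketch that runs the same pattern with no flag, with the inline (?s) switch, and with Pattern.DOTALL passed to compile. The class name DotAllDemo, the helper between, and the shortened sample string are made up for illustration; only Pattern, Matcher, and Pattern.DOTALL come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotAllDemo {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    static String between(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the HTML in the question; note the newline in the middle.
        String html = "extinfo.cgi?type=2'>Current Load</A>\n<TD>OK - load average: 0.01</TD>";

        // No flag: '.' stops at the newline, so the two markers are never joined.
        Pattern plain = Pattern.compile("extinfo(.*?)load average");
        // Inline switch: DOTALL is enabled from (?s) onward.
        Pattern inline = Pattern.compile("(?s)extinfo(.*?)load average");
        // Same behaviour, requested through the flag argument instead.
        Pattern flagged = Pattern.compile("extinfo(.*?)load average", Pattern.DOTALL);

        System.out.println("plain:   " + between(plain, html));
        System.out.println("inline:  " + between(inline, html));
        System.out.println("flagged: " + between(flagged, html));
    }
}
```

With this input the unflagged pattern finds nothing, while the two DOTALL variants capture the same multi-line text.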

Also, your regex was "greedy", so I added ? after the * inside the capture group to make it "not greedy" (reluctant), which means it will capture just enough to make the match, but no more.
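A small sketch of greedy versus reluctant matching, assuming an input where the closing marker appears more than once. The class GreedyDemo, the helper firstGroup, and the sample string are hypothetical and exist only to show the effect.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Hypothetical helper: compiles the regex and returns group 1 of the first match, or null.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The closing marker "><TABLE" occurs twice in this made-up input.
        String input = "<TD CLASS='a'><TABLE>one</TABLE><TD CLASS='b'><TABLE>two</TABLE>";

        // Greedy: .* grabs as much as possible, so the match runs to the LAST "><TABLE".
        String greedy = firstGroup("(?s)<TD CLASS(.*)><TABLE", input);
        // Reluctant: .*? grabs as little as possible, so the match stops at the FIRST "><TABLE".
        String reluctant = firstGroup("(?s)<TD CLASS(.*?)><TABLE", input);

        System.out.println("greedy:    " + greedy);     // ='a'><TABLE>one</TABLE><TD CLASS='b'
        System.out.println("reluctant: " + reluctant);  // ='a'
    }
}
```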


4 Comments

OK cool. How do I stop it from being greedy? Pattern p = Pattern.compile("(?s)<TD ALIGN=LEFT valign=center CLASS(.*)><TABLE"); isn't stopping at the first "><TABLE ", it's going to the end.
I edited the regex to not be greedy - just add ? - see answer
You can also use the Pattern.DOTALL switch as the second argument to compile. I didn't know you could specify the switch in the pattern itself; thanks for that.
Thanks Bohemian. I had my ? on the outside of my parenths. :)
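For anyone who hits the same placement mistake, this sketch illustrates why the position of the ? matters: outside the parentheses it makes the whole group optional, and the .* inside stays greedy. The class QuantifierPlacementDemo, the helper capture, and the "start"/"end" markers are illustrative assumptions, not taken from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierPlacementDemo {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "start A end B end";

        // (.*)? : an optional group whose contents are still greedy.
        System.out.println("[" + capture("start(.*)?end", input) + "]");  // [ A end B ]
        // (.*?) : a reluctant quantifier inside the group.
        System.out.println("[" + capture("start(.*?)end", input) + "]");  // [ A ]
    }
}
```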
