Java regexp complicated pattern

Question

I have a string like this (made from HTML source code):

<tr>
  <td>
    <tr>First</tr>
  </td>
</tr>
<tr>
  <td>Second</td>
</tr>
<tr>
  <td>
    <tr>
      <td>Upper</td>
    </tr>
    <tr>
      <td>Lower</td>
    </tr>
  </td>
</tr>

but all on a single line; I split it up here only to make it readable. What I want is a regular expression that captures each top-level row of this table, so that the matches are:

<td>
  <tr>First</tr>
</td>

,

<td>Second</td>

,

<td>
  <tr>
    <td>Upper</td>
  </tr>
  <tr>
    <td>Lower</td>
  </tr>
</td>

The simplest options don't work:

<tr>.*</tr> - greedy, so it matches everything from the first <tr> to the last </tr>
<tr>.*?</tr> - lazy, so it matches from a <tr> only as far as the first </tr> that follows it
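
The behaviour of both patterns can be reproduced with a minimal sketch like the one below. The class name `GreedyVsLazy`, the helper `findAll`, and the constant `HTML` are made up for illustration; the input is the sample above collapsed onto one line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Sample input from the question, collapsed onto a single line.
    static final String HTML =
        "<tr><td><tr>First</tr></td></tr>"
      + "<tr><td>Second</td></tr>"
      + "<tr><td><tr><td>Upper</td></tr><tr><td>Lower</td></tr></td></tr>";

    // Returns every match of the given regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Greedy: one match that spans the whole input.
        System.out.println(findAll("<tr>.*</tr>", HTML));
        // Lazy: four matches, each stopping at the nearest </tr>,
        // so rows containing nested rows are cut short.
        System.out.println(findAll("<tr>.*?</tr>", HTML));
    }
}
```

Neither quantifier keeps track of how many `<tr>` tags are currently open, which is why neither result lines up with the three rows wanted.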

I want it to match each opening <tr> with its corresponding closing </tr>. Can anybody help?

Use an HTML parser to parse HTML. And in future, please review the preview carefully before posting.
– Andrew Thompson, Commented Jun 13, 2013 at 12:30
Relevant: stackoverflow.com/questions/238036/java-html-parsing
– ohaal, Commented Jun 13, 2013 at 12:32
You should not use a regex for parsing HTML. This answer provides a fantastic explanation why you shouldn't do it: stackoverflow.com/questions/1732348/…
– gkalpak, Commented Jun 13, 2013 at 12:32
@AndrewThompson I think SO should put this link in the face of the user if it sees regex and html as a tag combination ;)
– fge, Commented Jun 13, 2013 at 12:34

Ro Yo Mi · Accepted Answer · 2013-06-13 13:07:33Z

1

You could use an HTML parsing library such as jsoup and run something like this to pull the rows out of your table (the selector assumes the rows sit inside a <table> element):

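import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

// Jsoup.connect() needs an http(s) URL; a local file is parsed with Jsoup.parse()
Document doc = Jsoup.parse(new File("a.html"), "UTF-8");
Elements rows = doc.select("table tr"); // all <tr> elements inside a <table>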
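
To see why a single pattern struggles here: java.util.regex has no recursive or balancing constructs, so matching "corresponding" tags means counting nesting depth yourself. Below is a rough sketch of that idea using only the standard library. It is not a substitute for a parser; it assumes well-formed input whose row tags are written exactly as `<tr>` and `</tr>` (lowercase, no attributes), as in the question's sample. The class name `TopLevelRows` and the method `topLevelRows` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TopLevelRows {
    // Matches an opening <tr> or a closing </tr>; group 1 is "/" for a closing tag.
    private static final Pattern TR_TAG = Pattern.compile("<(/?)tr>");

    // Returns the contents of each outermost <tr>...</tr> pair, in document order.
    static List<String> topLevelRows(String html) {
        List<String> rows = new ArrayList<>();
        Matcher m = TR_TAG.matcher(html);
        int depth = 0;          // number of <tr> tags currently open
        int contentStart = -1;  // index just after the outermost opening <tr>
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            if (!closing) {
                if (depth == 0) {
                    contentStart = m.end();
                }
                depth++;
            } else if (depth > 0) {
                depth--;
                if (depth == 0) {
                    // The outermost row just closed: keep what was between its tags.
                    rows.add(html.substring(contentStart, m.start()));
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String html =
            "<tr><td><tr>First</tr></td></tr>"
          + "<tr><td>Second</td></tr>"
          + "<tr><td><tr><td>Upper</td></tr><tr><td>Lower</td></tr></td></tr>";
        // Prints one line per top-level row (three for this input).
        for (String row : topLevelRows(html)) {
            System.out.println(row);
        }
    }
}
```

The regex only locates the tags; the depth counter is what pairs each opening tag with its closing tag. Real-world HTML (attributes, mixed case, unclosed tags) breaks these assumptions quickly, which is the reason to prefer a parser.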

