0

I'm trying to figure out the regex for the following:

String</td><td>[number 0-100]%</td><td>[number 0-100]%</td><td>String</td><td>String</td>

Also, some of these td tags may have style attributes at some point. I tried this:

String<.*>

and that returned

String</td>

but trying

String<.*><.*>

returned nothing. Why is this?

4
  • What language are you using for the regex? Java? Commented Sep 9, 2010 at 6:12
  • PHP, but that shouldn't matter, should it? Commented Sep 9, 2010 at 6:23
  • It does, because some programming languages use different regular expression syntaxes. Commented Sep 9, 2010 at 6:26
  • README_FIRST: stackoverflow.com/questions/1732348/… Commented Oct 1, 2012 at 10:19

4 Answers

2

You probably shouldn't be trying to use a regex to parse HTML, because that way lies madness.
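
As a rough illustration of the parser-based alternative, here is a minimal sketch using Python's standard-library `html.parser`. Python is used here only for illustration (the asker mentioned PHP, where a DOM parser would play the same role). The class name `CellCollector`, the helper `extract_cells`, and the sample row are all hypothetical.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <td> cell, whatever attributes the tag has."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        # Attributes such as style="..." arrive in attrs and are simply ignored.
        if tag == "td":
            self._in_td = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        # Only text that appears inside a <td> is accumulated.
        if self._in_td:
            self.cells[-1] += data


def extract_cells(html):
    """Return the stripped text of each <td> in the given HTML fragment."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


# Hypothetical sample row in the shape described in the question.
sample = ('<tr><td>Name</td><td style="color:red">42%</td>'
          '<td>100%</td><td>foo</td><td>bar</td></tr>')
cells = extract_cells(sample)
print(cells)
```

Because the parser tracks tags rather than raw characters, the `style` attribute needs no special handling; checking that a cell is a percentage between 0 and 100 can then be done on the extracted strings.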


1
(.+)</td><td>(1?\d?\d)%</td><td>(1?\d?\d)%</td><td>(.+)</td><td>(.+)</td>

1 Comment

This is good, but the tags won't always be a bare <td>; sometimes they will have attributes, e.g. <td style=....>
1

Use a negated character class, like <td[^>]*>, which matches both <td> and <td class="abc">.
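
Here is a sketch of how `<td[^>]*>` could be slotted into a full row pattern, written in Python for illustration (the same constructs exist in PHP's PCRE). It makes a few assumptions that are not in the thread: the names `TD_OPEN`, `PERCENT` and `ROW` are hypothetical, `\s*` is added so cells may be separated by whitespace or line breaks, and `(100|\d{1,2})` is swapped in for `1?\d?\d` because the latter also accepts 101 to 199.

```python
import re

# Opening <td> with or without attributes (assumes no ">" inside attribute values).
TD_OPEN = r"<td[^>]*>"
# 0-100 only; an assumption made here, stricter than 1?\d?\d from the other answer.
PERCENT = r"(100|\d{1,2})"

ROW = re.compile(
    r"(.+?)</td>\s*"
    + TD_OPEN + PERCENT + r"%</td>\s*"
    + TD_OPEN + PERCENT + r"%</td>\s*"
    + TD_OPEN + r"(.+?)</td>\s*"
    + TD_OPEN + r"(.+?)</td>"
)

# Hypothetical sample in the shape described in the question.
sample = ('Name</td><td style="color:red">42%</td><td>100%</td>'
          '<td class="abc">foo</td><td>bar</td>')
match = ROW.search(sample)
fields = match.groups() if match else None
print(fields)

# A percentage above 100 should not produce a match.
too_big = ROW.search('Name</td><td>150%</td><td>100%</td><td>foo</td><td>bar</td>')
print(too_big)
```

The lazy `(.+?)` groups keep each string cell from swallowing the following tags, which a greedy `(.+)` could do on longer input.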


1

Try the following:

(.+)(<[^>]+>){2}(1?\d?\d)%(<[^>]+>){2}(1?\d?\d)%(<[^>]+>){2}(.+)(<[^>]+>){2}(.+)<[^>]+>

You can test it here.

EDIT: Although this will work most of the time, it breaks if an attribute value in one of the tags contains a > character.

1 Comment

> is allowed in an attribute value.
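
To make the limitation concrete, the sketch below (Python, for illustration) compares `<[^>]+>` with a quote-aware variant on a tag whose attribute value contains `>`. The quote-aware pattern is one possible workaround, not something proposed in the thread, and the names `NAIVE_TAG` and `QUOTE_AWARE_TAG` are hypothetical. It still assumes attribute quotes are properly closed.

```python
import re

# Pattern from the answer above: stops at the first ">" it sees.
NAIVE_TAG = re.compile(r"<[^>]+>")

# Hypothetical workaround: treat quoted attribute values as single units,
# so a ">" inside "..." or '...' does not end the tag.
QUOTE_AWARE_TAG = re.compile(r"""<(?:"[^"]*"|'[^']*'|[^>"'])+>""")

tag = '<td title="a>b">'
naive = NAIVE_TAG.match(tag).group(0)
aware = QUOTE_AWARE_TAG.match(tag).group(0)

print(naive)  # the tag is cut off inside the attribute value
print(aware)  # the whole tag
```

Patterns like this grow quickly as more HTML edge cases are covered, which is part of why a real parser is usually recommended instead.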
