1

I've got a regular expression that matches everything between < and >, as in <anything>, and I'm using this:

'@<([\w]+)>@'

today, but I suspect there might be a better way to do it.

/ Tobias

2
  • What do you mean by "anything"? Your regex implies you mean word characters. If so, the only thing you can do is omit the square brackets; they are useless there. Commented Dec 1, 2010 at 13:44
  • Also note that if you are matching HTML/XML, a regex that stops at the first > would match <div attr="my>value"> as <div attr="my>, not the whole tag as you would want. Regex does not do quote balancing. This is OK if you are doing something specific and know what the data looks like, but bad if you are doing something generic. Commented Dec 1, 2010 at 15:55
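
To illustrate the quoting problem described in the comment above, here is a minimal sketch. The [^>]+ class is an assumption chosen for illustration; the \w+ pattern from the question would not match this input at all, because of the space and the quotes.

```php
<?php
// Hypothetical input: an attribute value that contains a literal '>'.
$html = '<div attr="my>value">';

// [^>]+ stops at the first '>', even though it sits inside a quoted value.
preg_match('@<([^>]+)>@', $html, $m);

echo $m[0], "\n"; // <div attr="my>
echo $m[1], "\n"; // div attr="my

// The word-character pattern from the question finds nothing here.
$found = preg_match('@<(\w+)>@', $html);
echo $found, "\n"; // 0
```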

3 Answers

1

\w doesn't match everything like you said, by the way, just [a-zA-Z0-9_]. Assuming you were using "everything" in a loose manner and \w is what you want, you don't need square brackets around the \w. Otherwise it's fine.
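
As a quick check of the point about the square brackets, the sketch below runs both forms of the pattern over the same input. The sample string and variable names are made up for illustration.

```php
<?php
// Sample input; "<a b>" contains a space, which \w does not match.
$string = 'abcd<xyz>ab<c>d<a b>';

preg_match_all('@<([\w]+)>@', $string, $with_brackets);
preg_match_all('@<(\w+)>@', $string, $without_brackets);

print_r($without_brackets[1]); // xyz and c; "<a b>" is skipped
var_dump($with_brackets[1] === $without_brackets[1]); // bool(true)
```

Both patterns capture the same groups, so the brackets around \w can be dropped without changing behaviour.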

1 Comment

In PHP, \w is locale-dependent, so it can match 'unexpected' characters depending on your locale settings.
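
If locale-dependent matching is a concern, one possible workaround (my suggestion, not something stated in the comment) is to spell out the ASCII character class explicitly. A minimal sketch with a made-up input:

```php
<?php
// "\xC3\xA9" is the UTF-8 encoding of "é".
$string = "<caf_1> <caf\xC3\xA9>";

// Explicit ASCII class: does not depend on setlocale().
preg_match_all('@<([A-Za-z0-9_]+)>@', $string, $m);

print_r($m[1]); // only caf_1
```
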
1

If "anything" is "anything except a > char", then you can use:

@<([^>]+)>@

Testing will show if this performs better or worse.

Also, are you sure that you need to optimize? Does your original regex do what it should?
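
A rough way to try this out is sketched below. The helper name time_pattern, the sample string, and the repetition counts are all hypothetical, and the timings will vary by machine and PHP version, so treat it as a starting point rather than a benchmark.

```php
<?php
// Hypothetical helper: time one pattern over $runs repetitions.
function time_pattern(string $pattern, string $subject, int $runs = 100): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        preg_match_all($pattern, $subject, $unused);
    }
    return microtime(true) - $start;
}

// Unlike \w+, the negated class also accepts spaces inside the brackets.
preg_match_all('@<([^>]+)>@', 'abcd<xyz>ab<c>d<a b>', $m);
print_r($m[1]); // xyz, c, and "a b"

// Compare both patterns on a larger, repeated input.
$big = str_repeat('abcd<xyz>ab<c>d<a b>', 1000);
printf("\\w+   : %.4f s\n", time_pattern('@<(\w+)>@', $big));
printf("[^>]+ : %.4f s\n", time_pattern('@<([^>]+)>@', $big));
```

Note that the two patterns are not interchangeable: they accept different inputs, so the choice should be driven by what needs to match before any timing difference.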

0

You would be better off using PHP's string functions for this task. It is likely to be faster (measure to confirm) and is not too complex.

For example:

$string = "abcd<xyz>ab<c>d";

$curr_offset = 0;
$matches = array();

$opening_tag_pos = strpos($string, '<', $curr_offset);

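while ($opening_tag_pos !== false)
{
    $curr_offset = $opening_tag_pos;

    // Find the '>' that closes this '<'; stop if there is none,
    // otherwise the loop would never terminate on input like "a<b".
    $closing_tag_pos = strpos($string, '>', $curr_offset);
    if ($closing_tag_pos === false) {
        break;
    }
    $matches[] = substr($string, $opening_tag_pos + 1, $closing_tag_pos - $opening_tag_pos - 1);

    // Continue searching from the closing '>'.
    $curr_offset = $closing_tag_pos;
    $opening_tag_pos = strpos($string, '<', $curr_offset);
}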

/*
     $matches = Array ( [0] => xyz [1] => c ) 
*/

Of course, if you are trying to parse HTML or XML, use a proper HTML/XML parser instead.
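
As one possible illustration of the parser route, the sketch below uses PHP's bundled DOMDocument class. This assumes the DOM extension is enabled; the answer does not name a specific parser, so this choice is mine. The input is a made-up tag whose attribute value contains a literal >.

```php
<?php
// Requires PHP's DOM extension (ext-dom), assumed to be enabled.
$html = '<div attr="my>value">text</div>';

$doc = new DOMDocument();
// LIBXML_NOERROR / LIBXML_NOWARNING keep parser notices quiet.
$doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

// The parser understands quoting, so the '>' inside the value is preserved.
$div = $doc->getElementsByTagName('div')->item(0);
echo $div->getAttribute('attr'), "\n"; // my>value
```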
