1

I wrote a regex pattern which works perfectly when I test it in Regexr, but when I use it in my PHP code it doesn't always match when it should match.

The regular expression, including some examples that should and shouldn't match.

Example PHP code that should match but doesn't:

preg_match('/^([~]{3,})\s*([\w-]+)?\s*(?:\{([\w-\s]+)\})?\s*(\2[\w-]+)?\s*$/', "~~~ {class} lang", $matches);
var_dump($matches);

I believe the problem is caused by the backreference in the last capture group (\2[\w-]+), but I can't quite figure out how to fix this.

3 Answers

3

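Because you're referring to a group that doesn't exist in this match (group 2). Group 2 is optional and captures nothing for this input, and in PCRE a backreference to an unset group fails instead of matching the empty string. So remove \2 from the regex.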

^([~]{3,})\s*([\w-]+)?\s*(?:\{([-\w\s]+)\})?\s*([\w-]+)?\s*$

DEMO

    ~~~  {class} lang
     |  |   |      |
  Group1| Group3 Group4
        |
Missing group 2
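
To see the behaviour in isolation, here is a minimal sketch with a much smaller pattern than the one in the question. The pattern and test strings are made up for illustration; the point is only that a backreference to an optional group that was skipped makes the match fail in PHP.

```php
<?php
// Hypothetical toy pattern: group 1 is optional and \1 refers back to it.
$pattern = '/^(a)?\1b$/';

// Group 1 captures "a", so \1 has to match a second "a".
$withGroup = preg_match($pattern, 'aab');

// Group 1 is skipped and stays unset, so \1 cannot match anything.
$withoutGroup = preg_match($pattern, 'b');

var_dump($withGroup);    // int(1)
var_dump($withoutGroup); // int(0)
```

In a JavaScript-based tester the second string matches, because there a backreference to an unset group matches the empty string.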

2 Comments

I like how you illustrated it :)
Thanks, you and @hwnd helped me figure it out.
2

The problem is caused by capturing group #2: you have made this group optional. Since it may or may not participate in the match, you need to make your backreference optional as well, or else it always looks for a required group.
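
As a rough sketch of what making the backreference optional could look like, the variant below adds a `?` after `\2`. This is one reading of the suggestion, not a pattern taken from the question, and the class for group 3 is written as `[\w\s-]` here.

```php
<?php
// Assumed variant of the question's pattern: \2 is made optional with "?".
$pattern = '/^(~{3,})\s*([\w-]+)?\s*(?:\{([\w\s-]+)\})?\s*(\2?[\w-]+)?\s*$/';

$matched = preg_match($pattern, '~~~ {class} lang', $matches);

var_dump($matched);    // int(1)
var_dump($matches[4]); // string(4) "lang"
```

Because `\2?` is always allowed to match zero times, the trailing word is no longer tied to group 2, so a string such as '~~~ css {class} php' matches as well.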

However, since all groups are optional I would just recurse the subpattern of the second group.

^(~{3,})\s*([\w-]+)?\s*(?:{([^}]+)})?\s*((?2))?\s*$

Example:

$str = '~~~ {class} lang';
preg_match('/^(~{3,})\s*([\w-]+)?\s*(?:{([^}]+)})?\s*((?2))?\s*$/', $str, $matches);
var_dump($matches);

Output

array(5) {
  [0]=> string(16) "~~~ {class} lang"
  [1]=> string(3) "~~~"
  [2]=> string(0) ""                   # Returns "" for optional groups that don't exist
  [3]=> string(5) "class"
  [4]=> string(4) "lang"
}

4 Comments

The problem is that this way '~~~ lang {class} lang' gives a match as well, and it shouldn't.
I just solved it by changing the 2nd capturing group to ([\w-]*), making it an empty string when it's not there instead of non-existent.
Tell me what it should exactly match and I'll take a look when I get back home
I've already fixed the problem thanks to your help. I've posted an answer myself explaining everything :)
0

The other answers helped me figure out why it wasn't working. However, both of them would also give a positive match for $str = '~~~ lang {class} lang';, which I didn't want. I fixed it by changing capturing group 2 to ([\w-]*), so that even if there is no string at that position, the capturing group still exists but remains empty. This way all of the following strings match:

$str = '~~~   lang      {no-lines float left}   ';
$str = '~~~     {class}   ';
$str = '~~~ lang';
$str = '~~~ {class } lang ';
$str = '~~~';
$str = '~~~lang{class}';

But this one won't:

$str = '~~~ css {class} php';

Full solution:

$str = '~~~ {class} lang';
preg_match('/^([~]{3,})\s*([\w-]*)?\s*(?:\{([\w\s-]+)\})?\s*(\2[\w-]+)?\s*$/', $str, $matches);
var_dump($matches);
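
For anyone who wants to check the listed strings in one go, here is a small test loop. It is only a sketch: the `$cases` and `$results` names are mine, and the class for group 3 is written with the hyphen last (`[\w\s-]`), since newer PCRE versions may reject a hyphen placed between two class escapes.

```php
<?php
// Pattern from this answer, with the hyphen last in group 3's class.
$pattern = '/^([~]{3,})\s*([\w-]*)?\s*(?:\{([\w\s-]+)\})?\s*(\2[\w-]+)?\s*$/';

// Input string => whether it is expected to match.
$cases = [
    '~~~   lang      {no-lines float left}   ' => true,
    '~~~     {class}   '                       => true,
    '~~~ lang'                                 => true,
    '~~~ {class } lang '                       => true,
    '~~~'                                      => true,
    '~~~lang{class}'                           => true,
    '~~~ css {class} php'                      => false,
];

$results = [];
foreach ($cases as $str => $expected) {
    // preg_match returns 1 on a match and 0 on no match.
    $results[$str] = preg_match($pattern, $str) === 1;
    printf("%-45s %s\n", "'$str'", $results[$str] ? 'match' : 'no match');
}
```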
