4

Hi, I'm starting to learn PHP regex and have the following problem: I need to extract the number inside $string. The regex I use returns "NULL".

$string = 'Clasificación</a> (2194)  </li>';
$regex = '/Clasificación</a>((.*?))</li>/';
preg_match($regex , $string, $match);
var_dump($match);

Thanks in advance.

3 Answers

6

There are three problems with your regex:

  • You aren't escaping the forward slash. You're using the forward slash as the delimiter, so if you want to use it as a literal character inside the expression, you need to escape it (or pick a different delimiter).
  • ((.*?)) doesn't do what you think it does. It creates two capturing groups, one nested inside the other. I assume you're trying to capture what's inside the literal parentheses. For that, you need to escape the ( and ) characters. The expression becomes: \((.*?)\)
  • Your expression doesn't handle whitespace. In the string you've given, there is whitespace between the </a> and the opening parenthesis, and again before the </li>: </a> (2194)  </li>. To skip the whitespace and capture just the number, use \s (which matches any whitespace character): \s*\((.*?)\)\s*
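The three problems can be seen one at a time in a small sketch. It reuses the string from the question; the variable names $broken, $m1, $m2 and $m3 are only for illustration, and the @ operator is there just to silence the warning that the broken pattern triggers.

```php
<?php
$string = 'Clasificación</a> (2194)  </li>';

// 1. The unescaped "/" in "</a>" ends the pattern early, so "a" is read as a
//    modifier. preg_match() raises a warning (silenced with @) and returns false.
$broken = @preg_match('/Clasificación</a>((.*?))</li>/', $string, $m1);

// 2. Delimiter changed to "~", parentheses still unescaped: the two nested
//    groups capture the same text, literal parentheses and whitespace included.
preg_match('~Clasificación</a>((.*?))</li>~', $string, $m2);

// 3. Parentheses escaped and whitespace handled: only the digits are captured.
preg_match('~Clasificación</a>\s*\((.*?)\)\s*</li>~', $string, $m3);

var_dump($broken); // bool(false)
var_dump($m2[1]);  // string(9) " (2194)  "
var_dump($m3[1]);  // string(4) "2194"
```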

After fixing all of the above, the final regular expression looks like this. It uses ~ as the delimiter, so the forward slashes don't need escaping:

$regex = '~Clasificación</a>\s*\((.*?)\)\s*</li>~';

Full code:

$string = 'Clasificación</a> (2194)  </li>';
$regex = '~Clasificación</a>\s*\((.*?)\)\s*</li>~';
preg_match($regex , $string, $match);
var_dump($match);

Output:

array(2) {
  [0]=>
  string(32) "Clasificación</a> (2194)  </li>"
  [1]=>
  string(4) "2194"
}

2

You forgot to escape the / in your regex. Since you're using / as the delimiter, a literal / inside the pattern has to be escaped:

$regex = '/Clasificación<\/a>((.*?))<\/li>/';
//        ^ delimiter    ^^               ^ delimiter
//                       ^^ / in a string which is escaped

Alternatively, change the delimiter, and then you don't have to escape the slash at all:

$regex = '#Clasificación</a>((.*?))</li>#';

See the PHP documentation for more information.
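If the literal text around the number comes from variables, a third option is to let preg_quote() do the escaping. This is only a sketch: $before and $after are hypothetical names, and the middle part of the pattern (optional whitespace, escaped parentheses, digits) is an assumption about what you want to capture.

```php
<?php
$string = 'Clasificación</a> (2194)  </li>';

// Hypothetical variables holding the literal text on each side of the number.
$before = 'Clasificación</a>';
$after  = '</li>';

// The second argument names the delimiter, so preg_quote() escapes it too.
$regex = '/' . preg_quote($before, '/')
       . '\s*\((\d+)\)\s*'
       . preg_quote($after, '/') . '/';

preg_match($regex, $string, $match);
var_dump($match[1]); // string(4) "2194"
```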

2

You will have to escape the special characters that you want to match literally:

$regex = '/Clasificación<\/a> \((.*?)\) <\/li>/';

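You may also want to make your match a little more specific where it matters (depending on your use case). Note that the pattern above expects exactly one space on each side of the parentheses, so it does not match the sample string, which has two spaces before the </li>. Using \s* avoids that: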

$regex = '/Clasificación<\/a>\s*\(([0-9]+)\)\s*<\/li>/'; 

That allows zero or more whitespace characters before and after the (1234), and matches only if the parentheses contain nothing but digits.

I just tried this in php:

php > preg_match($regex , $string, $match);
php > var_dump($match);
array(2) {
  [0]=>
  string(30) "Clasificacin</a> (2194)  </li>"
  [1]=>
  string(4) "2194"
}
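To use the result safely, it helps to check what preg_match() returns before reading $match. A minimal sketch, assuming you want the number as an integer; extractCount() is a hypothetical helper name, not something from the question:

```php
<?php
// Hypothetical helper: returns the number in parentheses, or null if absent.
function extractCount(string $html): ?int
{
    $regex = '/Clasificación<\/a>\s*\(([0-9]+)\)\s*<\/li>/';

    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    if (preg_match($regex, $html, $match) === 1) {
        return (int) $match[1];
    }
    return null;
}

var_dump(extractCount('Clasificación</a> (2194)  </li>')); // int(2194)
var_dump(extractCount('Clasificación</a> (n/a) </li>'));   // NULL
```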
