
I have an array of URLs:

[
  'http://www.example.com/eng-gb/products/test-1',
  'http://www.example.com/eng-gb/products/test-3',
  'http://www.example.com/eng-gb/about-us',
]

I need to write a regex to keep only the ones that match:

http://www.example.com/eng-gb/products/(.*)

In this case I need to exclude 'about-us'.

I also need to use 'http://www.example.com/eng-gb/products/(.*)' as the regex.

What is the best way to achieve this?


3 Answers


preg_grep() gives a shorter line of code, but because the substring to be matched contains no variable characters, strpos() is the better fit.

Code:

$urls=[
  'http://www.example.com/eng-gb/products/test-1',
  'http://www.example.com/eng-gb/badproducts/test-2',
  'http://www.example.com/eng-gb/products/test-3',
  'http://www.example.com/eng-gb/badproducts/products/test-4',
  'http://www.example.com/products/test-5',
  'http://www.example.com/eng-gb/about-us',
];

var_export(preg_grep('~^http://www\.example\.com/eng-gb/products/[^/]*$~',$urls));
echo "\n\n";
var_export(array_filter($urls,function($v){return strpos($v,'http://www.example.com/eng-gb/products/')===0;}));

Output:

array (
  0 => 'http://www.example.com/eng-gb/products/test-1',
  2 => 'http://www.example.com/eng-gb/products/test-3',
)

array (
  0 => 'http://www.example.com/eng-gb/products/test-1',
  2 => 'http://www.example.com/eng-gb/products/test-3',
)

Some notes:

Using preg_grep():

  • Use a non-slash pattern delimiter so that you don't have to escape all of the slashes inside the pattern.
  • Escape the literal dots in the domain so that they don't match any character.
  • Write the full domain and directory path with start and end anchors for tightest validation.
  • Use a negated character class near the end of the pattern to ensure that no additional directories are added (unless of course you wish to include all subdirectories).
  • My pattern will match a URL that ends with /products/ but not one that ends with /products. This is in accordance with the details in your question.
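
As a rough sketch of the points above, the pattern can also be built from the literal prefix with preg_quote(), which takes care of the escaping. The variable names ($base, $pattern, $products) are illustrative and not from the question. The '~' delimiter is passed as the second argument so that it would be escaped too if it ever appeared in the prefix.

```php
<?php
// Sample data taken from the code in this answer.
$urls = [
    'http://www.example.com/eng-gb/products/test-1',
    'http://www.example.com/eng-gb/badproducts/test-2',
    'http://www.example.com/eng-gb/products/test-3',
    'http://www.example.com/eng-gb/about-us',
];

// Literal prefix every accepted URL must start with (illustrative name).
$base = 'http://www.example.com/eng-gb/products/';

// preg_quote() escapes regex metacharacters such as the dots in the domain.
$pattern = '~^' . preg_quote($base, '~') . '[^/]*$~';

// preg_grep() keeps the original array keys of the matching entries.
$products = preg_grep($pattern, $urls);
var_export($products);
```

This keeps the prefix readable as a plain string and avoids hand-escaping mistakes if the base URL changes.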

Using strpos():

  • Checking for strpos()===0 means that the substring must be found at the start of the string.
  • This will allow any trailing characters at the end of the string.
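
To illustrate the difference between the two approaches, here is a small sketch in which a hypothetical URL with an extra path segment (not from the question) passes the strpos() prefix check but fails the anchored pattern.

```php
<?php
$prefix = 'http://www.example.com/eng-gb/products/';

// Hypothetical URL with an additional subdirectory after the product name.
$nested = 'http://www.example.com/eng-gb/products/test-1/reviews';

// Prefix check: true, because the string starts with $prefix.
$prefixOk = strpos($nested, $prefix) === 0;

// Anchored pattern: 0 (no match), because [^/]* cannot cross the extra slash.
$regexOk = preg_match('~^http://www\.example\.com/eng-gb/products/[^/]*$~', $nested);

var_dump($prefixOk, $regexOk);
```

On PHP 8.0 or later, str_starts_with($nested, $prefix) expresses the same prefix check more directly.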

I think you need to use preg_grep(), because you have an array of URLs and it returns an array of the URLs that match your condition:

$matches = preg_grep('/products\/.*$/', $urls);

You can also use PHP's validate filters to check that each entry is a valid URL.
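
The validate filters mentioned here presumably refer to filter_var() with FILTER_VALIDATE_URL; that reading is an assumption. A minimal sketch that first drops malformed entries and then applies the pattern from this answer (the 'products/test-2' entry is made up for illustration):

```php
<?php
$urls = [
    'http://www.example.com/eng-gb/products/test-1',
    'products/test-2',  // made-up entry: matches the pattern but is not a full URL
    'http://www.example.com/eng-gb/about-us',
];

// filter_var() returns false when the value is not a valid URL.
$valid = array_filter($urls, function ($url) {
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
});

// Then keep only the product URLs, using the pattern from this answer.
$matches = preg_grep('/products\/.*$/', $valid);
var_export($matches);
```

Validating first matters here because the pattern is not anchored to the domain, so on its own it would also accept the relative path.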


You'll need to escape the forward slashes and periods to get http:\/\/www\.example\.com\/eng-gb\/products\/(.*). After that, you could just place the URL in directly.

Alternatively (and better), search for \/eng-gb\/products\/(.*).

Example:

$matches = array();
// $your_url is a single URL string taken from the array
if (preg_match('/\/eng-gb\/products\/(.*)/', $your_url, $matches)) {
    $product = $matches[1]; // text captured after /products/
}
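
Since the question starts from an array rather than a single URL, a sketch of applying the same pattern to every entry might look like this. The $urls, $m and $products names are illustrative; entries that do not match are simply skipped.

```php
<?php
// URLs from the question.
$urls = [
    'http://www.example.com/eng-gb/products/test-1',
    'http://www.example.com/eng-gb/products/test-3',
    'http://www.example.com/eng-gb/about-us',
];

$products = [];
foreach ($urls as $url) {
    // preg_match() returns 1 on a match and fills $m with the capture groups.
    if (preg_match('/\/eng-gb\/products\/(.*)/', $url, $m) === 1) {
        $products[] = $m[1]; // text captured after /products/
    }
}

var_export($products);
```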
