preg_replace expression can't include string

Question

Being driven up the wall here. I've looked at other posts where there's a regex negative lookahead but for some reason I can't get it to work. Probably missing something very easy but after a couple of hours trying various options I need help!

So, I'm trying to find a pattern for preg_replace which searches through HTML for href links but IGNORES any containing a particular domain AND also IGNORES any whose tag includes the data-fancybox attribute (used by a JS library).

In the following it must ignore the first 3 which contain data-fancybox and also ignore #4 the youtube link. It should only find the last 2.

<a href="https://youtube.com/" data-fancybox>
<a href="https://example.com/" data-fancybox>
<a href="https://vimeo.com/" data-fancybox>
<a href="https://youtube.com/">
<a href="https://example.com/">
<a href="https://vimeo.com/">

When I try this:

<a href=.*((youtube|data-fancybox)).*>

It picks out the first 4 and ignores the last two. But when I try to turn this negative so it only picks out the last 2, it ends up picking them all out:

<a href=.*(?!(youtube|data-fancybox)).*>

Any help appreciated!!

Maybe this could help: regular-expressions.info/lookaround.html
– Antonio Abrantes, Commented Apr 18, 2020 at 15:40
I can't use 'if (!preg_match($pattern, $string)):' because the match needs to run through a large block of HTML text containing various href links; it needs to pick out certain links and adjust them before returning the block of HTML
– arathra, Commented Apr 19, 2020 at 17:40

Antonio Abrantes · Accepted Answer · 2020-04-26 20:52:34Z

You should spell out every character that comes before the text you want to exclude (youtube.com, for instance), so that the negative lookahead is tested at one fixed position. To detect links that do not point to youtube.com:

// The lookahead sits right after "https://", so it is tested at exactly one position.
// The dot is escaped so that it matches a literal "." rather than any character.
$pattern = '/(<a href=\"https:\/\/(?!youtube\.com).*)/';
$string = [];
$string[] = '<a href="https://youtube.com/" data-fancybox>';
$string[] = '<a href="https://example.com/" data-fancybox>';
$string[] = '<a href="https://vimeo.com/" data-fancybox>';
$string[] = '<a href="https://youtube.com/">';
$string[] = '<a href="https://example.com/">';
$string[] = '<a href="https://vimeo.com/">';
foreach ($string as $key => $str) {
    // preg_match returns 1 when the link is not a youtube.com link
    if (preg_match($pattern, $str)) {
        echo "$key valid<br>";
    } else {
        echo "$key not valid<br>";
    }
}

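To detect links that do not have data-fancybox (this assumes that data-fancybox, when present, comes directly after the closing quote of the href and a single space):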

$pattern = '/(.*\" (?!data-fancybox)|.*\">)/';
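The pattern above depends on where data-fancybox sits in the tag. As a rough sketch only, a negative lookahead placed right after `<a` can scan the whole tag up to the closing `>` instead. The name `$noFancybox` and the extra `class` attribute in the sample are made up for illustration and are not from the original answer.

```php
<?php
// Hypothetical variant: reject the tag if data-fancybox appears
// anywhere between "<a " and the closing ">".
$noFancybox = '/<a\s(?![^>]*\bdata-fancybox\b)[^>]*>/';

$tags = [
    '<a href="https://example.com/" data-fancybox>',
    '<a href="https://example.com/" class="btn" data-fancybox>',
    '<a href="https://example.com/">',
];

$results = [];
foreach ($tags as $tag) {
    // preg_match returns 1 on a match and 0 otherwise
    $results[] = preg_match($noFancybox, $tag);
}

print_r($results);
```

Only the last tag should match here. Note that this sketch would also reject a link whose URL happens to contain the text data-fancybox.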

edited Apr 26, 2020 at 20:52

answered Apr 18, 2020 at 14:06


14 Comments

arathra Over a year ago

That's a step closer, but does this mean it's not possible to have a single pattern which negates both 'youtube' OR 'data-fancybox', since we can't specify the string which occurs before 'data-fancybox'?

Antonio Abrantes Over a year ago

When you write something like q(?!youtube.com) you are asking for a q character that is not followed by youtube.com. As far as I know, you always need to specify this character, in this case q. The problem with your pattern is that you don't specify a character; you simply say .* and there are several positions in the string that are not followed by youtube.com, which is the reason all the strings are valid.
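The point about `.*` can be checked directly. The following is a small sketch, with variable names chosen here for illustration, comparing the unanchored lookahead from the question with two ways of pinning it down.

```php
<?php
$link = '<a href="https://youtube.com/">';

// Unanchored: ".*" may stop at any position that is not followed by
// "youtube" (for example just before ">"), so the YouTube link still matches.
$unanchored = preg_match('/<a href=.*(?!youtube).*>/', $link);

// Anchored: the lookahead is tested at one fixed position, right after "https://".
$anchored = preg_match('/<a href="https:\/\/(?!youtube\.com)/', $link);

// Scanning: the lookahead itself searches the rest of the tag, so the
// text does not need to sit at a known offset.
$scanning = preg_match('/<a (?![^>]*youtube)[^>]*>/', $link);

var_dump($unanchored, $anchored, $scanning);
```

The first call reports a match and the other two do not, which is why the lookahead needs either a fixed position or its own scan such as `[^>]*` inside it.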

Antonio Abrantes Over a year ago

To detect data-fancybox you could try $pattern = '/.*\" (data-fancybox).*/';

Antonio Abrantes Over a year ago

To detect links with no data-fancybox: $pattern = '/(.*\" (?!data-fancybox)|.*\">)/';

Antonio Abrantes Over a year ago

Doing an OR with both results: $pattern = '/<a href=\"https:\/\/(?!youtube.com).*|(.*\" (?!data-fancybox)|.*\">)/';
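Since the question asks for links that pass both checks, and an alternation accepts a tag that passes either one, one possible approach is to put both negative lookaheads in a single pattern and run it over the whole HTML block with preg_replace_callback. This is a sketch under assumptions, not the answerer's pattern: it assumes href is the first attribute of the tag, and the adjustment (adding `target="_blank"`) is purely hypothetical, since the question does not say how the links should be changed.

```php
<?php
// Both conditions must hold, so both negative lookaheads live in one pattern:
//   (?![^"]*youtube\.com)    the URL must not contain youtube.com
//   (?![^>]*data-fancybox)   the rest of the tag must not contain data-fancybox
$pattern = '~<a\s+href="(?![^"]*youtube\.com)([^"]*)"(?![^>]*data-fancybox)([^>]*)>~i';

$html = <<<'HTML'
<a href="https://youtube.com/" data-fancybox>
<a href="https://example.com/" data-fancybox>
<a href="https://vimeo.com/" data-fancybox>
<a href="https://youtube.com/">
<a href="https://example.com/">
<a href="https://vimeo.com/">
HTML;

// Hypothetical adjustment: add target="_blank" to each link that passes both checks.
$adjusted = preg_replace_callback(
    $pattern,
    function (array $m): string {
        // $m[1] is the URL, $m[2] is any remaining attributes
        return '<a href="' . $m[1] . '"' . $m[2] . ' target="_blank">';
    },
    $html
);

echo $adjusted, PHP_EOL;
```

With the six sample tags, only the plain example.com and vimeo.com links are rewritten; the other four lines come back unchanged. For arbitrary real-world HTML, an HTML parser such as DOMDocument is generally more robust than a regular expression.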
