I'm using a lazy-load script for iframes, and I need a preg_replace call that changes src to data-src.

I tried something like this, but it failed:

$cache = preg_replace('%<iframe.*?src=["\'](.*?)["\'].*?/?>%i', 'data-src="$1"', $content);

My code prints only data-src="the link", without the rest of the iframe markup.

1 Answer

New Answer that uses legitimate DOM parsing functions to reliably mutate valid HTML:

  • Iterate all iframe tags.
  • Insert the new data-src attribute using the existing src attribute.
  • Remove the old src attribute.
  • Print the updated DOM.

As mentioned by @user706420, removing the src attribute from the <iframe> tag is a bad decision because it renders the HTML invalid. My answer demonstrates how to replace a tag attribute in general, but I agree with @user706420 that this task does seem logically flawed.

Code: (Demo)

$html = <<<HTML
<p>Some random text <iframe src="the link" width="425" height="350" frameborder="0"></iframe></p>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
foreach ($dom->getElementsByTagName('iframe') as $iframe) {
    $iframe->setAttribute('data-src', $iframe->getAttribute('src'));
    $iframe->removeAttribute('src');
}
echo $dom->saveHTML();

Output:

<p>Some random text <iframe width="425" height="350" frameborder="0" data-src="the link"></iframe></p>
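Since removing src outright is the contested part, here is a minimal sketch of the same loop wrapped in a function that can leave a placeholder src behind instead. The function name lazifyIframes and the about:blank placeholder are assumptions made for illustration, not part of the original answer, and the sketch assumes the input has a single root element (as LIBXML_HTML_NOIMPLIED expects).

```php
<?php
// Hypothetical helper (name and signature are illustrative only).
// Copies src to data-src on every iframe, then either removes src
// or overwrites it with a placeholder value.
function lazifyIframes(string $html, ?string $placeholder = null): string
{
    libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
    $dom = new DOMDocument();
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    foreach ($dom->getElementsByTagName('iframe') as $iframe) {
        if (!$iframe->hasAttribute('src')) {
            continue;                   // nothing to move on this iframe
        }
        $iframe->setAttribute('data-src', $iframe->getAttribute('src'));
        if ($placeholder === null) {
            $iframe->removeAttribute('src');             // behaviour of the answer above
        } else {
            $iframe->setAttribute('src', $placeholder);  // keep a src attribute in place
        }
    }

    libxml_clear_errors();
    return trim($dom->saveHTML());      // saveHTML() appends a trailing newline
}

// Assumed sample input; the URL is a stand-in.
echo lazifyIframes(
    '<p>Text <iframe src="https://example.com/embed" width="425"></iframe></p>',
    'about:blank'
);
```

Calling it without the second argument reproduces the remove-only behaviour shown above; whether a placeholder suits a given lazy-load script depends on how that script decides which iframes still need loading.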

Old Answer (improved on Oct 9, 2020) with advice that I no longer endorse because regex is "DOM-ignorant"...

Match the start of the <iframe and all characters within the opening tag until you encounter a space character which is followed immediately by the substring src= -- this ensures that the targeted src= substring doesn't have any preceding non-white-space characters (IOW, it is a whole/solitary word).

The substring before the space must be released/forgotten -- this is what \K does. The space will need to be consumed and replaced with data-.

Code: (Demo)

$content = 'Some text that contains src <iframe src="www.example.com"/> Some text';
echo preg_replace('~<iframe[^>]*\K (?=src=)~i', ' data-', $content);

Output:

Some text that contains src <iframe data-src="www.example.com"/> Some text
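To make the role of \K more concrete, the same replacement can be written with a capture group: the text before the space is captured and written back as $1 instead of being released from the match. This is only a comparison sketch using the sample string from above; the variable names are illustrative.

```php
<?php
$content = 'Some text that contains src <iframe src="www.example.com"/> Some text';

// \K version from the answer: everything matched before \K is dropped
// from the reported match, so only the space is replaced.
$withK = preg_replace('~<iframe[^>]*\K (?=src=)~i', ' data-', $content);

// Capture-group version: the prefix is part of the match, so it has to be
// re-inserted through $1 in the replacement string.
$withGroup = preg_replace('~(<iframe[^>]*) (?=src=)~i', '$1 data-', $content);

echo $withK, "\n";
echo $withGroup, "\n";
var_dump($withK === $withGroup); // bool(true)
```

Both calls produce the same string; \K simply avoids the capture and the back-reference.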

Although I have improved the regex, valid HTML strings can still be intentionally written to break it, for example: <iframe src="www.example.com"/ data-type="<iframe" data-whoops=" src= oh my"> For this reason, I ONLY recommend parsing HTML with a DOM parser.
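The failure can be reproduced by running the old pattern against that adversarial tag. The sketch below only inspects the resulting string; the variable names are illustrative.

```php
<?php
// Pattern from the old answer.
$pattern = '~<iframe[^>]*\K (?=src=)~i';

// Adversarial tag quoted in the paragraph above.
$tricky = '<iframe src="www.example.com"/ data-type="<iframe" data-whoops=" src= oh my">';

$result = preg_replace($pattern, ' data-', $tricky);

// The greedy [^>]* first runs to the end of the tag and then backtracks,
// so the LAST " src=" wins, and that one sits inside the data-whoops value.
var_dump(strpos($result, '<iframe src="www.example.com"') === 0);  // bool(true): real src untouched
var_dump(strpos($result, 'data-whoops=" data-src=') !== false);    // bool(true): attribute value rewritten instead
```

The real src attribute survives unchanged while text inside another attribute's value is modified, which is the kind of mistake a pattern with no notion of attribute boundaries cannot rule out.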

5 Comments

There is a bug here. YouTube URLs like /embed/CsRcybaSXLc contain "sRc". Your code catches it and turns it into /embed/Cdata-sRcybaSXLc. Maybe it should be (?=src=). Test here: regex101.com/r/DHlZCC/1
This answer is TERRIBLE and in no way provides the advice that I currently give. I will be rewriting this answer today to employ a DOM parser and legitimate DOM manipulating function calls. Thank you for blowing your whistle. I'm rather embarrassed.
@user706420 I have updated my answer. Thanks again.
Nice update. :) The old answer still has the error; I use (?=src=) here. ;) For images I use this code: '~<img[^>]*\K(?=src=)~i', 'src="data:," data-'. Removing the src attribute is not a good choice for <img>, because the HTML is not valid without a src attribute. So instead of removing/replacing it, I replace it with valid HTML: 'src="data:," data-x'
@user I have improved the regex pattern beyond your suggestion, then intentionally wrote a html string that intentionally breaks the pattern to demonstrate that regex should not be trusted to parse html.
