Consider the following PHP snippet:
<?php
$html = <<<DATA
<p>Lorem Ipsum is simply dummy text</p> <p>Lorem Ipsum is <a href="http://www.google.com">simply</a> dummy text</p><a href="http://www.youtube.com/watch?v=DUQi_R4SgWo" target="_blank" rel="noopener">Check out the video here!</a>. <p>Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.</p> <a href="http://www.youtube.com/watch?v=A_6gNZCkajU" target="_blank" rel="noopener">Video here</a> <p>It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>
DATA;
# set up the DOM
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
# set up the xpath
$xpath = new DOMXPath($dom);
# set up the regex
$regex = '~\?v=([^&]+)~';
foreach ($xpath->query("//a[contains(@href, 'youtube')]/@href") as $link) {
    preg_match($regex, $link->nodeValue, $matches);
    if ($matches) {
        $id = $matches[1];
        echo "$id\n";
    }
}
?>
This loads an HTML string into a DOM, selects the YouTube links with an XPath query, and then extracts the video ID from each link with a regular expression.
The snippet yields:
DUQi_R4SgWo
A_6gNZCkajU
Now, I'd like to replace the foreach loop with:
$regex = '~\?v=([^&]+)~';
$xpath->registerPHPFunctions();
$xpath->registerNamespace("php", "http://php.net/xpath");
$links = $xpath->query("//a[php:functionString('preg_match', '$regex', @href, '$matches')]/@href");
This finds the same links but does not save anything into $matches - why?
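For what it's worth, here is a minimal sketch of one possible workaround, not a definitive answer. As far as I can tell, '$matches' is interpolated by PHP into the query string before XPath ever sees it, and the arguments XPath hands to a PHP function are plain values, so preg_match has no variable to write back into. The sketch below instead registers a small named helper that stores the captured IDs itself. The helper name collect_youtube_id, the $GLOBALS key youtube_ids, and the shortened HTML are assumptions made up for illustration.

```php
<?php
// Shortened sample HTML (assumption: same link structure as in the question).
$html = '<p>Text <a href="http://www.google.com">simply</a></p>'
      . '<a href="http://www.youtube.com/watch?v=DUQi_R4SgWo">Check out the video here!</a>'
      . '<p>More text</p>'
      . '<a href="http://www.youtube.com/watch?v=A_6gNZCkajU">Video here</a>';

// Hypothetical storage for the captured IDs, keyed by href to avoid duplicates.
$GLOBALS['youtube_ids'] = array();

// Hypothetical helper: receives the href as a plain string from XPath,
// stores the captured ID as a side effect, and reports whether it matched.
function collect_youtube_id($href)
{
    if (preg_match('~\?v=([^&]+)~', $href, $m)) {
        $GLOBALS['youtube_ids'][$href] = $m[1];
        return true;
    }
    // Return a genuine boolean so the predicate acts as a plain filter.
    return false;
}

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace("php", "http://php.net/xpath");
// Only expose the one helper to XPath.
$xpath->registerPHPFunctions("collect_youtube_id");

$links = $xpath->query("//a[php:functionString('collect_youtube_id', @href)]/@href");

echo implode("\n", array_values($GLOBALS['youtube_ids'])), "\n";
```

The trade-off is that the predicate now has a side effect, which is a bit unusual for an XPath query; the plain foreach loop is arguably easier to read.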
PHP 5.6.30) - please add your snippet as an answer though, it comes very near.