I want the value of the rel attribute of the anchor tag associated with the search domain.

I need to search for the domain "blog.zeit.de/berlinjournal" rather than "http://blog.zeit.de/berlinjournal/". Using this domain, I want to find the rel value.

@Sam Onela, the code is not working for this domain. Please help me solve this error.

My code is:

$domain = 'blog.zeit.de/berlinjournal';
$handle = fopen($domain, 'r');
$content = stream_get_contents($handle);
fclose($handle);
if ((strpos($content, $domain) !== false)) {
        echo 'true'; // true if $domain found in view source content
} 

The image below shows what I mean:

enter image description here

  • is this your site? if not and you are allowed to scrape it, just use DOMDocument Commented Apr 11, 2017 at 5:40
  • @Ghost, Is not my site. ok, thank you ... Commented Apr 11, 2017 at 5:49
  • @Angel it appears that the HTML of the page changed so that link no longer exists. Is there a different link you can target? Commented Apr 26, 2017 at 16:14
  • Hmm I don't see any anchor tags with just the single attribute rel="nofollow" ... but I do see <a href='http://rballutschinski.wordpress.com/' rel='external nofollow' class='url'>Ruben Ballutschinski</a> ... do you want to find that one? Commented Apr 26, 2017 at 16:24
  • @SamOnela, I want the rel="nofollow" value. The problem is that when I search for the domain in the HTML source without "http://", it returns nothing, but with http:// it works fine. Commented Apr 27, 2017 at 4:20

1 Answer

Create an instance of DOMDocument, call the loadHTML() method, then use simplexml_import_dom() to get an instance of a SimpleXMLElement, on which the xpath() method can be used to query for that anchor tag.

You may also notice warnings printed to the screen when loading the HTML. To keep them off the screen, have libxml handle errors internally with libxml_use_internal_errors(true); - thanks to @dewsworld for this answer.

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($content);
$xml = simplexml_import_dom($doc);
$results = $xml->xpath("//a[@href='$domain']");
if (sizeof($results)) {
    echo 'rel: '.$results[0]['rel'].'<br>';
}

See it demonstrated in this phpfiddle.
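For readers without access to the fiddle, here is a minimal self-contained sketch of the same steps. The `$content` string is made-up sample markup standing in for the fetched page (an assumption, not the real page source), and the `libxml_get_errors()` / `libxml_clear_errors()` calls are an optional addition for inspecting whatever the parser collected.

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
$content = '<html><body><nav>menu</nav>'
    . '<a href="http://blog.zeit.de/berlinjournal/" rel="nofollow">Berlin Journal</a>'
    . '</body></html>';
$domain = 'http://blog.zeit.de/berlinjournal/';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($content);

// Optionally look at what the parser complained about (may be empty),
// then clear the internal error buffer.
$parseErrors = libxml_get_errors();
libxml_clear_errors();

// Query for the anchor whose href matches $domain exactly.
$xml = simplexml_import_dom($doc);
$results = $xml->xpath("//a[@href='$domain']");

$rel = null;
if (!empty($results)) {
    // Cast the SimpleXMLElement attribute to a plain string.
    $rel = (string) $results[0]['rel'];
}
echo 'rel: ' . $rel . PHP_EOL;
```

Note that the exact-match predicate only finds the anchor when `$domain` is character-for-character identical to the href, including the scheme and trailing slash.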

Update

Since the HTML of the original URL has changed and the requirement is now to find the rel attribute of a different anchor tag, that can be done with the contains() xpath function.

$searchDomain = 'rballutschinski.wordpress.com/';
// Only parse the document if the domain appears somewhere in the source
if ((strpos($content, $searchDomain) !== false)) {
    $doc = new DOMDocument();
    $doc->loadHTML($content);
    $xml = simplexml_import_dom($doc);
    // Match any anchor whose href contains the search domain
    $results = $xml->xpath("//a[contains(@href,'$searchDomain')]");
    if (sizeof($results)) {
        $rel = $results[0]['rel'];
    }
}

See a demonstration in this phpfiddle.
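The scheme-less domain from the question causes two separate problems: fopen() treats a value without "http://" as a local file path, and an exact href match fails when the page's link includes the scheme. A rough sketch of one way to handle both follows. The helper names `withScheme()` and `findRel()` are hypothetical, the sample markup is the anchor quoted in the comments above, and the sketch assumes the search string contains no single quote, since it is embedded directly in the XPath expression.

```php
<?php
// Hypothetical helper: prepend a scheme so fopen()/file_get_contents()
// treats the value as a URL rather than a local file path.
function withScheme(string $domain, string $scheme = 'http'): string
{
    return parse_url($domain, PHP_URL_SCHEME) === null
        ? $scheme . '://' . $domain
        : $domain;
}

// Hypothetical helper: return the rel value of the first anchor whose href
// contains $searchDomain (an empty string if that anchor has no rel),
// or null when no such anchor exists.
function findRel(string $html, string $searchDomain): ?string
{
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xml = simplexml_import_dom($doc);
    $results = $xml->xpath("//a[contains(@href,'$searchDomain')]");
    if (empty($results)) {
        return null;
    }
    return (string) $results[0]['rel'];
}

// Sample markup standing in for the downloaded page.
$content = "<html><body><a href='http://rballutschinski.wordpress.com/' "
    . "rel='external nofollow' class='url'>Ruben Ballutschinski</a></body></html>";

// URL to fetch (with scheme) versus the string to search for (without).
$fetchUrl = withScheme('blog.zeit.de/berlinjournal');
$rel = findRel($content, 'rballutschinski.wordpress.com/');

echo $fetchUrl . PHP_EOL;
echo 'rel: ' . $rel . PHP_EOL;
```

In real use, `$content` would come from fetching `$fetchUrl` (for example with fopen() as in the question), while the scheme-less string is used only for the contains() search.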

3 Comments

Thanks @Sam Onela
Your code works well, but when I use $searchDomain = paintyourlife.com/watercolor-portraits.php and $domain = messhall.org/… it gives me an error. I put my code at pastebin.com/j4E7nAYQ. Please help me solve this error. Thank you.
If you are seeing another error that you are not familiar with, and don't find any documentation on it, then perhaps ask another question.
