Currently I have the following code:

    //loop here 
    foreach ($doc['a'] as $link) {
        $href = pq($link)->attr('href');                
        if (preg_match($url,$href))
        {
            //delete matched string and append custom url to href attr
        }       
        else
        {
            //prepend custom url to href attr
        }
    }
    //end loop

Basically, I've fetched an external page via cURL. I need to add my own custom URL to each href link in the DOM. I need to check via regex whether each href attribute already has a base URL, e.g. www.domain.com/MainPage.html/SubPage.html

If yes, then replace the www.domain.com part with my custom URL.

If not, then simply prepend my custom URL to the relative URL.

My question is: what regex syntax should I use, and which PHP function? Is preg_replace() the proper function for this?

Cheers

1 Answer

You should use built-in functions rather than regex whenever possible, because the authors of those functions have often considered the edge cases (or read the REALLY long RFC for URLs that details all of them). For your case, I would use parse_url() and then http_build_url(). Note that the latter function needs the PECL HTTP extension, which can be installed by following the docs page for the http package; as far as I know it is only available in the 1.x series of that extension.

$href = 'http://www.domain.com/MainPage.html/SubPage.html';
$parts = parse_url($href);

if($parts['host'] == 'www.domain.com') {
    $parts['host'] = 'www.yoursite.com';

    $href = http_build_url($parts);
}

echo $href; // 'http://www.yoursite.com/MainPage.html/SubPage.html';
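
One thing worth checking before relying on the host comparison above is what parse_url() reports for the different shapes an href can take. The sketch below is only an illustration: `href_host()` is a hypothetical helper name, and the sample hrefs are made up from the example in the question. A scheme-less href such as www.domain.com/MainPage.html/SubPage.html comes back with no `host` key at all, because everything is treated as the path.

```php
<?php
// Hypothetical helper: return the host parse_url() finds, or null if none.
function href_host($href)
{
    $parts = parse_url($href);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }
    return $parts['host'];
}

// Sample hrefs (assumed for illustration, based on the question's example).
$samples = array(
    'http://www.domain.com/MainPage.html/SubPage.html', // absolute URL
    'www.domain.com/MainPage.html/SubPage.html',        // no scheme
    'SubPage.html',                                     // relative URL
);

foreach ($samples as $href) {
    $host = href_host($href);
    echo $href, ' => host: ', ($host === null ? '(none)' : $host), "\n";
}
```

So only the first form passes a `$parts['host'] == 'www.domain.com'` check; the scheme-less form needs separate handling if it appears in the fetched page.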

Example using your code:

foreach ($doc['a'] as $link) {
    $urlParts = parse_url(pq($link)->attr('href'));               

    $urlParts['host'] = 'www.yoursite.com'; // This replaces the domain if there is one, otherwise it prepends your domain

    $newURL = http_build_url($urlParts);

    pq($link)->attr('href', $newURL);
}
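
If installing PECL HTTP is not an option, the URL can be reassembled by hand from the parse_url() parts. This is a minimal sketch rather than a drop-in replacement for http_build_url(): `rebase_href()` is a hypothetical name, the hosts are placeholders, and it assumes that links to other hosts, non-HTTP schemes (mailto:, javascript:) and fragment-only links should be left alone. It also drops any port or credentials and treats every relative path as relative to the site root.

```php
<?php
// Hypothetical helper: point an href at $newHost without http_build_url().
function rebase_href($href, $oldHost, $newHost, $defaultScheme = 'http')
{
    // Empty and fragment-only links ("#top") are left untouched.
    if ($href === '' || $href[0] === '#') {
        return $href;
    }

    $parts = parse_url($href);
    if ($parts === false) {
        return $href; // seriously malformed URL
    }

    // Leave mailto:, javascript:, ftp: and similar schemes alone.
    $scheme = isset($parts['scheme']) ? strtolower($parts['scheme']) : $defaultScheme;
    if (!in_array($scheme, array('http', 'https'), true)) {
        return $href;
    }

    // Assumption: links to any host other than $oldHost stay as they are.
    if (isset($parts['host']) && strcasecmp($parts['host'], $oldHost) !== 0) {
        return $href;
    }

    $path = isset($parts['path']) ? $parts['path'] : '';

    // "www.domain.com/Page.html" has no scheme, so parse_url() reports the
    // host as part of the path; strip it by hand.
    if (!isset($parts['host']) && stripos($path . '/', $oldHost . '/') === 0) {
        $path = (string) substr($path, strlen($oldHost));
    }

    $url = $scheme . '://' . $newHost . '/' . ltrim($path, '/');
    if (isset($parts['query'])) {
        $url .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $url .= '#' . $parts['fragment'];
    }
    return $url;
}
```

Inside the loop above this would be called as `pq($link)->attr('href', rebase_href(pq($link)->attr('href'), 'www.domain.com', $customHost));`, where `$customHost` is a variable holding the user-supplied host.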

15 Comments

Actually, I just thought of something. My custom URL isn't static, i.e. it will depend on user input and be stored in a variable. Will preg_replace() be able to take a URL stored in a variable, compare it with another URL, and replace the matching URL with my own?
It doesn't need to be static to work with this. You can use this with your foreach loop. Let me reiterate that I would recommend against using preg_replace().
I just carefully re-read your answer and wow that is really what I need! haha sorry my bad, I must be too tired from too much coding. I'll try out the method asap now and get back as soon as I can :)
I'm trying to get the PECL HTTP extension, and the PHP manual site only explains how to install it on Windows. I'm using a Mac, and I read here stackoverflow.com/questions/5536195/… that I should download and install PEAR? I've never installed any PHP extensions before. Would you have any suggestions for how I could get PECL HTTP on a Mac?
Have you checked out: pear.php.net/manual/en/… ?