I am trying to get the HTML source of a URL using cURL in PHP.

The code below works perfectly on localhost, but it returns nothing when moved to the server:

function get_html_from_url($url) {
    $options = array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => false,   
        CURLOPT_FOLLOWLOCATION => false,   
        CURLOPT_ENCODING       => "",      
        CURLOPT_USERAGENT      => "Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420.1 (KHTML, like Gecko) Version/3.0 Mobile/3B48b Safari/419.3",
        CURLOPT_AUTOREFERER    => true,     
        CURLOPT_CONNECTTIMEOUT => 30,      
        CURLOPT_HTTPHEADER     => array(
            "Host: host.com",
            "Upgrade-Insecure-Requests: 1",
            "User-Agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Mobile Safari/537.36",
            "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
            "Accept-Encoding: gzip, deflate",
            "Accept-Language: en-US,en;q=0.9",
            "Cookie: JSESSIONID=SESSSIONID",
            "Connection: close"
        ),
        CURLOPT_TIMEOUT        => 30,     
        CURLOPT_MAXREDIRS      => 10,     
        CURLOPT_SSL_VERIFYPEER => false,  
    );
    $ch      = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $content = curl_exec( $ch );
    $err     = curl_errno( $ch );
    $errmsg  = curl_error( $ch );
    $header  = curl_getinfo( $ch );
    curl_close( $ch );

    $header['errno']   = $err;  
    $header['errmsg']  = $errmsg;
    $header['content'] = $content;
    return $header;
}

I get a timeout error on the server. I even tried increasing the timeout, but no luck.

Thanks.

  • Can you show us the full and complete error message, please? Commented Sep 24, 2018 at 10:10
  • Can you reach the remote server from yours (e.g. with ping)? Commented Sep 24, 2018 at 10:11
  • I can reach the remote server from localhost, but when I move my code to the real server it does not work. What are the possible causes? If you can list them, I will work through each one. Commented Sep 24, 2018 at 13:11
  • fabrik says: try to reach the remote server from your server using ping instead of curl. Does this fail too? Commented Sep 24, 2018 at 13:17
  • Can ping fetch the HTML content of the desired URL for me? Commented Sep 24, 2018 at 13:25

3 Answers

You could run a test using file_get_contents() like so:

$html = file_get_contents('http://example.com');
echo $html;

That said, cURL is the way to go. I'd check what outbound network access the server has.
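
To check outbound network access from the server independently of cURL, a minimal sketch along these lines may help. The names `endpoint_from_url` and `probe_endpoint` are hypothetical helpers, not something from the question. The sketch uses only PHP's built-in `parse_url()`, `gethostbyname()`, and `fsockopen()`, and it assumes the hosting environment allows PHP to open sockets at all.

```php
<?php
// Hypothetical helper: derive the host and port that cURL would connect to.
function endpoint_from_url(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        throw new InvalidArgumentException("Cannot parse host from URL: $url");
    }
    $scheme = strtolower($parts['scheme'] ?? 'http');
    $port   = $parts['port'] ?? ($scheme === 'https' ? 443 : 80);
    return array($parts['host'], (int) $port);
}

// Hypothetical helper: report whether DNS and a plain TCP connect succeed.
function probe_endpoint(string $url, float $timeoutSeconds = 5.0): string
{
    list($host, $port) = endpoint_from_url($url);

    // gethostbyname() returns the unmodified host name when resolution fails.
    $ip = gethostbyname($host);
    if ($ip === $host && filter_var($host, FILTER_VALIDATE_IP) === false) {
        return "DNS lookup failed for $host";
    }

    // Plain TCP connect only; no HTTP request or TLS handshake is attempted.
    $socket = @fsockopen($ip, $port, $errno, $errstr, $timeoutSeconds);
    if ($socket === false) {
        return "TCP connect to $ip:$port failed: $errstr ($errno)";
    }
    fclose($socket);
    return "TCP connect to $ip:$port succeeded";
}
```

Running `echo probe_endpoint($url);` on both localhost and the server, with the same URL, should show whether the server fails at name resolution or at the TCP connect. If both steps succeed on the server, the problem is more likely in the request itself than in network access.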

1 Comment

No, it doesn't seem to work for the desired URL. Thanks anyway.

Here is some sample code that fetches data from a remote URL and stores it in a file. I hope it helps.

function scrapper()
{
    $url = "https://www.google.com/";

    $curl = curl_init();

    curl_setopt_array($curl, array(
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_URL => $url
    ));

    $response = curl_exec($curl);

    return $response;
}

$scrap_data = scrapper();

$myfile = fopen("scrap_data.txt", "w") or die("Unable to open file!");
fwrite($myfile, $scrap_data);
fclose($myfile);

echo "Scraped data saved to file";

4 Comments

Hi @Suresh, my script works fine on localhost but not on the server. That is the issue I have. Thanks.
I get a connection timeout, and the issue persists when I increase the timeout.
Thanks @suresh, the code works fine on the server but not locally.
@SurajRathod Put the following code at the top of your PHP script and run it. It will help show exactly which error occurs during execution: error_reporting(E_ALL); set_time_limit(0); ini_set('display_errors', '1'); ini_set('memory_limit', '-1');
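
Since the reported error is a connection timeout, the array returned by get_html_from_url() in the question may already carry enough detail to narrow it down. The following is a minimal sketch, assuming that array shape (the curl_getinfo() keys plus errno and errmsg). The name `diagnose_curl_result` is hypothetical, and the hints are only my reading of the libcurl error codes, not a definitive diagnosis.

```php
<?php
// Hypothetical helper: turn the array returned by get_html_from_url()
// (curl_getinfo() keys plus 'errno' and 'errmsg') into a short hint.
function diagnose_curl_result(array $info): string
{
    $errno = (int) ($info['errno'] ?? 0);
    if ($errno === 0) {
        return 'OK, HTTP status ' . ($info['http_code'] ?? 0);
    }

    // Numeric libcurl error codes, written out so the function does not
    // depend on the cURL extension being loaded.
    $hints = array(
        6  => 'host name could not be resolved (check DNS on the server)',
        7  => 'could not connect (check firewall or IP blocking)',
        28 => 'operation timed out',
        60 => 'SSL certificate could not be verified',
    );
    $hint = $hints[$errno] ?? 'see errmsg';

    // Assumption: connect_time stays 0 when the connection was never established.
    if ($errno === 28) {
        $hint .= ($info['connect_time'] ?? 0) > 0
            ? ' after connecting (remote server is slow to answer)'
            : ' before connecting (outbound traffic is probably blocked)';
    }
    return "cURL error $errno: $hint; errmsg: " . ($info['errmsg'] ?? '');
}
```

For example, `echo diagnose_curl_result(get_html_from_url($url));` on the server would distinguish a timeout that happens before the connection is made, which points at a firewall or the remote site blocking the server's IP, from one that happens after it.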

If I understood your requirement correctly, the following script should get you there. You can use htmlspecialchars() to print the fetched markup as text instead of letting the browser render it.

<?php
function get_content($url) {
    $options = array(
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_USERAGENT      => "Mozilla/5.0",
    );
    $ch      = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $htmlContent = curl_exec( $ch );
    curl_close( $ch );
    return $htmlContent;
}
$link = "https://stackoverflow.com/questions/52477020/get-html-from-a-url-using-curl-in-php"; 
$response = get_content($link);
echo htmlspecialchars($response);
?>

The link I've used within the script is just a placeholder. Feel free to replace that with the one you are after.

1 Comment

Hi @SIM, I can fetch the HTML successfully on localhost but not on the server. Is it because of the IP address? If so, how can we resolve this?
