2

I am trying to fetch page contents from a remote site. This works for many sites, but some URLs, such as http://www1.macys.com/, return nothing. Can anyone tell me what the problem is, or how to fix it? Am I missing anything?

If I use fopen() or file_get_contents(), I get the warning "Redirection limit reached, aborting".

Below is my code.

<?php
    $url = 'http://www1.macys.com/shop/product/volcom-stripe-thermal-shirt?ID=1155481&CategoryID=30423#fn=sp%3D1%26spc%3D996%26ruleId%3D27%26slotId%3D1';

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; rv:19.0) Gecko/20100101 Firefox/19.0');
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);

    $contents = curl_exec($ch);

    if(curl_errno($ch)) {
        echo 'Error: ' . curl_error($ch) . '<br><br>';
    }

    echo 'Contents: '; print_r($contents); echo '<br><br>';
    curl_close($ch);
?>
1
  • I'm not sure what you mean by trying to get images. If you just open the URL in a browser, it shows HTML content; it does not appear to return data such as JSON or XML that you could parse. Commented Feb 25, 2014 at 11:45

4 Answers

2

Some websites won't feed images unless you maintain a cookie jar.

Try this: (from: https://stackoverflow.com/a/12885587/2167896)

// Cookie file used to persist cookies between requests and across redirects.
$jar = tempnam(sys_get_temp_dir(), 'cookie');
$output = fetch('http://www.google.com', array('cookiefile' => $jar));

function fetch( $url, $z=null ) {
    $ch =  curl_init();

    $useragent = isset($z['useragent']) ? $z['useragent'] : 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2';

    curl_setopt( $ch, CURLOPT_URL, $url );
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
    curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
    curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
    curl_setopt( $ch, CURLOPT_POST, isset($z['post']) );

    if( isset($z['post']) )         curl_setopt( $ch, CURLOPT_POSTFIELDS, $z['post'] );
    if( isset($z['refer']) )        curl_setopt( $ch, CURLOPT_REFERER, $z['refer'] );

    curl_setopt( $ch, CURLOPT_USERAGENT, $useragent );
    curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, ( isset($z['timeout']) ? $z['timeout'] : 5 ) );

    // Only enable the cookie jar when a cookie file was supplied.
    if( isset($z['cookiefile']) ) {
        curl_setopt( $ch, CURLOPT_COOKIEJAR,  $z['cookiefile'] );
        curl_setopt( $ch, CURLOPT_COOKIEFILE, $z['cookiefile'] );
    }

    $result = curl_exec( $ch );
    curl_close( $ch );
    return $result;
}

2

Maybe it's a redirect issue. Try adding this:

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

This option makes cURL follow redirects.

Edit:

Also add this:

curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__).DIRECTORY_SEPARATOR.'cookie.txt');

Remember to make cookie.txt writable by the PHP process (for example, chmod 777).
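
The two options suggested in this answer (following redirects and storing cookies) can be combined in one helper. The following is a minimal sketch, not a confirmed fix for the site in the question. The function names `build_fetch_options` and `fetch_with_cookies`, the redirect cap of 10, and the timeout values are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: builds the option array for a request that follows
// redirects and keeps cookies between hops.
function build_fetch_options($url, $cookieFile, $maxRedirects = 10)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => $maxRedirects, // stop early on redirect loops
        CURLOPT_COOKIEJAR      => $cookieFile,   // cookies are written here
        CURLOPT_COOKIEFILE     => $cookieFile,   // cookies are read from here
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_TIMEOUT        => 15,
    );
}

// Hypothetical wrapper: runs the request and reports how many redirects
// were followed, which helps tell a redirect loop from a plain failure.
function fetch_with_cookies($url, $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_fetch_options($url, $cookieFile));

    $body      = curl_exec($ch);
    $error     = ($body === false) ? curl_error($ch) : null;
    $redirects = curl_getinfo($ch, CURLINFO_REDIRECT_COUNT);

    unset($ch); // releasing the handle lets cURL flush the cookie jar

    return array('body' => $body, 'error' => $error, 'redirects' => $redirects);
}
```

A call such as `fetch_with_cookies($url, dirname(__FILE__).DIRECTORY_SEPARATOR.'cookie.txt')` returns the body together with any error message and the redirect count. If the count still reaches the cap, the loop probably has a cause other than missing cookies.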

2 Comments

When I use "curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);" I get the error "Maximum (20) redirects followed".
Add this line: curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__).DIRECTORY_SEPARATOR.'cookie.txt'); and remember to chmod cookie.txt to 777.
0

If the code works with other URLs, that specific server may be blocking your cURL requests. Try fopen().

Or add appropriate headers and a referer. This is what I have used:

    // Browser-like request headers.
    $header = array();
    $header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
    $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    $header[] = "Cache-Control: max-age=0";
    $header[] = "Connection: keep-alive";
    $header[] = "Keep-Alive: 300";
    $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    $header[] = "Accept-Language: en-us,en;q=0.5";
    $header[] = "Pragma: "; // browsers keep this blank.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true); // has no effect on recent PHP versions
    curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com');
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3');
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
    $contents = curl_exec($ch);
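
Before guessing which headers a server wants, it can help to look at what a single request actually returns. The sketch below is one possible way to do that: it makes one request without following redirects and summarizes the status code, the `Location` target, and the names of any cookies the server tries to set. The function names `summarize_headers` and `inspect_hop` are hypothetical, and the 10-second timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper: extracts the status code, Location target and cookie
// names from raw response header lines.
function summarize_headers(array $lines)
{
    $summary = array('status' => null, 'location' => null, 'cookies' => array());
    foreach ($lines as $line) {
        $line = trim($line);
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $summary['status'] = (int) $m[1];
        } elseif (stripos($line, 'Location:') === 0) {
            $summary['location'] = trim(substr($line, 9));
        } elseif (stripos($line, 'Set-Cookie:') === 0) {
            // Keep only the cookie name (the text before the first "=").
            $pair = trim(substr($line, 11));
            $summary['cookies'][] = strtok($pair, '=');
        }
    }
    return $summary;
}

// Hypothetical wrapper: performs one request, does not follow redirects,
// and returns the header summary for that single hop.
function inspect_hop($url, $userAgent)
{
    $lines = array();
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // look at one hop only
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, $line) use (&$lines) {
        $lines[] = $line;
        return strlen($line); // cURL expects the number of bytes handled
    });
    curl_exec($ch);
    return summarize_headers($lines);
}
```

If the summary shows a 3xx status together with a `Set-Cookie` header, and the `Location` points back to the same page, the server probably expects the cookie to be sent on the next request, which would point toward a cookie jar rather than extra headers.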

2 Comments

If I use fopen() or file_get_contents(), I get the warning "Redirection limit reached, aborting".
You can try adding headers and a referer. I have edited my answer so you can check the example.
0

Try setting a user agent, which can be your API username, your website name, or something else:

curl_setopt($ch, CURLOPT_USERAGENT, 'MY-NAME');
