
I have a function that fetches the HTML from a list of pages, and after it runs for two hours or so the script aborts because the memory limit has been exceeded. I've tried unsetting some variables and setting them to null in the hope of freeing up memory, but the problem persists. Can you guys please take a look at the following piece of code?

{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    if ($proxystatus == 'on'){
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
        curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($ch, CURLOPT_URL, $site);
    ob_start();
    return curl_exec($ch); // the line where the script aborts with the memory error
    ob_end_clean();
    curl_close($ch);

    ob_flush();
    $site = null;
    $ch = null;

}

Any suggestion is highly appreciated. I've set the memory limit to 128M, but before increasing it (which doesn't seem like the best option to me) I'd like to know whether there's anything I can do to use less memory, or to free memory while the script is running.

Thank you.

  • Did you code this as a method of a class and are you running it via CLI? Commented Feb 4, 2013 at 13:03
  • Nope, it's a function that loops through a list of URLs and fetches the HTML. Yes, I'm running it via the command line. Commented Feb 4, 2013 at 13:17

3 Answers


I know it's been a while, but others might run into a similar issue, so in case it helps anyone else... The problem here is that curl is set to save the output to a string: that is what curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); does. If the output gets too long, the script runs out of allowed memory for that string and fails with an error like FATAL ERROR: Allowed memory size of 134217728 bytes exhausted (tried to allocate 130027520 bytes). The way around this is to use one of the other output methods curl offers: output to standard output, or output to a file. In either case, ob_start() shouldn't be needed at all.

Hence you could replace the content of the braces with either option below:

OPTION 1: Output to standard output:

$ch = curl_init();
// With CURLOPT_RETURNTRANSFER left unset, curl_exec() writes the response
// straight to standard output instead of building it up as a string in memory.
if ($proxystatus == 'on'){
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
    curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
}
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_URL, $site);
curl_exec($ch);
curl_close($ch);

OPTION 2: Output to file:

$file = fopen("path_to_file", "w"); // place this outside the braces if you want to output the content of all iterations to the same file
$ch = curl_init();
if ($proxystatus == 'on'){
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
    curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
}
curl_setopt($ch, CURLOPT_FILE, $file); // stream the response body directly into the file handle
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_URL, $site);
curl_exec($ch);
curl_close($ch);
fclose($file); // place this outside of the braces if you want to output the content of all iterations to the same file
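
To make the "outside the braces" remark concrete, here is a minimal sketch of a loop that streams every page into one shared file; the $urls array, the file name, and the loop itself are illustrative assumptions, not part of the original code:

// Illustrative sketch: one shared file handle across all iterations.
$urls = array("http://example.com/page1", "http://example.com/page2"); // placeholder list
$file = fopen("all_pages.html", "w"); // opened once, before the loop
foreach ($urls as $site) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_FILE, $file); // stream the body into the shared handle
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($ch, CURLOPT_URL, $site);
    curl_exec($ch);  // the response goes to disk, never into a PHP string
    curl_close($ch); // release the handle every iteration
}
fclose($file); // closed once, after the loop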

1 Comment

Wow, I did not know that cURL could write files. Amazing, thank you. Very helpful.

You are indeed leaking memory. Remember that return immediately ends execution of the current function, so all your cleanup (most importantly ob_end_clean() and curl_close()) is never called.

return should be the very last thing the function does.
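
As a minimal sketch of that reordering (the function name and parameter list are assumptions; the question only shows the function body):

function fetch_page($site, $proxystatus, $proxy) { // hypothetical signature
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    if ($proxystatus == 'on'){
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
        curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($ch, CURLOPT_URL, $site);
    $html = curl_exec($ch); // save the result first...
    curl_close($ch);        // ...then clean up...
    return $html;           // ...and return last
}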

4 Comments

Thank you. But will there be any returned value if I move curl_close() and ob_end_clean() up?
Well, you would have to save curl_exec's return value to a variable, then run your cleanup, then return the variable.
I'm new to curl, but would that make a difference? We are assigning the returned value to a variable, cleaning up, and then returning the same value that's stored in the variable. Anyway, I did so and I don't see any change... :(
Thanks, I used xdebug to track the memory issue and everything seems fine right now. Thank you a lot.

This is certainly not a cURL issue. Use a tool like xdebug to detect which part of your script is consuming memory.
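
If you want a quick, dependency-free check before reaching for xdebug, you can log memory usage per iteration with PHP's built-in memory_get_usage(); this loop and its variable names are illustrative, not taken from the original script:

// Illustrative only: log memory growth per iteration to spot a leak.
foreach ($urls as $i => $site) {
    fetch_page($site, $proxystatus, $proxy); // hypothetical fetch function
    // memory_get_usage(true) reports the total memory allocated from the system
    echo "iteration $i: " . memory_get_usage(true) . " bytes\n";
}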

By the way, I would also change it not to run for two hours: I would move it to a cronjob that runs every minute, checks what it needs to do, and then stops.
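
A crontab entry for that pattern could look like the line below (the script path is a placeholder):

# Run the fetcher once a minute; /path/to/fetch.php is a placeholder path.
* * * * * php /path/to/fetch.php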

