My question is really simple, is cURL's curl_multi_init actually multi threaded or does it just use an asynchronous API? Thanks!
-
The question may be simple, but why do you ask that question? It really doesn't matter to the user how it is implemented. What matters is the API that it provides... – Daniel Stenberg, Jun 7, 2015 at 15:44
-
I think it's safe to assume that everybody who posts a question on here likely thinks that they have a real reason to do it. So it's also safe to assume that it matters to the person asking the question. Thanks for your input though. – Josh C, Jun 11, 2015 at 20:09
-
Yes, but with more understanding there will be better answers... – Daniel Stenberg, Jun 12, 2015 at 6:14
-
Oh, you mean you just wanted a better understanding of why I needed to know this, so that you could give a better answer. Okay, I get it now. – Josh C, Jun 12, 2015 at 6:54
2 Answers
cURL is a very old library, and truly asynchronous code is a relatively new concept. libcurl was written in C, so every single request is blocking. Although you can process multiple requests in parallel, this is not asynchronous at all from your program's point of view, because you have to wait until the longest request is finished:
// Busy-wait: spins at full CPU until curl_multi_exec() stops asking to be called again
while ((curl_multi_exec($master, $running)) == CURLM_CALL_MULTI_PERFORM);
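For comparison, a sketch of the usual way to avoid that busy wait (not from the answer above): sleep in curl_multi_select() until a socket has activity, then call curl_multi_exec() again. $master is created empty here just so the sketch runs on its own; real code would add handles first.

```php
<?php
// Assumption: an empty multi handle stands in for one with real transfers.
$master = curl_multi_init();
// ... curl_multi_add_handle($master, $ch) for each request would go here ...

$running = null;
do {
    $status = curl_multi_exec($master, $running);
} while ($status === CURLM_CALL_MULTI_PERFORM);

while ($running && $status === CURLM_OK) {
    // Block (up to 1 s) until at least one transfer has socket activity,
    // instead of spinning at 100% CPU.
    if (curl_multi_select($master, 1.0) === -1) {
        usleep(100000); // select() failed; back off briefly
    }
    do {
        $status = curl_multi_exec($master, $running);
    } while ($status === CURLM_CALL_MULTI_PERFORM);
}
curl_multi_close($master);
```

The transfers still all run in one thread; the only difference is that the process sleeps while the kernel waits on the sockets.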
That said, I still believe cURL is the best solution. Other libraries that implement this functionality often use cURL themselves; they just add callbacks so that data can be processed as soon as each individual request finishes, instead of waiting for them all and then spiking to 100% CPU to process everything at the end.
The problem is that most implementations of curl_multi wait for each set of requests to complete before processing them. If there are too many requests to process at once, they are usually broken into groups that are processed one at a time. The catch is that each group has to wait for its slowest request to download: in a group of 100 requests, a single slow one delays the processing of the other 99. The more requests you are dealing with, the more noticeable this latency becomes.
The solution is to process each request as soon as it completes, which eliminates the CPU cycles wasted on busy waiting. I also created a queue of cURL requests to allow for maximum throughput: each time a request completes, a new one is added from the queue. By dynamically adding and removing handles, we keep a constant number of requests downloading at all times. This gives us a way to throttle the number of simultaneous requests we are sending. The result is a faster and more efficient way of processing large quantities of cURL requests in parallel.
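The rolling-queue idea described above could be sketched roughly like this (an illustration of the technique, not the source of any particular library; the function name and parameters are made up for the example):

```php
<?php
// Keep at most $window transfers active; each time one finishes,
// pass its body to $onDone and pull the next URL from the queue.
function rolling_fetch(array $urls, int $window, callable $onDone): void
{
    $master = curl_multi_init();
    $queue  = $urls;
    $active = 0;

    $add = function () use (&$queue, &$active, $master) {
        if ($url = array_shift($queue)) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_multi_add_handle($master, $ch);
            $active++;
        }
    };

    // Prime the window with the first $window requests.
    for ($i = 0; $i < $window; $i++) {
        $add();
    }

    $running = null;
    do {
        curl_multi_exec($master, $running);
        curl_multi_select($master, 1.0); // sleep until a socket is ready

        // Process every transfer that has completed so far...
        while ($info = curl_multi_info_read($master)) {
            $ch = $info['handle'];
            $onDone(
                curl_multi_getcontent($ch),
                curl_getinfo($ch, CURLINFO_EFFECTIVE_URL)
            );
            curl_multi_remove_handle($master, $ch);
            curl_close($ch);
            $active--;
            $add(); // ...and immediately refill the window from the queue.
        }
    } while ($running || $active > 0);

    curl_multi_close($master);
}
```

Because a finished handle is replaced immediately rather than at a group boundary, one slow request only ever occupies one slot in the window instead of stalling everything behind it.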
RollingCurlX is a fork of the Rolling Curl wrapper for cURL Multi. It aims to make concurrent HTTP requests in PHP as easy as possible.
RollingCurl for PHP (not updated in 5 years)
http://php.net/manual/en/function.curl-multi-init.php
Allows the processing of multiple cURL handles asynchronously.
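A minimal sketch of what the manual means by that: several handles driven to completion by a single process, no threads involved. Everything here is wrapped in a helper (the name fetch_all and the placeholder URLs are made up for the example) so you can see the whole lifecycle in one place.

```php
<?php
// Run all $urls concurrently through one multi handle and
// return an array of url => response body.
function fetch_all(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive every transfer to completion in this one thread.
    $running = null;
    do {
        curl_multi_exec($mh, $running);
        curl_multi_select($mh); // wait for socket activity instead of spinning
    } while ($running > 0);

    $bodies = [];
    foreach ($handles as $url => $ch) {
        $bodies[$url] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $bodies;
}

// Example usage (placeholder URLs):
// $pages = fetch_all(['https://example.com/a', 'https://example.com/b']);
```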