I am writing a PHP script that downloads pictures from the Internet. Because the data is huge, the script takes 10-15 minutes to run. Are there better ways to handle such a situation, or should I simply execute the script and let it take the time it takes?
- That depends on if you want to monitor it and whether it is user facing. – user20232359723568423357842364, Jul 16, 2013 at 18:59
- Run it from the command line instead? More context would help... – No Results Found, Jul 16, 2013 at 18:59
- Run it from the command line (it also lets you monitor it), also set the timeout to 0. – Dave Chen, Jul 16, 2013 at 18:59
- What is the actual problem about 15-minute scripts? Does it crash? Does it provide insufficient user feedback? Is it just plain slow? Your question is very situational and does not provide the use-case that needs to be handled. – Maxim Kumpan, Jul 16, 2013 at 19:03
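For what it's worth, a minimal sketch of the command-line route the comments suggest; the file name is hypothetical, and the download loop itself is elided:

```php
<?php
// run.php (hypothetical name): invoke as `php run.php` so the web
// server's request timeout never applies to the download job.
set_time_limit(0);                 // no execution-time cap (CLI already defaults to 0)
ini_set('memory_limit', '512M');   // assumption: large batches may need extra headroom

echo "Starting download job...\n";
// ... the actual download loop would go here ...
```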
3 Answers
Your script appears to be essentially I/O bound. Short of getting more bandwidth, there's little you can do.
You can improve the user experience (if there is one) by increasing interactivity. For example, you can save the filenames you intend to download in a session and redisplay the page (refreshing it, or going AJAX) after each batch, showing expected completion time, current speed, and percentage complete.
Basically, the script saves the array of URLs in the session, and on each iteration pops a few of them and downloads them, perhaps checking how long each takes (if one file downloaded in half a second, it is worth downloading another).
Since the script is executed several times, not just once, you need not worry about it timing out. You do, however, have to deal with the possibility of the user aborting the whole process.
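A rough sketch of that session-based approach, assuming `allow_url_fopen` is enabled, an existing `images/` directory, and a hypothetical seed list of URLs; the half-second budget comes from the reasoning above:

```php
<?php
// Sketch, not production code: download for ~0.5s per request, keep the
// remaining URLs in the session, and refresh the page for the next batch.
session_start();

if (!isset($_SESSION['urls'])) {
    // Hypothetical seed list; in practice this comes from your data source.
    $_SESSION['urls']  = ['http://example.com/a.jpg', 'http://example.com/b.jpg'];
    $_SESSION['total'] = count($_SESSION['urls']);
}

$start = microtime(true);
// Keep popping and downloading until the time budget is spent.
while (!empty($_SESSION['urls']) && microtime(true) - $start < 0.5) {
    $url  = array_pop($_SESSION['urls']);
    $data = @file_get_contents($url);      // requires allow_url_fopen = On
    if ($data !== false) {
        file_put_contents('images/' . basename($url), $data);
    }
}

$done = $_SESSION['total'] - count($_SESSION['urls']);
$pct  = round(100 * $done / $_SESSION['total']);

if (!empty($_SESSION['urls'])) {
    header('Refresh: 0');                   // reload immediately for the next batch
    echo "Downloaded $done of {$_SESSION['total']} files ($pct%)...";
} else {
    echo "All downloads complete.";
}
```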
I would have recommended multiple threads to do it faster if there are no bandwidth restrictions, but the closest thing PHP has is process control (the pcntl extension).
Alternatively: some time ago I wrote a similar scraper, and to make it faster I used the exec() functions to launch multiple instances of the same file. That means you also need a shared work repository and a locking mechanism. It sounds and looks dirty, but it works!
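A hedged sketch of that exec()-based pattern: a parent launches several copies of the same script, and flock() acts as the locking mechanism around a shared queue file. The file names, worker count, and Unix-style backgrounding are all assumptions:

```php
<?php
// Parent/worker sketch: spawn N workers of this same file, each claiming
// URLs from a shared queue file under an exclusive advisory lock.
const QUEUE_FILE = 'queue.txt';   // hypothetical: one URL per line
const WORKERS    = 4;             // assumption: tune to your bandwidth

if (!isset($argv[1])) {
    // Parent: launch workers in the background (Unix shell syntax) and exit.
    for ($i = 0; $i < WORKERS; $i++) {
        exec('php ' . escapeshellarg(__FILE__) . ' worker > /dev/null 2>&1 &');
    }
    exit;
}

// Worker: repeatedly claim one URL under flock(), download it, repeat.
while (true) {
    $fp = fopen(QUEUE_FILE, 'c+');
    flock($fp, LOCK_EX);                     // only one worker touches the queue at a time
    $urls = array_filter(array_map('trim', file(QUEUE_FILE)));
    $url  = array_shift($urls);              // claim the first URL (null if queue is empty)
    file_put_contents(QUEUE_FILE, implode("\n", $urls));
    flock($fp, LOCK_UN);
    fclose($fp);

    if ($url === null) {
        break;                               // queue drained: this worker is done
    }
    $data = @file_get_contents($url);        // requires allow_url_fopen = On
    if ($data !== false) {
        file_put_contents('images/' . basename($url), $data); // images/ must exist
    }
}
```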