I have a large set of results as an array from a CakePHP model for a CSV export. I have been formatting it using the loop shown below. As the number of records grows, this is becoming too slow and causing timeout errors. Is there a better way to do this using either CakePHP's Hash utility or PHP array functions?

foreach ($people as $person) {
    array_push($results, array(
        'SchoolName' => $person['School']['name'],
        'SchoolRef'  => $person['School']['ref'],
        'firstName'  => $person['Person']['firstname'],
        'LastName'   => $person['Person']['lastname'],
        'Year1'      => $person['Person']['year_1'],
        'StudentID'  => $person['Person']['studentid'],
        'Email'      => $person['Person']['email']
    ));
}
  • Why do you think this particular place is slow? Commented Sep 3, 2013 at 5:26
  • because it is a loop in php rather than using a php function in c. Commented Sep 3, 2013 at 5:28
  • "because it is a loop in php rather than using a php function in c" --- khm, uhm, ghm... It's a loop in PHP. So what? If you think the biggest issue with your PHP code is that it's not C, then rewrite it in C. Or better, in ASM. Commented Sep 3, 2013 at 5:29
  • Now seriously: take a profiler or just microtime(true) and measure every part of your code. Find the slowest part. Then optimize. In this particular order. If you have some issues optimizing the really slow code - ask another question (or change the current one) Commented Sep 3, 2013 at 5:33
  • "which i suspect will be faster" --- Will, seriously, the rule of thumb for performance optimization is: optimize what is slow. array_push isn't slow by design. I cannot understand why you don't just profile your script and start doing real work. The 50k records you're working with is a really tiny amount, and presumably your problem is somewhere else. Right now the important thing is to realize that your current assumption is wrong. Commented Sep 3, 2013 at 8:01

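The measuring step suggested in the comments above could look roughly like this. It is only a sketch: `timed()` is a hypothetical helper, and the `$people` array is stand-in data shaped like the question's result set (5,000 rows here to keep memory use small), not real model output.

```php
<?php
// Hypothetical helper: runs a callable and returns array(result, elapsed seconds).
function timed(callable $step)
{
    $start = microtime(true);
    $result = $step();
    return array($result, microtime(true) - $start);
}

// Stand-in data shaped like the question's $people array (assumed for illustration).
$people = array();
for ($i = 0; $i < 5000; $i++) {
    $people[] = array(
        'School' => array('name' => "School {$i}", 'ref' => "R{$i}"),
        'Person' => array(
            'firstname' => 'Ada',
            'lastname'  => 'Lovelace',
            'year_1'    => 7,
            'studentid' => $i,
            'email'     => "ada{$i}@example.com",
        ),
    );
}

// Time only the formatting loop from the question.
list($results, $seconds) = timed(function () use ($people) {
    $results = array();
    foreach ($people as $person) {
        array_push($results, array(
            'SchoolName' => $person['School']['name'],
            'SchoolRef'  => $person['School']['ref'],
            'firstName'  => $person['Person']['firstname'],
            'LastName'   => $person['Person']['lastname'],
            'Year1'      => $person['Person']['year_1'],
            'StudentID'  => $person['Person']['studentid'],
            'Email'      => $person['Person']['email'],
        ));
    }
    return $results;
});

printf("Formatting loop: %d rows in %.4f s\n", count($results), $seconds);
```

Wrapping the database query and the CSV writing in the same helper would show which of the three steps actually dominates the run time.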
1 Answer

If you're just outputting to CSV, why not try exporting directly from MySQL (or whichever database you're using)?

E.g. http://ariejan.net/2008/11/27/export-csv-directly-from-mysql/
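MySQL can write a CSV file itself with `SELECT ... INTO OUTFILE`, but that writes on the database host and needs the FILE privilege. If that is not an option for an in-app report, a related technique (swapped in here, not the one from the link) is to write each row straight to the output stream with `fputcsv()` instead of building a second array first. A minimal sketch, assuming rows shaped like the question's `$people` array; `writeCsv()` and the sample row are hypothetical:

```php
<?php
// Hypothetical helper: writes a header line, then one CSV line per row,
// to an already-open stream. Returns the number of data rows written.
function writeCsv($handle, $rows)
{
    $header = array('SchoolName', 'SchoolRef', 'firstName', 'LastName', 'Year1', 'StudentID', 'Email');
    fputcsv($handle, $header, ',', '"', '\\');

    $count = 0;
    foreach ($rows as $person) {
        fputcsv($handle, array(
            $person['School']['name'],
            $person['School']['ref'],
            $person['Person']['firstname'],
            $person['Person']['lastname'],
            $person['Person']['year_1'],
            $person['Person']['studentid'],
            $person['Person']['email'],
        ), ',', '"', '\\');
        $count++;
    }
    return $count;
}

// Made-up sample row for the demo.
$people = array(
    array(
        'School' => array('name' => 'Hillside', 'ref' => 'HS01'),
        'Person' => array(
            'firstname' => 'Ada',
            'lastname'  => 'Lovelace',
            'year_1'    => 7,
            'studentid' => 1001,
            'email'     => 'ada@example.com',
        ),
    ),
);

// php://memory keeps the demo self-contained; a real export would
// more likely open php://output or a file on disk.
$out = fopen('php://memory', 'w+');
$written = writeCsv($out, $people);
rewind($out);
$csv = stream_get_contents($out);
fclose($out);

echo $csv;
```

Because `writeCsv()` only needs something it can `foreach` over, the rows could also be fetched from the database in batches, so the full result set never has to sit in memory at once.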

Alternatively, if the data doesn't change, you might be able to presummarize the existing output. So, if you had 10,000 students the last time you exported the CSV, you could save that CSV and just append the new records. If existing records do change, you could store a hash of all fields with each record, so that a changed record can be detected by comparing hashes.
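The hash idea could be sketched like this. Everything here is an assumption made for illustration: the helper names `rowHash()` and `changedRows()`, keying rows by student ID, and using `md5()` over the JSON-encoded field values as the fingerprint.

```php
<?php
// Hypothetical helper: fingerprint of one export row, built from all its field values.
function rowHash(array $row)
{
    return md5(json_encode(array_values($row)));
}

// Hypothetical helper: returns the rows that are new or whose hash differs
// from the hash stored at the previous export. Both arrays are keyed by student ID.
function changedRows(array $rows, array $previousHashes)
{
    $changed = array();
    foreach ($rows as $id => $row) {
        if (!isset($previousHashes[$id]) || $previousHashes[$id] !== rowHash($row)) {
            $changed[$id] = $row;
        }
    }
    return $changed;
}

// State at the previous export (made-up sample data).
$lastExport = array(
    1001 => array('LastName' => 'Lovelace', 'Email' => 'ada@example.com'),
    1002 => array('LastName' => 'Hopper',   'Email' => 'grace@example.com'),
);
$previousHashes = array_map('rowHash', $lastExport);

// Current state: 1002 changed its email, 1003 is new, 1001 is untouched.
$current = array(
    1001 => array('LastName' => 'Lovelace', 'Email' => 'ada@example.com'),
    1002 => array('LastName' => 'Hopper',   'Email' => 'g.hopper@example.com'),
    1003 => array('LastName' => 'Turing',   'Email' => 'alan@example.com'),
);

$changed = changedRows($current, $previousHashes);
print_r(array_keys($changed));
```

Only the rows returned by `changedRows()` would then need to be reformatted and rewritten; the stored hashes would be refreshed after each export.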

Also, if the data doesn't have to be up to the minute accurate, you could presummarize on a daily basis (or whatever interval works for you).

However, without a clear indication of where you are now (in terms of record counts and timeouts) and where you'd like to be, it's difficult to make a specific recommendation.

4 Comments

Thanks, this is constructive. The report is something the admin user can run. It's in the app, not via phpmysql or something. I like the presummarize idea. I'm on 50k records, so for now, upping the PHP timeout is probably a simpler solution, but that obviously doesn't scale too well.
@Will: "but that obviously doesn't scale too well" --- of course it doesn't. But for some reason, instead of pinpointing the issue, you continue guessing. Well, it's up to you whether you want to become an advanced guesser or a developer, but 50k doesn't sound like a real performance issue.
@Will Also, make sure you have APC turned on. It can make a huge difference. I know it's a cliche, but sometimes just upgrading the server to a beefier box is a simple (and often cheap) solution. I think zerkms' comments have weight. If you haven't explored Xdebug, I'd urge you to try it. Lastly, whenever I get really stuck with a performance problem, I offload the work out of PHP into C++ (usually using RabbitMQ to tie them together). That may be beyond your scope, but I figured I'd mention it seeing as we're throwing out all options here. I can help you get set up if you need it. Cheers.
@Will If you're getting into Xdebug, I've found the KCacheGrind viewer very useful. Or, on Windows, sourceforge.net/projects/wincachegrind
