
I have a foreach as follows:

        foreach($data as $r=>$d)
          {
            $return = $return . "<tr>
            <td>
            ".$d["client_id"]."
            </td>
        ......
            <td>
            ".$d["date_stamp"]."
            </td>
            </tr>";
          }

on my data this takes more than 2 seconds to process! However, if I do the following:

      foreach($data as $r=>$d)
        {
          $now= "<tr>
          <td>
          ".$d["client_id"]."
          </td>
      ......
          <td>
              ".$d["date_stamp"]."

          </td>
          </tr>";
          $return = $return.$now;
        } 

it takes only 0.2 seconds.

Well, OK, you might say "fine, use the second approach", and sure, I will. But it is a mystery to me WHY there is such a great performance difference between the two approaches. Any ideas welcome. Thanks.

adding a test case:

   //////////function to get time
    function parsemicrotime(){
       list($usec, $sec) = explode(" ",microtime());
       return ((float)$usec + (float)$sec);
       }

    ////////////define test array
    $a = array();
    for($i = 0; $i < 5000; $i ++)//generate 5k rows
      {
        for($k = 0; $k < 6; $k++)//let's have just 6 columns
          {
            $a[$i]["test_".$k] = 'test string '.$i.' / '.$k.' - note that the size of the $output string makes a huge difference ';
          }
      }

    ///////////////first test
    $time_start = parsemicrotime();
        $output = '';
        foreach($a as $row=>$columns)
          {
            $output = $output ."
            <tr>
              <td>".$columns["test_0"]. "</td>
              <td>" .$columns["test_1"]. "</td>
              <td>" .$columns["test_2"]. "</td>
              <td>" .$columns["test_3"]. "</td>
              <td>" .$columns["test_4"]. "</td>
              <td>" .$columns["test_5"]. "</td>
            </tr>";
          }
    $approach_1_result = parsemicrotime()-$time_start;


    /////////////second test
    $time_start2 = parsemicrotime();
        $output2 = '';
        foreach($a as $row2=>$columns2)
          {
            $now2= "
            <tr>
              <td>".$columns["test_0"]. "</td>
              <td>" .$columns["test_1"]. "</td>
              <td>" .$columns["test_2"]. "</td>
              <td>" .$columns["test_3"]. "</td>
              <td>" .$columns["test_4"]. "</td>
              <td>" .$columns["test_5"]. "</td>
            </tr>";
            $output2 = $output2 .$now2;
          }
    $approach_2_result = parsemicrotime()-$time_start2;


    /////////////third test
    $time_start3 = parsemicrotime();
    ob_start();
        $output3 = '';
        foreach($a as $row3=>$columns3)
          {
            echo "
            <tr>
              <td>".$columns["test_0"]. "</td>
              <td>" .$columns["test_1"]. "</td>
              <td>" .$columns["test_2"]. "</td>
              <td>" .$columns["test_3"]. "</td>
              <td>" .$columns["test_4"]. "</td>
              <td>" .$columns["test_5"]. "</td>
            </tr>";
          }
    $output3 = ob_get_clean();
    $approach_3_result = parsemicrotime()-$time_start3;


    die("first test:".$approach_1_result."<br>second test:".$approach_2_result."<br>third test:".$approach_3_result);
  • Is this something you experienced more than once, or just a one-time situation? Like, CPU busy with something else, slow network, or whatever. Commented Oct 13, 2011 at 13:49
  • Because you actually measured something else. Commented Oct 13, 2011 at 13:49
  • Reproduce the problem minimally. Commented Oct 13, 2011 at 13:53
  • How many iterations are there? However: sounds strange, anyway ... Commented Oct 13, 2011 at 14:01
  • Yes, it is a two-dimensional array; time is measured right before the foreach and right after it. Tried several times with both approaches, results +- the same every time. Using echo is even quicker, but the difference between echo and $return=$return.$new; is insignificant (and adding ob_start() etc. might even slow it down a little). Commented Oct 13, 2011 at 14:39

2 Answers


I did a few similar experiments using a generated array:

$a = [];
for($i = 0; $i < 10000; $i ++)
  $a[] = $i;

To time them I simply store the microtime before the execution and subtract it from the microtime after the execution. I executed the code 10 times and took the average.
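In code, the harness was roughly the following (a minimal sketch; `benchmark` is just an illustrative helper name, and `microtime(true)` returns the timestamp directly as a float):

    // Run a callable $n times and return the average wall-clock seconds.
    function benchmark($fn, $n = 10) {
        $total = 0.0;
        for ($i = 0; $i < $n; $i++) {
            $start = microtime(true);
            $fn();
            $total += microtime(true) - $start;
        }
        return $total / $n;
    }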

First I tried something similar to your first approach:

$output = '';
foreach($a as $k => $v)
  $output = $output . "some static text" . $v . "some other text";

This recorded an insane time of ~3s! I then tried the same code using single quotes and got the same result.

I then changed the concatenation line to:

$output .= 'some static text' . $v . 'some other text';

This resulted in a time of ~0.007s, ~429 times faster!

Finally I changed the code to:

$output = '';
ob_start();
foreach($a as $k => $v)
  echo 'some static text' . $v . 'some other text';
$output = ob_get_clean();

This scored marginally slower than the .= approach (still ~0.007s).

Disclaimer: everything that follows is just my intuition about why the times are what they are.

Now, I'm not an expert on PHP internals, but I'd guess the first method is so much slower because it has to create a new string and copy the old one (which is slowly growing toward its final size of ~350,000 characters) 10,000 times, and copying is typically a fairly expensive operation. The .= approach, however, simply extends the original string, avoiding the copy operations. The buffered approach performs similarly, probably because writing to the output buffer costs about the same as extending a variable, with ob_start and ob_get_clean adding the marginal overhead.
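For reference, the three variants can be run side by side as one self-contained script (a sketch; absolute numbers will of course vary by PHP version and hardware):

    // Build the same 10,000-element test array as above.
    $a = [];
    for ($i = 0; $i < 10000; $i++)
        $a[] = $i;

    // Variant 1: full reassignment copies the growing string every iteration.
    $start = microtime(true);
    $output = '';
    foreach ($a as $v)
        $output = $output . 'some static text' . $v . 'some other text';
    $t1 = microtime(true) - $start;

    // Variant 2: .= appends in place, extending the existing buffer.
    $start = microtime(true);
    $output = '';
    foreach ($a as $v)
        $output .= 'some static text' . $v . 'some other text';
    $t2 = microtime(true) - $start;

    // Variant 3: echo into an output buffer, collected once at the end.
    $start = microtime(true);
    ob_start();
    foreach ($a as $v)
        echo 'some static text', $v, 'some other text';
    $output = ob_get_clean();
    $t3 = microtime(true) - $start;

    printf("reassign: %.4fs   .=: %.4fs   ob: %.4fs\n", $t1, $t2, $t3);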


2 Comments

I'm not an expert either, but I had a look at the implementation of PHP 5.3. It looks to me like both operators result in the same call: case ZEND_CONCAT: case ZEND_ASSIGN_CONCAT: return (binary_op_type) concat_function; This function can either do an emalloc, or an erealloc if the result and the first parameter are the same. That leads to my question: with which PHP version did you run your test, and could you try to fill your array with much longer texts? That would prevent a memory block from being reused in erealloc.
This is with 5.4beta1 (I can't resist the short array syntax). I'll rerun the tests with longer texts later today and post the results.

There is a difference between the two approaches, but it's astonishing that it should make such a big difference.

Every time you concatenate two strings, PHP has to allocate a new memory block big enough for the new string. For really large strings it has to find an even larger block of contiguous memory. Because the block has to be a bit larger each time, PHP cannot reuse the former blocks (they are too small). So as your string grows, it becomes slower to find another memory block and copy the string.

  1. In your first example you have a lot of . operations in a single loop. With every . the already large string becomes larger.

  2. In your second example you collect all the . operations of the loop in $now first. The variable $now stays relatively small, so these concatenations are fast. Only once per loop do you need to find a big memory block.

As already mentioned, I'm a bit surprised that it should make such a big difference, but depending on the number of iterations it could be possible. A pattern that sidesteps the growing string entirely is sketched below.
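As an illustration only (a minimal sketch with made-up row data, not from the original question): collect the per-row chunks in an array and implode once at the end, so the big string is allocated a single time at its final size.

    // Hypothetical row data, just for illustration.
    $data = array(
        array('client_id' => 1, 'date_stamp' => '2011-10-13'),
        array('client_id' => 2, 'date_stamp' => '2011-10-14'),
    );

    $rows = array();
    foreach ($data as $d) {
        // Each chunk is small, so these concatenations stay cheap.
        $rows[] = '<tr><td>' . $d['client_id'] . '</td><td>' . $d['date_stamp'] . '</td></tr>';
    }
    // One allocation at the final size instead of one per iteration.
    $return = implode('', $rows);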

