2

Please save me from myself (or reassure me that I'm not being completely misguided)

I've gotten into the habit of writing code something like the following:

function foo($aUserObject) {
    $theUserUID = $aUserObject->uid;
    $aDeepValue = $aUserObject->property[123][456];
    // ... more code, in which much use is made of $theUserUID and $aDeepValue
}

My strategy is probably obvious: I'm taking the attitude that it's going to be easier for the PHP interpreter to handle a variable reference than to continually dig into the object to find the thing I'm interested in, so I should be getting some performance benefit. In addition, my code is perhaps a bit more bug-free and understandable (as long as I remember the meanings of the variable names), since I'm mostly writing simple variable names instead of longer and more complex object/array references where my fingers are more likely to slip. I understand that there's a price for doing this -- there are now two copies of $aUserObject->uid and $aUserObject->property[123][456] floating around, and if those values are large, the additional memory costs could add up. But I'm currently willing to pay that price in exchange for the (alleged) benefits.

Or, that's what I'm telling myself anyway, based on my naive theory of how PHP underpinnings work. But reality, especially when opcode caching tools like APC get introduced, may be a totally different matter. Any more informed opinions out there, that might push me one way or another?
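For anyone who wants to test the performance half of this rather than guess, a rough micro-benchmark sketch follows. The object contents, the helper names timeDeepAccess and timeLocalAccess, and the iteration count are all made up for illustration. The absolute numbers will vary with the PHP version, opcode caching, and the machine, so treat it only as a starting point.

```php
<?php
// Hypothetical stand-in for the user object from the question.
$aUserObject = new stdClass();
$aUserObject->uid = 42;
$aUserObject->property = [123 => [456 => 'deep value']];

// Reads the value through the full object/array path on every iteration.
function timeDeepAccess($aUserObject, $iterations) {
    $length = 0;
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $length += strlen($aUserObject->property[123][456]);
    }
    return microtime(true) - $start;
}

// Fetches the value once into a local variable, then reads only the local.
function timeLocalAccess($aUserObject, $iterations) {
    $aDeepValue = $aUserObject->property[123][456];
    $length = 0;
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $length += strlen($aDeepValue);
    }
    return microtime(true) - $start;
}

$iterations = 1000000; // arbitrary; raise or lower to taste
$deepSeconds = timeDeepAccess($aUserObject, $iterations);
$localSeconds = timeLocalAccess($aUserObject, $iterations);

printf("deep access:  %.4f s\n", $deepSeconds);
printf("local access: %.4f s\n", $localSeconds);
```

Running it a few times and comparing the two timings should show whether the difference matters for a given workload.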

Thanks!

3
  • You might want to do some reading here: php.net/manual/en/features.gc.php Commented Jun 10, 2011 at 23:18
  • I like the question. Normally I create local variables for readability (to avoid long lines), and when I need the value more than once. But if I just need $aUserObject->uid once, it'd probably feel needless to write an extra line. Commented Jun 10, 2011 at 23:22
  • I agree with @joakimdahlstrom and do the same. In terms of some of the answers below, I would like to hear from someone who actually knows the answer to the OP question about the expense and memory issues to keep this focused. I think it used to be more of an issue and I used to do the same for REQUEST vars, but I don't think it helps much anymore (PHP >5.1), particularly if you are using opcode caching. Commented Jun 10, 2011 at 23:48

4 Answers

4

I can reassure you, you're wrong when you say:

I understand that there's a price for doing this -- there are now two copies of $aUserObject->uid and $aUserObject->property[123][456] floating around, and if those values are large, the additional memory costs could add up.

Unless $theUserUID is modified or referenced, it points to the exact same memory location as the property you fetched it from.

You can even do:

$a = $b = $c = $d = $e = 'hello world!';

And it won't take any more memory than:

$a = 'hello world!';

A copy will be created in the following scenarios:

$a = 1;
$b = $a;  // $b shares $a's value (no copy yet)
$b = 2;   // $b is modified, so it is now a copy (no longer shares with $a)

$a = 1;
$b = $a;  // $b shares $a's value (no copy yet)
$c = &$b; // $b is referenced, so it is now a copy (no longer shares with $a)

It's called copy-on-write.

Tip: Try debug_zval_dump and memory_get_usage, and notice that the refcount increases while the memory usage stays the same.
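A minimal sketch of that experiment, using only memory_get_usage. The function name measureCopyOnWrite and the 1 MB string size are arbitrary choices for illustration. The string is built at runtime with str_repeat because newer PHP versions may treat short literals specially, and the exact byte counts will differ between versions.

```php
<?php
// Returns the memory deltas (in bytes) for a plain assignment and for the
// first write to the assigned variable.
function measureCopyOnWrite() {
    // Built at runtime so the string is a real allocation of about 1 MB.
    $a = str_repeat('x', 1000000);

    $before = memory_get_usage();
    $b = $a;                        // no copy yet: $b shares $a's string
    $afterAssign = memory_get_usage();

    $b[0] = 'y';                    // first write to $b forces the copy
    $afterWrite = memory_get_usage();

    return [
        'assign'      => $afterAssign - $before,
        'write'       => $afterWrite - $afterAssign,
        'a_unchanged' => $a[0] === 'x', // $a is untouched by the write to $b
    ];
}

$deltas = measureCopyOnWrite();
printf("after \$b = \$a:      %d bytes\n", $deltas['assign']);
printf("after \$b[0] = 'y':  %d bytes\n", $deltas['write']);
```

The first delta should be close to zero and the second roughly the size of the string. Calling debug_zval_dump($a) before and after the assignment shows the refcount side of the same story, though its output format differs between PHP versions.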


6 Comments

What about functions with pass-by-reference? Isn't there always a copy before the (first) function call?
@jcinacio: What do you mean functions with pass-by-reference? A reference will never create a copy. As for passing variables as function arguments, the same copy-on-write mechanism applies there. Unless the variable is modified within the function, it will reference the original. If it is modified, a copy will be created locally (in the function) and destroyed when it goes out of scope (at the end of the function).
@netcoder: a reference obviously doesn't create a copy, but if $b = $a, passing $b by reference means $b must have a copy of the contents from $a, and not the actual same data?
@jcinacio: In that case, yes that's right. If $b = $a, passing $b by reference will create a copy of $a and $b will not reference $a anymore, at all (even outside the function), producing the same effect as copy-on-write. Another reason to be careful with references. ;)
@netcoder - that's what i figured. so the wording should be "Unless $theUserUID is modified or referenced" ... :)
1

Correct me if I'm mistaken, but I believe you're wrong about something.

there are now two copies of $aUserObject->uid and $aUserObject->property[123][456] floating around, and if those values are large, the additional memory costs could add up.

There'll be another reference to the value, but that does NOT mean it will occupy twice the amount of memory. It's like a relational database: you can add lots of references to the same element, but only the references themselves are stored more than once.

8 Comments

If they were references, perhaps, but he's creating new variables the way he's doing it there.
@ldg: Nope, @joakimdahlstrom is right on this one. It's called copy-on-write.
mmm true, if they are treated as read-only. tx for the clarification.
So, ok, if you don't change the reference the memory usage is the same, but is there any performance advantage to using a copy vs referencing a deep object property, like "$aDeepValue" vs "$aUserObject->property[123][456]" as in the OP question? I'm guessing not but would be interested in the details.
@ldg I don't think the difference is noticeable even if you're doing it all over your code. But I'd like to hear it from an expert as well.
0

Code being more understandable is a big plus if you plan on maintaining it.

Of course, some routines might need more attention to memory/performance issues than others, but unless you are dealing with big amounts of data, the benefits are well worth the (possible) costs.

By the way, you can also use references:

$aDeepValue = &$aUserObject->property[123][456];
$aDeepValue = 'someValue'; // updates $aUserObject
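
To make the difference from a plain assignment concrete, here is a small runnable sketch. The stdClass object and the variable names $copy and $ref are invented for illustration and are not from the question's real code.

```php
<?php
// Hypothetical stand-in for the user object in the question.
$aUserObject = new stdClass();
$aUserObject->property = [123 => [456 => 'original']];

// Plain assignment: writing to the local variable leaves the object alone.
$copy = $aUserObject->property[123][456];
$copy = 'changed copy';
$valueAfterCopyWrite = $aUserObject->property[123][456];

// Reference: writing to the local variable writes through to the object.
$ref = &$aUserObject->property[123][456];
$ref = 'someValue';
$valueAfterRefWrite = $aUserObject->property[123][456];

// Drop the reference so a later write to $ref cannot touch the object.
unset($ref);

echo $valueAfterCopyWrite, "\n";
echo $valueAfterRefWrite, "\n";
```

The unset at the end is a defensive habit: a reference that lingers in scope is an easy way to overwrite the object by accident later on.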

3 Comments

agreed on understandability being a big plus, though apparently the subject of references is a little contentious, and doesn't guarantee better performance: schlueters.de/blog/archives/125-Do-not-use-PHP-references.html
@user519575: references have big performance gains as the amount of data grows - such as very big arrays - both memory costs, and time spent copying
yes, but like I said it doesn't guarantee better performance. References in PHP are best used for large objects - as you point out - or when a function needs to update an incoming value. Both small objects and nested arrays can have performance costs when passed by reference.
0

This seems good to me. But only as a second point.

I suggest:

  1. Be sure that your code is readable and understandable (documentation, proper variable naming, code style guidelines, etc.).
  2. Look for the language's current standards (here PHP), watch for new ones as they appear, and stick to them!
  3. Reusability: this is the benefit of points 1 and 2.
  4. Performance issues should be discussed, but at the algorithmic/design-pattern level and not at the deep-code level. In the final implementation, just be sure that you implement it correctly.
  5. Optimization: only if performance issues actually arise should you think about deep-code optimization. Even then, be sure that the result is understandable and readable. Otherwise the code becomes useless in the future.

1 Comment

I have to disagree that performance shouldn't be thought about at the "deep code" level. There are many simple things that can have a drastic effect on performance (such as passing large arrays by value).
