
I just conducted this very interesting experiment and the results came out quite surprising.

The purpose of the test was to determine the best way, performance-wise, to get an element of an array. The reason is that I have a configuration class which holds settings in an associative multi-dimensional array, and I was not sure I was getting these values the best way possible.

Data (it is not really needed for the question, but I just decided to include it so you see it is quite a reasonable amount of data to run tests with)

$data = array( 'system' => [
    'GUI' =>
    array(
        'enabled' => true,
        'password' => '',
    ),
    'Constants' => array(
        'URL_QUERYSTRING' => true,
        'ERRORS_TO_EXCEPTIONS' => true,
        'DEBUG_MODE' => true,
    ),
    'Benchmark' =>
    array(
        'enabled' => false,
    ),
    'Translations' =>
    array(
        'db_connection' => 'Default',
        'table_name' =>
        array(
            'languages' => 'languages',
            'translations' => 'translations',
        ),
    ),
    'Spam' =>
    array(
        'honeypot_names' =>
        array(
            0 => 'name1',
            1 => 'name2',
            2 => 'name3',
            3 => 'name4',
            4 => 'name5',
            5 => 'name6',
        ),
    ),
    'Response' =>
    array(
        'supported' =>
        array(
            0 => 'text/html',
            1 => 'application/json',
        ),
    ),]
);

Methods

function plain($file, $setting, $key, $sub){
    global $data;

    return $data[$file][$setting][$key][$sub];
}

function loop(...$args){
    global $data;

    $value = $data[array_shift($args)];
    foreach($args as $arg){
        $value = $value[$arg];
    }

    return $value;
}

function arr(){
    global $data;

    return $data;
}

Parameters (when calling the functions)

loop('system', 'Translations', 'table_name', 'languages');
plain('system', 'Translations', 'table_name', 'languages');
arr()['system']['Translations']['table_name']['languages'];

Leaving aside any other possible flaws and focusing on performance only, I ran 50 tests with 10000 loops. Each function has been called 500000 times in total. The results are in average seconds per 10000 loops:

loop: 100% - 0.0381 sec. Returns: languages

plain: 38% - 0.0146 sec. Returns: languages

arr: 23% - 0.0088 sec. Returns: languages
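
For anyone who wants to reproduce the measurement, a minimal timing harness could look like the sketch below. The harness actually used is not shown in the question, so `bench()`, its default arguments, and the trimmed-down `$data` are assumptions made for illustration; the closures add the same call overhead to both candidates.

```php
<?php
// Trimmed-down stand-in for the question's $data (assumption, for brevity).
$data = ['system' => ['Translations' => ['table_name' => ['languages' => 'languages']]]];

function plain($file, $setting, $key, $sub) {
    global $data;
    return $data[$file][$setting][$key][$sub];
}

function arr() {
    global $data;
    return $data;
}

// Hypothetical helper: average seconds per run, where one run is $loops calls.
function bench(callable $fn, $runs = 50, $loops = 10000) {
    $total = 0.0;
    for ($r = 0; $r < $runs; $r++) {
        $start = microtime(true);
        for ($i = 0; $i < $loops; $i++) {
            $fn();
        }
        $total += microtime(true) - $start;
    }
    return $total / $runs;
}

$results = [
    'plain' => bench(function () {
        return plain('system', 'Translations', 'table_name', 'languages');
    }),
    'arr' => bench(function () {
        return arr()['system']['Translations']['table_name']['languages'];
    }),
];

foreach ($results as $name => $seconds) {
    printf("%s: %.4f sec.\n", $name, $seconds);
}
```

Absolute numbers will vary with the machine and the PHP version, so only the relative ordering is worth comparing.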

I expected loop to be quite slow because there is logic inside, but the results of the other two surprised me. I expected plain to be the fastest because it returns a single element of the array and, conversely, arr to be the slowest because it returns the whole array.

Given the outcome of the experiment I have 2 questions.

  • Why is arr almost 2 times faster than plain?
  • Are there any other methods I have missed that can outperform arr?
  • Your expectations are wrong. Why wouldn't it be faster to just return the array than to actually iterate over the entire array to find a certain value by key? Even when you use brackets and supply the key, iteration has to happen internally to find that key. Commented Mar 30, 2015 at 13:30
  • @adeneo In the case of plain it is not iterating the array. Furthermore, when I call arr and then provide the keys to the element I want, I'm doing the same thing that plain does internally, so I'm wondering why the results differ. Commented Mar 30, 2015 at 13:31
  • All you need to do to return an array is to return a reference. If you're returning a single element, you could be returning a reference too, but it has to do an indirect lookup into the array before returning. Commented Mar 30, 2015 at 13:34
  • @php_nub_qq I can't reproduce your results! loop is pretty much the same in my results, but plain and arr are both at about 0.014 for me. What system do you have? Which PHP version? Commented Mar 30, 2015 at 13:34
  • @Rizier123 You can check it out here; just don't forget to switch to 5.6. The results are not quite as dramatic as on my machine, but arr is still faster than plain every time. Commented Mar 30, 2015 at 13:40

2 Answers

I said this in a comment, but decided it's pretty close to an answer. Your question basically boils down to asking why 2+2; is not faster than just plain 2;

Arrays are just objects stored in memory. To return one from a function, you return a memory address (a 32- or 64-bit unsigned integer), which amounts to little more than pushing a single integer onto the stack.

Returning an element of an array is different: every square-bracket access is a lookup. PHP arrays are hash tables, so internally PHP (or rather its C implementation) has to hash the key and find the matching entry before it knows the memory address of the value stored under that key.

So when you see this kind of code:

return $data[$file][$setting][$key][$sub];

That says:

Find me the address of $data. Then look up the key held in $file (which involves finding out what $file is in memory) to get the next array. Then do the same for $setting, $key, and $sub, each time inside the array that the previous step found. Finally, push the address (in the case of objects) or the value (in the case of native data types) of the last result onto the stack as the return value.

It should be no surprise then that returning a simple array is quicker.
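
Spelled out one step at a time, the chained access does roughly the following. This is only a sketch: `plainUnrolled()` and the trimmed-down `$data` are illustrative names, not code from the question, and the engine does not literally create these intermediate variables.

```php
<?php
// Trimmed-down stand-in for the question's $data (assumption, for brevity).
$data = ['system' => ['Translations' => ['table_name' => ['languages' => 'languages']]]];

// Same result as plain(), written as one key lookup per line.
function plainUnrolled(array $data, $file, $setting, $key, $sub) {
    $level1 = $data[$file];       // lookup 1: 'system'
    $level2 = $level1[$setting];  // lookup 2: 'Translations'
    $level3 = $level2[$key];      // lookup 3: 'table_name'
    return $level3[$sub];         // lookup 4: 'languages'
}

echo plainUnrolled($data, 'system', 'Translations', 'table_name', 'languages'), PHP_EOL;
// prints "languages"
```

Returning the whole array skips all four lookups inside the function, although the caller of arr() still performs them afterwards.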


2 Comments

This makes perfect sense. If I switch the values in the test function calling arr to variables instead of literals, it becomes slower as expected. In the end my expectations weren't completely wrong, I guess :D
1

That's the way PHP works. You expect that a copy of $data is returned here. It is not.

What you actually have is a pointer (something like an ID for a specific place in memory) which references the data.

What you return is the reference to the data, not the data itself.

In the plain method, you search for the value first. This costs time. Take a look at this great article, which shows how arrays work internally.
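
A rough way to check that returning the array does not copy it is to watch memory usage. The sketch below assumes PHP 7 or later and its copy-on-write handling of arrays; `$big` and `$alias` are illustrative names, and the exact byte counts depend on the PHP build.

```php
<?php
// Illustrative data: large enough that a real copy would be clearly visible.
$big = range(1, 100000);

function arr() {
    global $big;
    return $big; // hands back the same underlying array; nothing is copied yet
}

$before = memory_get_usage();
$alias = arr();                    // still shares storage with $big
$afterReturn = memory_get_usage();
$alias[0] = -1;                    // first write forces the actual copy
$afterWrite = memory_get_usage();

$returnCost = $afterReturn - $before;
$writeCost = $afterWrite - $afterReturn;

printf("return: %d bytes, first write: %d bytes\n", $returnCost, $writeCost);
```

The cost shows up at the first write rather than at the return, and `$big` itself is left unchanged by the write to `$alias`.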


Sometimes code says more than words. What you assume is:

function arr(){
    global $data;
    //create a copy
    $newData = $data;
    //return reference to $newData
    return $newData;
}

Also, you should not use global; that is bad practice. You can pass your array as a parameter.

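// Pass the array by value. PHP copies lazily (copy-on-write),
// so no real copy is made unless the function modifies $data.
function arrByValue($data) {
    return $data;
}

// Pass the array by reference: changes made inside the function
// would affect the caller's array.
function arrByReference(&$data) {
    return $data;
}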

1 Comment

This seems like a valid answer, I suppose the only reason for a down vote would be if it is untrue. I will definitely be reading the article later on tonight!
