17

What would be the most efficient way to select every nth item from a large array? Is there a 'smart' way to do it or is looping the only way?

Some points to consider:

  • The array is quite large with 130 000 items
  • I have to select every 205th item
  • The items are not numerically indexed, so for($i = 0; $i <= 130000; $i += 205) won't work

So far, this is the most efficient method I've come up with:

$result = array();
$i = 0;
foreach($source as $value) {

    if($i >= 205) {
        $i = 0;
    }

    if($i == 0) {
        $result[] = $value;
    }

    $i++;
}

Or the same with modulo:

$result = array();
$i = 0;
foreach($source as $value) {
    if($i % 205 == 0) {
        $result[] = $value;
    }
    $i++;
}

These methods can be quite slow, is there any way to improve? Or am I just splitting hairs here?

EDIT

Good answers all around with proper explanations, tried to pick the most fitting as the accepted answer. Thanks!

3
  • That looks reasonable to me - are you sure that code is causing bottlenecks? If not, profile it to see! How long does it take? Commented Dec 21, 2009 at 11:33
  • @Dominic, this is not so much of a bottleneck, just an interesting problem I couldn't find a proper solution for. Don't think a 'correct' answer would shave more than few milliseconds of execution time, but it would be nice to know. :) Commented Dec 21, 2009 at 11:44
  • I don't know if it's faster, but use $chunked = array_chunk($source, $nth_item+1) and then get all the element using array_column($chunked, $nth_item) if only you wanted the values because the keys will be lost. Commented Feb 1, 2024 at 9:03

10 Answers 10

17

A foreach loop provides the fastest iteration over your large array based on comparison testing. I'd stick with something similar to what you have unless somebody wishes to solve the problem with loop unrolling.

This answer should run quicker.

$result = array();
$i = 0;
foreach($source as $value) {
    if ($i++ % 205 == 0) {
        $result[] = $value;
    }
}

I don't have time to test, but you might be able to use a variation of @haim's solution if you first numerically index the array. It's worth trying to see if you can receive any gains over my previous solution:

$result = array();
$source = array_values($source);
$count = count($source);
for($i = 0; $i < $count; $i += 205) {
    $result[] = $source[$i];
}

This would largely depend on how optimized the function array_values is. It could very well perform horribly.

Sign up to request clarification or add additional context in comments.

4 Comments

Great explanation and actually an faster way to accomplish what I tried to do (just slighty, but faster anyways). Thanks!
@Tatu, after a quick comparison test on array_push and [] on a test array of 200,000 items I found [] to run about twice as fast. I modified my answer and I'd suggest using the assignment the way you initially had it.
@Tatu, I quickly benchmarked the two methods above and found the second to be quicker when the starting array is on average larger than 50,000 results. I only had time to test starting with numerically indexed arrays generated by array_fill() and range(), however.
@CoreyBallou this is the best answer. However, Is there any possibility to show row number with the details?
8

Try ArrayIterator::seek()

Also, using one the new Spl datastructures might yield better results than using plain arrays.

1 Comment

Will look into that, only reason I didn't accept this as the correct answer was that I would have to change my codebase significantly to support ArrayObjects efficiently. But noted and plussed.
6

I recommend to using array_slice

$count = count($array) ;
for($i=205;$i<$count;$i+=205){
    $result[] = array_slice($array,$i,1);
}

If your array was numerically indexed, this would be very fast:

$count = count($array) ;
for($i=205;$i<$count;$i+=205){
    $result[] = $array[$i];
}

7 Comments

At least the syntax is a lot clearer, any idea would this actually improve performance?
For loops are slow. Your count() is also ridiculously slow.
Taking the count out of the loop would indeed speed this up substantially.
@cballou, but note that this method only has to do 130000 / 205 loops, whereas foreach would have to loop through all 130 000 items. I have to test it out, I'm worried about array_slice's performance, as that probably does looping from 0 to $i internally.
This is really slow on larger arrays. If you change $result=array_slice(...) to $result[] = $array[$i] it's faster than the accepted solution.
|
2

I think the solution to this problem doesn't lie in any PHP syntax but in the design of your code.

You could either make the array numerically indexed (might not be plausible for your application), keep track of every 205th item, or only search the array once (cache a list of each 205th item).

In my mind, keeping track of each 205th item would be easier to implement. You'd just keep a count of all the items in a database or something, and every time an item is added, check the modulo of the count. If you have another 205th item, add it to the array. For when items are deleted though, this would be trickier. You might have to re-check the entire array to realign all your 205th items.

Doing this would be simpler if you could start at the deleted item and move forward, but again this would only work for numerically indexed arrays -- and if that were true, you wouldn't have to move forward at all, you would just do a little maths to re-figure it out.

  • Numerical indexes -- better long-term solution but harder to implement
  • Keeping track -- easier to implement, but you'd have to get dirty again when you delete items
  • Caching items -- you should probably do this for the other two solutions as well, but on its own, it would be fast until the array were modified, in which case you would probably have to re-do it.

Comments

1

If this really is a bottleneck, you might want to consider rethinking your design in order to make it numerically indexed.

EDIT: Or create and maintain a seperate array with just the 205th items (which gets updated at insert or something like that).

Comments

0
  • Create a two dimentional array [205][N]
  • Load data into array
  • Access 205th element for every N

May sound silly but by definition it is fastest since you access memory locations directly and do not perform any comparisons.

Comments

0

You can't move the array pointer, it seems, more than once at a time. I would personally use this:

reset($source);
$next = true;
while($next === true){
    $result[] = current($source);
    for(i=0;i<205;i++){
        $next = next($source);
    }
}

If someone can find a function that can move the array pointer more than just one step at a time, you will have a better answer. I think this is good though.

6 Comments

You're right, moving the pointer 205 positions at a time would be the solution, otherwise these all are just the same thing with different syntax, I suppose.
See my answer. ArrayIterator::seek can move the pointer to position
What if there are values that do not evaluate to true?
When $next is false, we've reached the end of the array
@Gausie: … or if the next element can not be evaluated to true: “Returns the array value in the next place that's pointed to by the internal array pointer, or FALSE if there are no more elements.”
|
0

This Code can select all the even/odd indexed element from an array

<?php
$a = [1 => 'One', 2 => 'Two', 3 => 'Three', 4 => 'Four', 7 => 'Seven', 8 => 'Eight'];
echo "<br>";
foreach ($a as $key => $value) {

    if ($key % 2 == 0) {
        echo $value . ' ';
    }
}

Comments

0

php 7.4+

$result = array_filter($source, fn(int $key): bool => $key % 205 === 0, ARRAY_FILTER_USE_KEY);

Comments

-1

You could use array_keys to work only on the array keys.

$keys = array_keys($array);
for ($i=0, $n=min(count($keys), 130000); $i<$n; $i += 205) {
    $result[] = $array[$keys[$i]];
}

2 Comments

If the array traversal is a bottleneck, won't the retrieval of all keys be?
@xtofl: Arrays in PHP are no real arrays like in other languages. They are rather implemented with a hash table. And the keys are stored separately, probably in a linked list.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.