EDIT: Thanks to nice_dev for providing an excellent solution below!
I had to rewrite part of his query for correct results:
From:
$tmpmatch['unmatched'] = array_filter($deck['main_deck'],
function ($val) use (&$tmparray, &$matched) {
$matched[ $val ] = $matched [$val] ?? 0;
$matched[ $val ]++;
return $matched[ $val ] <= $tmparray[ $val ];
}
);
To:
$tmpmatch['unmatched'] = array_filter($deck['main_deck'],
function ($val) use (&$tmparray) {
return empty($tmparray[$val]) || !$tmparray[$val]++;
}
);
Original:
I'm attempting to compare arrays to get a new array of matched and unmatched items. I have a loop that goes through roughly 55,000 items. The processing of this script can take upwards of 20+ minutes to attempt to complete and I've narrowed it down to both usage of array_intersect and array_filter within the foreach. Ideally, I need it to complete much faster. If I limit the foreach to 1000 items it still takes upwards of ~3 minutes to complete which is slow for the client-side experience.
If I remove them, the script completes almost immediately. As such, I will include only these relevant pieces in the code below.
I'm using a custom array_intersect_fixed function as regular array_intersect returned wrong results with duplicate values as per here.
Explanations:
totalidsarray = An array of numbers such as ['11233', '35353, '3432433', '123323']. Could contain thousands of items.
$deck['main_deck'] = An array of numbers to compare against $totalidsarray. Similar structure. Max length is 60 items.
foreach($dbdeckarray as $deck){
$tmparray = $totalidsarray;
//Get an array of unmatched cards between the deck and the collection
//Collection = $tmparray
$tmpmatch['unmatched'] = array_filter($deck['main_deck'],
function ($val) use (&$tmparray) {
$key = array_search($val, $tmparray);
if ( $key === false ) return true;
unset($tmparray[$key]);
return false;
}
);
//Get an array of matched cards between the deck and the collection
$tmpmatch['matched'] = array_intersect_fixed($deck['main_deck'], $totalidsarray);
//Push results to matcharray
$matcharray[] = $tmpmatch;
}
//Built in array_intersect function returns wrong result when input arrays have duplicate values.
function array_intersect_fixed($array1, $array2) {
$result = array();
foreach ($array1 as $val) {
if (($key = array_search($val, $array2, FALSE))!==false) {
$result[] = $val;
unset($array2[$key]);
}
}
return $result;
}
To make matters worse, I have to do 2 further matched/unmatched checks within that same foreach loop against another array extra_deck, further increasing processing time.
Is there a more optimized approach I can take to this?
EDIT: Explanation of what the code needs to achieve.
The script will retrieve the user's card collection of cards that they own from a card game. This is assigned into totalidsarray. It will then query every deck in the database (~55,000) and compare the collection you own against the built deck of cards (main_deck). It then attempted to extract all owned cards (matched) and all un-owned cards (unmatched) into two arrays. Once the full foreach loop is done, the client-side returns a list of each deck alongside the matched cards/unmatched cards of each (with a % match for each).
in_array(), which can introduce performance issues at really low numbers (hundreds). So I seearray_search()in both functions and wonder if there's just too much effort, leaning on this function, and not enough setup (like flipping values as keys, which resolvesin_array()performance issues by having a reference lookup, which is fast).json_decode()is also a code smell here, as others point out.json_decode($deck['main_deck']inside your loop (and not even just once, but in two places), why aren't you doing this once, before?