0

I need to count the duplicated elements of a multidimensional array, remove the duplicates, and store each element's count under a new index.

Suppose I have this array:

Array
(
[0] => Array
    (
        [segments] => Array
            (
                [1] => Gcia de Auditoría Interna
                [0] => Auditoria Interna 1
            )

        [groups] => Array
            (
                [estados] => sp
                [cidade] => sumpaulo
            )

    )

[1] => Array
    (
        [segments] => Array
            (
                [2] => Gerencia Recursos Humanos
                [1] => Gcia Dpto Admin de Pers. y Rel. Laboral
                [0] => SubGcia Administración de Personal
            )

        [groups] => Array
            (
                [estados] => sp
                [cidade] => 
            )

    )

[2] => Array
    (
        [segments] => Array
            (
                [2] => Gerencia Recursos Humanos
                [1] => Gcia Dpto Admin de Pers. y Rel. Laboral
                [0] => SubGcia Administración de Personal
            )

        [groups] => Array
            (
                [estados] => sp
                [cidade] => 
            )

    )


 )

I want to remove the duplicate arrays and add a new index holding the count:

Array
(
[0] => Array
    (
        [segments] => Array
            (
                [1] => Gcia de Auditoría Interna
                [0] => Auditoria Interna 1
            )

        [groups] => Array
            (
                [estados] => sp
                [cidade] => sumpaulo
            )
        [total] => 1

    )

[1] => Array
    (
        [segments] => Array
            (
                [2] => Gerencia Recursos Humanos
                [1] => Gcia Dpto Admin de Pers. y Rel. Laboral
                [0] => SubGcia Administración de Personal
            )

        [groups] => Array
            (
                [estados] => sp
                [cidade] => 
            )
        [total] => 2

    )

 )

Is it possible?

1
  • It's of course possible with stacked foreachs... Commented Jun 19, 2013 at 3:55

3 Answers

1

This seems really ugly, but works.

Stacked foreach version:

http://3v4l.org/Dve0M

$rst=array();
foreach($arr as $ele)
{
    $key=null;
    foreach($rst as $i=>$candidate)
    {
        $match=true;
        foreach($ele as $k=>$subarr)
        {
            // Every sub-array has to match, not just one of them.
            if(!isset($candidate[$k]) || $candidate[$k]!=$subarr)
            {
                $match=false;
                break;
            }
        }
        if($match)
        {
            $key=$i;
            break;
        }
    }
    // Test against null explicitly: index 0 is a valid key, but empty(0) is true.
    if($key!==null) $rst[$key]["total"]+=1;
    else $rst[]=array_merge($ele,array("total"=>1));
}
print_r($rst);

No foreach version:

http://3v4l.org/qUU3a

/* just to ensure the array is sorted.
 * if the array is already pre-sorted,
 * skip this part.
 */
usort($arr,function($a,$b){
    return strcmp(json_encode($a),json_encode($b));
});
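$rst=array();
$cache=array();
// Compare with null explicitly so a falsy element cannot end the loop early.
while(($p=array_shift($arr))!==null)
{
    if(empty($cache) || $cache[0]==$p)
    {
        // First element, or same run of duplicates: keep collecting.
        $cache[]=$p;
    }
    else
    {
        // A new value starts: flush the finished run together with its count.
        $rst[]=array_merge($cache[0],array("total"=>count($cache)));
        $cache=array($p);
    }
}
// Flush the last run.
if(!empty($cache))
{
    $rst[]=array_merge($cache[0],array("total"=>count($cache)));
}
print_r($rst);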
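
For comparison, here is a single-pass sketch that uses each element's serialized form as a lookup key, so no element has to be compared against the others one by one. The function name count_duplicates and the shortened sample values are made up for illustration. It also assumes that duplicates list their keys in the same order, because serialize() is sensitive to key order.

```php
<?php
// Hypothetical helper (not from the code above): collapses duplicate
// elements and records how often each one occurred under "total".
function count_duplicates(array $arr)
{
    $rst = array();
    foreach ($arr as $ele) {
        // serialize() gives the same string for arrays with identical
        // keys, values and key order, so it can serve as a lookup key.
        $hash = serialize($ele);
        if (isset($rst[$hash])) {
            $rst[$hash]["total"] += 1;
        } else {
            $rst[$hash] = array_merge($ele, array("total" => 1));
        }
    }
    // Drop the serialized keys and reindex from 0.
    return array_values($rst);
}

// Shortened stand-ins for the data in the question.
$arr = array(
    array("segments" => array(1 => "A", 0 => "B"),
          "groups"   => array("estados" => "sp", "cidade" => "sumpaulo")),
    array("segments" => array(2 => "C", 1 => "D", 0 => "E"),
          "groups"   => array("estados" => "sp", "cidade" => "")),
    array("segments" => array(2 => "C", 1 => "D", 0 => "E"),
          "groups"   => array("estados" => "sp", "cidade" => "")),
);
$rst = count_duplicates($arr);
print_r($rst);
```

The trade-off is one serialize() call per element instead of a comparison against every element collected so far.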
1 Comment

Logically, you don't need three nested loops to compare elements of an array.
1

This function works:

function deduplicate($array) {
    foreach($array as $key => $subArray) { // First Part
        for($i = 0; $i < $key; $i++) {
            if (print_r($subArray, true) == @print_r($array[$i], true)) {
                unset($array[$i]);
            }
        }
    }
    $i = 0;                                // Second Part
    foreach($array as $subArray) {
        $newArray[$i] = $subArray;
        $i++;
    }
    return $newArray;
}

Part 1: Line 1 declares the function. Line 2 starts a foreach loop that runs through every element of the array and checks whether it matches any element before it, using the for loop on line 3 and the if statement on line 4. Line 4 converts both elements to strings with print_r and compares the strings (comparing the arrays directly with == would also work). If the strings match, line 5 unsets the earlier copy, so the last occurrence of each duplicate is the one that is kept. The @ suppresses the message PHP would otherwise raise when the earlier element has already been unset. Lines 6, 7 and 8 close the code blocks of the if statement, the for loop and the foreach loop. Note that the for loop assumes the array is indexed with consecutive integers starting at 0. Now, you have an array without duplicates.

Part 2: Line 9 declares the $i variable, which will be incremented with every run through the foreach loop by the $i++; on line 12. This $i incrementing variable will be the new key for each element of the new array. Line 10 starts a foreach loop, which will loop through the array without duplicates produced by Part 1. Line 11 sets each element of the new array (the reindexed one) to the next element the foreach loop finds in the array from Part 1. Line 12 increments $i, as already mentioned. Line 13 closes the foreach loop's code block. Line 14 returns the new array, and line 15 closes the function. This leaves you with a reindexed version of the array with all duplicate first dimension elements removed.

Now you have a short and elegant way of doing it, and you know exactly how it works. Just copy and paste that at the top of your PHP, and wherever you have an array you need to do this to, just do this:

$array = deduplicate($array);
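
If the count from the question is needed as well, the same idea can be extended. What follows is only a sketch under a few assumptions: deduplicate_with_total is a hypothetical name, it keeps the first occurrence instead of the last, and it relies on array_search's loose comparison (== on the sub-arrays) rather than print_r.

```php
<?php
// Hypothetical variant of deduplicate(): removes duplicate first-level
// elements and stores the number of occurrences in a "total" key.
function deduplicate_with_total(array $array)
{
    $originals = array(); // unique elements, in order of first appearance
    $totals = array();    // occurrence count for each entry in $originals
    foreach ($array as $subArray) {
        // Non-strict array_search compares the nested arrays with ==.
        $pos = array_search($subArray, $originals);
        if ($pos === false) {
            $originals[] = $subArray;
            $totals[] = 1;
        } else {
            $totals[$pos]++;
        }
    }
    $result = array();
    foreach ($originals as $i => $subArray) {
        $subArray["total"] = $totals[$i];
        $result[] = $subArray;
    }
    return $result;
}

// Made-up sample data in the shape used in the question.
$sample = array(
    array("segments" => array("A"), "groups" => array("estados" => "sp", "cidade" => "")),
    array("segments" => array("B"), "groups" => array("estados" => "sp", "cidade" => "")),
    array("segments" => array("A"), "groups" => array("estados" => "sp", "cidade" => "")),
);
$sample = deduplicate_with_total($sample);
print_r($sample);
```

Because the comparison is loose, values such as "1" and "01" would count as equal. Passing true as the third argument to array_search makes it strict, at the cost of also requiring the same key order.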

0

@Passerby

It also worked this way:

$recsv = array();
foreach($csv as $lines){
    // Build the lookup key from the flattened segments and groups.
    $key = implode("+", $lines["segments"])."+".implode("+", $lines["groups"]);

    if(!isset($recsv[$key])){
        $recsv[$key] = $lines;
        $recsv[$key]["total"] = 0;
    }
    $recsv[$key]["total"]++;
}

What do you say?
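
One caveat, shown as a sketch with made-up data: because the key is built by joining values with "+", two different records can end up with the same key when a value itself contains "+". Encoding both sub-arrays with json_encode() is one possible alternative (a swapped-in technique, not the code above). The helper names implode_key and json_key are hypothetical, and json_encode() assumes the strings are valid UTF-8.

```php
<?php
// Made-up records: different segments that flatten to the same "+"-joined string.
$a = array("segments" => array("x+y"),
           "groups"   => array("estados" => "sp", "cidade" => ""));
$b = array("segments" => array("x", "y"),
           "groups"   => array("estados" => "sp", "cidade" => ""));

// Key built the same way as in the loop above.
function implode_key(array $lines)
{
    return implode("+", $lines["segments"]) . "+" . implode("+", $lines["groups"]);
}

// Alternative key that keeps the structure of both sub-arrays.
function json_key(array $lines)
{
    return json_encode(array($lines["segments"], $lines["groups"]));
}

var_dump(implode_key($a) === implode_key($b)); // bool(true): the keys collide
var_dump(json_key($a) === json_key($b));       // bool(false): the keys differ
```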

2 Comments

This is great since it uses the array key to detect duplicates, which might be faster; but then you're using implode to construct the key, and I'm not sure whether that has a performance impact. Anyway, your code looks cleaner (as long as you can ensure that ["segments"] and ["groups"] are both one-level arrays; I tried to avoid touching that part to keep my version more "universal").
@Passerby Thank you for your attention! I actually prefer your solution!