0

The following script removes duplicates from an array based on a single key. I found it in this reference: remove duplicates from array (array unic by key)

The problem I have is that the $initial_data array may contain items with the same [Post_Date] values but different [Item_Title] values.

Is there a way to modify the code so that it only removes duplicates if both the [Post_Date] and [Item_Title] values are identical?

    // Remove duplicates based on 'Post_Date'
    $_data = array();
    foreach ($initial_data as $v) {
        if (isset($_data[$v['Post_Date']])) {
            continue;
        }
        $_data[$v['Post_Date']] = $v;
    }
    // if you need a zero-based array, otherwise work with $_data
    $unique_results = array_values($_data);

Below is a simplified output of the arrays showing 4 fields. The original arrays contain 16 fields.

$initial_data: Original Data Array. The [Post_Date] values are the same but the [Item_Title] values are different.

Array
(
    [0] => Array
        (
            [id] => 22000
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Painting
        )

    [1] => Array
        (
            [id] => 22102
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Repair

        )
...
)

$_data: The $_data array from within the script

Array
(
    [1356373690] => Array
        (
            [id] => 22000
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Painting
        )

    [1356373690] => Array
        (
            [id] => 22102
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Repair

        )
...
)

$unique_results: The final unique results array. As you can see, the script removed the second item as a duplicate based on [Post_Date] alone. I need it to also compare the [Item_Title] values, so that items with the same [Post_Date] but different [Item_Title] values are not treated as duplicates.

Array
(
    [0] => Array
        (
            [id] => 22000
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Painting
        )
...
)
1
  • @Mike Brant I have tried nothing else. The code above works well for 1 key but I need it modified for 2 keys. I also tried the two suggestions below and both did not work. Commented Dec 25, 2012 at 0:16

2 Answers

1

The easiest way, I suppose, is to use a simple concatenation of these two properties as the key for the $_data hash:

$key = $v['Post_Date'] . $v['Item_Title'];
if (isset($_data[$key])) {
  continue;
} 
$_data[$key] = $v;

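This won't work if Post_Date and Item_Title values can 'overlap', that is, if two different pairs concatenate to the same string (for example, 123 . '4 Doors' and 1234 . ' Doors'), though that does not seem possible given the sample. To prevent this, you can insert a separator symbol in that $key, like this: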

$key = $v['Post_Date'] . ':' . $v['Item_Title'];

... since a colon will never appear in a timestamp string.
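Putting the pieces together, here is a minimal sketch of the whole loop with the separator-based key. The sample rows are modeled on the question's data; the third row (id 22103) is a made-up exact duplicate added only to show that a true duplicate is still dropped.

```php
<?php
// Sample rows modeled on the question; id 22103 is a hypothetical duplicate.
$initial_data = array(
    array('id' => 22000, 'Category' => 'vehicles', 'Post_Date' => 1356373690, 'Item_Title' => 'Car Painting'),
    array('id' => 22102, 'Category' => 'vehicles', 'Post_Date' => 1356373690, 'Item_Title' => 'Car Repair'),
    array('id' => 22103, 'Category' => 'vehicles', 'Post_Date' => 1356373690, 'Item_Title' => 'Car Painting'),
);

$_data = array();
foreach ($initial_data as $v) {
    // Composite key: timestamp, separator, title.
    $key = $v['Post_Date'] . ':' . $v['Item_Title'];
    if (isset($_data[$key])) {
        continue; // same date and same title already stored
    }
    $_data[$key] = $v;
}

// Re-index to a zero-based array.
$unique_results = array_values($_data);

foreach ($unique_results as $row) {
    echo $row['id'], ' ', $row['Item_Title'], "\n";
}
```

With this sample, 'Car Painting' and 'Car Repair' are both kept even though they share a Post_Date, and only the repeated 'Car Painting' row is skipped.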

0

You could solve this with a nested loop:

$uniqueData = array();
foreach ($initialData as $item) {
    $exists = false;

    // check if same item was already added to uniqueData array
    foreach ($uniqueData as $uniqueItem) {
        if ($item['postDate'] == $uniqueItem['postDate'] && $item['itemTitle'] == $uniqueItem['itemTitle']) {
            $exists = true;
        }
    }

    // there is no same item in uniqueData array
    if (!$exists) {
        $uniqueData[] = $item;
    }
}

print_r($uniqueData);

As a side note, in most cases it's best to avoid the continue statement, as it makes your code harder to read.

4 Comments

This is inefficient, to say the least. First, you skip the possibility of using hash function for the fast lookup of existing items. Second, you didn't break the loop when the item is found (so each search will go through the whole $uniqueData array again and again). Finally, your statement on continue is... weird, to say the least: there's nothing wrong or 'unreadable' with this op by default, it all depends on how it's used.
@kustrle I tried it and it took about 30 seconds to process, ending up with an empty array, Array(). I substituted your variable $initialData with my array $results, your postDate with my Post_Date, and your itemTitle with my Item_Title in your code.
I ran it with 4 items and it works fine and without delay. @raina77ow He didn't ask for an efficient solution. Premature optimization is the root of all evil. If he doesn't have lots of items, the code will run just fine. As for the continue statement, I can use your answer as an example. Writing if (!isset($_data[$key])) $_data[$key] = $v; looks much cleaner than the version with continue.
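For reference, the two variants discussed in these comments might look like the sketch below: the nested loop with an early break, and the hash lookup written without continue. The function names dedupe_nested and dedupe_hash are hypothetical, the key names follow the question rather than the answer above, and the third sample row is a made-up duplicate.

```php
<?php
// Hypothetical helper: nested-loop version that stops scanning at the first match.
function dedupe_nested(array $rows) {
    $unique = array();
    foreach ($rows as $row) {
        $exists = false;
        foreach ($unique as $kept) {
            if ($row['Post_Date'] == $kept['Post_Date'] && $row['Item_Title'] == $kept['Item_Title']) {
                $exists = true;
                break; // no need to look at the remaining items
            }
        }
        if (!$exists) {
            $unique[] = $row;
        }
    }
    return $unique;
}

// Hypothetical helper: hash-lookup version, written without continue.
function dedupe_hash(array $rows) {
    $seen = array();
    foreach ($rows as $row) {
        $key = $row['Post_Date'] . ':' . $row['Item_Title'];
        if (!isset($seen[$key])) {
            $seen[$key] = $row;
        }
    }
    return array_values($seen);
}

// Sample rows modeled on the question; the third row is a hypothetical duplicate.
$rows = array(
    array('id' => 22000, 'Post_Date' => 1356373690, 'Item_Title' => 'Car Painting'),
    array('id' => 22102, 'Post_Date' => 1356373690, 'Item_Title' => 'Car Repair'),
    array('id' => 22103, 'Post_Date' => 1356373690, 'Item_Title' => 'Car Painting'),
);

// Both helpers keep the first occurrence of each (Post_Date, Item_Title) pair.
var_dump(dedupe_nested($rows) == dedupe_hash($rows));
```

Both versions return the same rows for this input. The nested loop compares each item against every item kept so far, while the hash version does a single key lookup per item, which is the efficiency difference raised in the first comment.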
