I have an array and, using PHP, I need to remove every element whose "listingCode" is NOT duplicated elsewhere in the array. For instance:

Array
(
    [0] => Array
    (
        [name] => Supplier A
        [listingCode] => ABC
    )
    [1] => Array
    (
        [name] => Supplier B
        [listingCode] => ABC
    )
    [2] => Array
    (
        [name] => Supplier B
        [listingCode] => DEF
    )
    [3] => Array
    (
        [name] => Supplier C
        [listingCode] => XYZ
    )
    [4] => Array
    (
        [name] => Supplier D
        [listingCode] => BBB
    )
    [5] => Array
    (
        [name] => Supplier E
        [listingCode] => ABCDEF
    )
    [6] => Array
    (
        [name] => Supplier F
        [listingCode] => ABCDEF
    )
)

I have 1.2M records in this array. When all is said and done, I just want elements 0, 1, 5, and 6 left in the array. Is this possible?

All of this data comes from 3 tables. I only want to display suppliers whose listing code is duplicated. For instance, listings 1, 2, 6, and 7 have duplicate listing codes, so display Suppliers A, B, E, and F.

Supplier
----------------------
ID| Supplier Name
1 | Supplier A
2 | Supplier B
3 | Supplier B
4 | Supplier C
5 | Supplier D
6 | Supplier E
7 | Supplier F

Product
----------------------
ID| Product Name | Supplier ID
1 | ABC          | 1
2 | DEF          | 2
3 | GHI          | 3
4 | JKL          | 4
5 | MNO          | 5
6 | PQR          | 6 
7 | STU          | 7

Listing
----------------------
ID| Listing Code | Product ID
1 | ABC          | 1
2 | ABC          | 2
3 | DEF          | 3
4 | XYZ          | 4
5 | BBB          | 5
6 | ABCDEF       | 6 
7 | ABCDEF       | 7

Thanks

  • Have you tried something? Also, why not keep elements 3 and 4? Commented Nov 19, 2015 at 22:53
  • Duplicates can only be in the "productName". Fixed my post. I haven't tried anything; I don't know where to start. Commented Nov 19, 2015 at 22:57
  • I still don't quite get it. The product name of 0 and 1 is the same, so why do you want to keep them? Use Google, use the manual, and try some code until you get stuck. Commented Nov 19, 2015 at 23:01
  • You don't think I've Googled anything or tried anything? I am stuck, thus the post. There's no reason to post miles of crappy PHP code and clutter up something that is pretty straightforward to explain yet complicated to accomplish. "ABC" and "ABCDEF" each appear twice, therefore 0, 1, 5, and 6 need to be kept and everything else trashed. I can't be any clearer. If you have a Stack Overflow post I can reference, I'm all for trying it out. Commented Nov 19, 2015 at 23:06
  • Is this data coming from a database? If that is the case, it will probably be easier to do this in your query, and almost definitely faster. Commented Nov 19, 2015 at 23:10

2 Answers

3

array_filter() is a standard PHP function that returns the subset of an array's values for which a callback returns true:

$data = [
    ['name' => 'Supplier A', 'productName' => 'ABC'],
    ['name' => 'Supplier B', 'productName' => 'ABC'],
    ['name' => 'Supplier B', 'productName' => 'DEF'],
    ['name' => 'Supplier C', 'productName' => 'XYZ'],
    ['name' => 'Supplier D', 'productName' => 'BBB'],
    ['name' => 'Supplier E', 'productName' => 'ABCDEF'],
    ['name' => 'Supplier F', 'productName' => 'ABCDEF']
];

$result = array_filter(
    $data,
    function($value) use ($data) {
        return count(array_filter(
            $data,
            function ($match) use ($value) {
                return $match['productName'] === $value['productName'];
            }
        )) > 1;
    }
);
var_dump($result);

This loops through each array element in turn, executing a callback that counts how many elements in the original array share its productName. The callback returns true when more than one element matches, which tells array_filter() to keep that element.

And yes, it does preserve the original keys.


However, an array with 1.2M records takes up an enormous amount of PHP's memory, and because the nested filter compares every element against every other element, the filtering will be incredibly slow at that volume of data. It would be far better to do this via SQL.
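If the data does have to be filtered in PHP, the pairwise comparison can be avoided by counting each code once up front. The following is a minimal sketch, not a tuned implementation: it assumes the rows use the listingCode key shown in the question (swap in productName if that is what your rows contain) and that the codes are strings or integers, since array_count_values() only counts those.

```php
<?php
// Sample rows from the question; the real array would hold about 1.2M of these.
$data = [
    ['name' => 'Supplier A', 'listingCode' => 'ABC'],
    ['name' => 'Supplier B', 'listingCode' => 'ABC'],
    ['name' => 'Supplier B', 'listingCode' => 'DEF'],
    ['name' => 'Supplier C', 'listingCode' => 'XYZ'],
    ['name' => 'Supplier D', 'listingCode' => 'BBB'],
    ['name' => 'Supplier E', 'listingCode' => 'ABCDEF'],
    ['name' => 'Supplier F', 'listingCode' => 'ABCDEF'],
];

// First pass: count how many times each listing code occurs.
$counts = array_count_values(array_column($data, 'listingCode'));

// Second pass: keep only rows whose code occurs more than once.
// array_filter() preserves the original keys.
$result = array_filter($data, function ($row) use ($counts) {
    return $counts[$row['listingCode']] > 1;
});

print_r(array_keys($result)); // keys 0, 1, 5, 6
```

This makes two linear passes instead of one full scan per element. Wrap the result in array_values() if you want the keys renumbered from 0.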


1

This doesn't exactly answer your question, but I decided to try to offer an alternative approach that will generate a data structure that may be more usable.

// Group supplier names under the product name they share.
$products = [];
foreach ($supplier_products as $item) {
    $products[$item['productName']][] = $item['name'];
}

This will yield an array with product names as keys and, for each product name, an array of its suppliers as the value. Then, if you want only the products with multiple suppliers, you can count the suppliers in array_filter():

$duplicate_products = array_filter($products, function($product) {
    return count($product) > 1; 
});

This will end up with an array like:

Array ( 
    [ABC] => Array ( 
        [0] => Supplier A 
        [1] => Supplier B 
    )
    [ABCDEF] => Array (
        [0] => Supplier E 
        [1] => Supplier F
    )
)

which, granted, is not exactly what you asked for, but in my opinion will be easier to work with.
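If the original rows are still needed afterwards, the same grouping idea can carry the whole row along with its original index and then flatten the surviving groups. This is only a sketch under assumptions: it keys on listingCode as in the question (the snippet above uses productName), and $grouped, $duplicates, and $flat are names made up for illustration.

```php
<?php
$supplier_products = [
    ['name' => 'Supplier A', 'listingCode' => 'ABC'],
    ['name' => 'Supplier B', 'listingCode' => 'ABC'],
    ['name' => 'Supplier B', 'listingCode' => 'DEF'],
    ['name' => 'Supplier C', 'listingCode' => 'XYZ'],
    ['name' => 'Supplier D', 'listingCode' => 'BBB'],
    ['name' => 'Supplier E', 'listingCode' => 'ABCDEF'],
    ['name' => 'Supplier F', 'listingCode' => 'ABCDEF'],
];

// Group whole rows by code, remembering each row's original index.
$grouped = [];
foreach ($supplier_products as $index => $item) {
    $grouped[$item['listingCode']][$index] = $item;
}

// Keep only the codes that more than one row shares.
$duplicates = array_filter($grouped, function ($rows) {
    return count($rows) > 1;
});

// Flatten back to a single list; the + operator keeps the original indexes.
$flat = [];
foreach ($duplicates as $rows) {
    $flat += $rows;
}
ksort($flat);

print_r(array_keys($flat)); // keys 0, 1, 5, 6
```

The intermediate $duplicates array is the grouped form shown above, so both shapes are available from one pass over the data.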


After your edit, I think this query will get you a list of suppliers with duplicate listing codes:

SELECT
    s.supplier_name
FROM
    listing l1
    INNER JOIN listing l2 ON l1.listing_code = l2.listing_code AND l1.id != l2.id
    INNER JOIN product p ON l1.product_id = p.id
    INNER JOIN supplier s ON p.supplier_id = s.id
GROUP BY
    s.supplier_name
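An alternative to the self-join is to pick out the duplicated codes with GROUP BY ... HAVING and return one row per listing, which gives the same name/listingCode pairs as the array in the question. The sketch below runs that query from PHP against a throwaway in-memory SQLite database so it can be tried as-is. It assumes the pdo_sqlite extension is available, and the table and column names (listing, listing_code, product_id, supplier_name, and so on) are borrowed from the query above, which are themselves guesses at the real schema.

```php
<?php
// Hypothetical in-memory copy of the three tables from the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE supplier (id INTEGER PRIMARY KEY, supplier_name TEXT)');
$pdo->exec('CREATE TABLE product (id INTEGER PRIMARY KEY, product_name TEXT, supplier_id INTEGER)');
$pdo->exec('CREATE TABLE listing (id INTEGER PRIMARY KEY, listing_code TEXT, product_id INTEGER)');

$pdo->exec("INSERT INTO supplier VALUES (1,'Supplier A'),(2,'Supplier B'),(3,'Supplier B'),
    (4,'Supplier C'),(5,'Supplier D'),(6,'Supplier E'),(7,'Supplier F')");
$pdo->exec("INSERT INTO product VALUES (1,'ABC',1),(2,'DEF',2),(3,'GHI',3),
    (4,'JKL',4),(5,'MNO',5),(6,'PQR',6),(7,'STU',7)");
$pdo->exec("INSERT INTO listing VALUES (1,'ABC',1),(2,'ABC',2),(3,'DEF',3),
    (4,'XYZ',4),(5,'BBB',5),(6,'ABCDEF',6),(7,'ABCDEF',7)");

// The subquery finds codes used by more than one listing;
// the outer query joins each such listing to its supplier.
$sql = <<<'SQL'
SELECT s.supplier_name AS name, l.listing_code AS listingCode
FROM listing l
INNER JOIN product p ON l.product_id = p.id
INNER JOIN supplier s ON p.supplier_id = s.id
WHERE l.listing_code IN (
    SELECT listing_code FROM listing GROUP BY listing_code HAVING COUNT(*) > 1
)
ORDER BY l.id
SQL;

$rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
print_r($rows); // Supplier A/ABC, Supplier B/ABC, Supplier E/ABCDEF, Supplier F/ABCDEF
```

Against a real database only the SELECT statement matters. Whether it outperforms the self-join depends on the engine and on whether listing_code is indexed, so it is worth checking the query plan on the actual tables.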

1 Comment

You're my coding savior! Thanks so much.
