I am trying to do a MySQL SELECT with two tables that results in showing columns from each of the tables only when there are multiple rows with two of the columns have duplicated values.
The following SELECT statement gives customer_id, order_id, product_ids and skus for all orders placed by a customer...
select orders2.customer_id, items2.order_id, items2.product_id, items2.sku
from orders orders2, order_items items2
where orders2.status = 'complete'
and orders2.customer_id is not null
and orders2.entity_id = items2.order_id
order by orders2.customer_id ASC, items2.product_id ASC;
Giving me results of...
+-------------+----------+------------+--------------------------+
| customer_id | order_id | product_id | sku |
+-------------+----------+------------+--------------------------+
| 29813 | 38025 | 306 | BB_MAT101 |
| 29813 | 38027 | 309 | BB_MAT250 |
| 29814 | 28844 | 302 | BB_ENG101 |
| 29814 | 27615 | 384 | BB_MS-ACC101 |
| 29814 | 27616 | 385 | BB_MS-ACC102 |
| 29814 | 27615 | 385 | BB_MS-ACC102 |
| 29814 | 27614 | 409 | BB_MS-MAT101 |
| 29814 | 27584 | 410 | BB_MS-MAT150 |
| 29815 | 27592 | 384 | BB_MS-ACC101 |
| 29815 | 27593 | 384 | BB_MS-ACC101 |
| 29815 | 27594 | 384 | BB_MS-ACC101 |
| 29815 | 27599 | 385 | BB_MS-ACC102 |
| 29815 | 27592 | 402 | BB_MS-ECON101 |
| 29815 | 27593 | 402 | BB_MS-ECON101 |
| 29815 | 27594 | 402 | BB_MS-ECON101 |
| 29815 | 27596 | 403 | BB_MS-ECON102 |
| 29815 | 27598 | 404 | BB_MS-ENG099 |
| 29815 | 27588 | 405 | BB_MS-ENG101 |
| 29815 | 27595 | 406 | BB_MS-ENG102 |
| 29815 | 27589 | 408 | BB_MS-MAT099 |
| 29815 | 27585 | 409 | BB_MS-MAT101 |
| 29815 | 27589 | 410 | BB_MS-MAT150 |
| 29815 | 27589 | 411 | BB_MS-MAT201 |
+-------------+----------+------------+--------------------------+
The following SELECT lets me determine those cases where a customer has ordered a product more than once...
select orders1.customer_id as dupe_customer_id, items1.product_id as dupe_product_id, count(*) as duplicates
from orders orders1, order_items items1
where orders1.status = 'complete'
and orders1.customer_id is not null
and orders1.entity_id = items1.order_id
group by orders1.customer_id, items1.product_id
having duplicates > 1;
With results of...
+------------------+-----------------+------------+
| dupe_customer_id | dupe_product_id | duplicates |
+------------------+-----------------+------------+
| 29814 | 385 | 2 |
| 29815 | 384 | 3 |
| 29815 | 402 | 3 |
+------------------+-----------------+------------+
What I am trying to puzzle through is how to combine these so that I only get those items in the FIRST select where they meet the conditions of the second select, such that the output would be something like...
+-------------+----------+------------+--------------------------+
| customer_id | order_id | product_id | sku |
+-------------+----------+------------+--------------------------+
| 29814 | 27616 | 385 | BB_MS-ACC102 |
| 29814 | 27615 | 385 | BB_MS-ACC102 |
| 29815 | 27592 | 384 | BB_MS-ACC101 |
| 29815 | 27593 | 384 | BB_MS-ACC101 |
| 29815 | 27594 | 384 | BB_MS-ACC101 |
| 29815 | 27592 | 402 | BB_MS-ECON101 |
| 29815 | 27593 | 402 | BB_MS-ECON101 |
| 29815 | 27594 | 402 | BB_MS-ECON101 |
+-------------+----------+------------+--------------------------+
Note that the main criteria of the result set is that any combination of customer_id AND product_id MUST occur more than once.
I'm struggling with how to combine them. My attempts have resulted in one (or more) of the columns have incorrect data repeated across every row in the results.
I've hit the wall with my MySQL knowledge on doing this type of SELECT and searches over the past hour have not yielded anything.
select a.col1, b.col2 ... from table a ... join (select ... ) as b ...works in MySQL, as far as I know. Probably quite inefficient, but once you've got a working query, you can work on indexes, if needed.