0

I have a table of products. These products have a product id. The product id can be shared by products. They are differentiated by the version id. For example bike has an id of 1. There are different kinds of bikes that share p_id of 1 but have v_id of 1-15. In the table there are many duplicates and i would like to find them. I made a query that finds the duplicates of the products. So i can see the count of versions of the product, but some of the products were inserted twice. I wrote a query to view the count of versions of the products, but now i would like to see all the duplicates. There is a column called product Value (the description) which is the indicator if its a duplicate.

Example

productKey  productValue    cout
16293   Bike    2
16292   Bike    2
16291   Bike    2
16290   Pads    2
16289   Pads    2
16288   Helmet  3
16286   Shoe    2

From here you can see bike and pads show up 3 times, but Helmet & Shoe only show up once. I would like to edit my duplicate query (or add to it), so that i can just get the items that show up more than once (aka Helmet and shoe would not appear ) The third column is the version, but that can be ignored.

Query

SELECT productKey, productValue, COUNT(*) as cout
FROM [Store].[dbo].[products]
GROUP BY productKey, productValue
HAVING COUNT(*) > 1
ORDER BY productKey DESC

Wanted result, something like this

productKey  productValue    cout
16293   Bike    2
16292   Bike    2
16291   Bike    2
16290   Pads    2
16289   Pads    2
8
  • Post the expected result Commented Sep 29, 2017 at 14:16
  • cout would be the versions Commented Sep 29, 2017 at 14:19
  • The query you wrote seems correct... Wht is the problem?? Commented Sep 29, 2017 at 14:19
  • @Rams, it inculcates all the products. There will be duplicate keys no matter what because of versions. So the query i wrote finds them. It does nothing about the actual duplicates like 3 bikes. Commented Sep 29, 2017 at 14:23
  • you are trying to query for duplicates of product key and thier version and then get a count of these? Commented Sep 29, 2017 at 14:28

6 Answers 6

1

You want to find products where two versions refer to the same thing. (At least this is what your query does.) And from these products that have duplicate versions you want to select only those products where the product title occurs in at least two problem products. That doesn't seem to make much sense, but here you go:

SELECT productKey, productValue, cout
FROM
(
  SELECT 
    productKey, 
    productValue, 
    COUNT(*) as cout,
    COUNT(*) OVER (PARTITION BY productValue) as cnt
  FROM Store.dbo.products
  GROUP BY productKey, productValue
  HAVING COUNT(*) > 1
) counted
WHERE cnt > 1
ORDER BY productKey DESC;
Sign up to request clarification or add additional context in comments.

1 Comment

This is really close, it clears out some of the no-duplicates, but still some remain (maybe something else is going on behind the scenes). I'll see if i cant edit this query to complete remove those duplicates.
1

Use Count() Over() window aggregate function

Select * from 
(
select *,
       cout = Count(1)over(partition by productValue)
from [Store].[dbo].[products]
) a
Wher cnt > 1

If you want to use Group By then you need sub-query

Select * 
from  [Store].[dbo].[products] 
where productValue in (SELECT productValue 
                       FROM [Store].[dbo].[products] 
                       GROUP BY productValue HAVING COUNT(*) > 1) 

2 Comments

Yes, I think John is mistaken when he thinks that his query can be built upon to achieve what he wants. It seems he is really merely looking for duplicate product values, so you did well in ignoring his elaboration :-)
@ThorstenKettner - yep, thats why I asked him for expected result. All he want to do is, return all the records of a productValue if it is present more than once
0
You can select duplicate rows in subquery and select them in your main query



  SELECT 
         productKey, productValue,cout
    FROM 
         [Store].[dbo].[products] p1
    WHERE 
         productKey IN( 
               SELECT productKey
               FROM [Store].[dbo].[products]
               GROUP BY productKey, productValue
               HAVING COUNT(*) > 1
               )

Comments

0

Here you go:

DECLARE @T TABLE (productKey INT, productValue VARCHAR(30), cout INT);
INSERT INTO @T VALUES
(16293,   'Bike',  2),
(16292,   'Bike',  2),
(16291,   'Bike',  2),
(16290,   'Pads',  2),
(16289,   'Pads',  2),
(16288,   'Helmet',  3),
(16286,   'Shoe',  2);

SELECT *
FROM @T
WHERE  productValue IN (SELECT productValue FROM (SELECT productValue,COUNT(productValue) N FROM @T GROUP BY productValue) AS T WHERE T.N > 1) ;

Result:

+------------+--------------+------+
| productKey | productValue | cout |
+------------+--------------+------+
|      16293 | Bike         |    2 |
|      16292 | Bike         |    2 |
|      16291 | Bike         |    2 |
|      16290 | Pads         |    2 |
|      16289 | Pads         |    2 |
+------------+--------------+------+

Comments

0

What you really want (from what I understand now): Duplicate product names, but ignoring duplicates within a product number, because these are versions of the same product and hence of course not considered a problem.

So you are looking for product names where you count more than one product number. Use COUNT(DISTINCT ProductKey) for this.

select * 
from
(
  select products.*, count(distinct productkey) over (partition by productValue) as cnt
  from products
) counted
where cnt > 1;

2 Comments

I get the following error, Incorrect syntax near 'distinct'. I know it says this is legal on the sql forms, but i cant get it to not give me this error. maybe it has something to do with the over
It turns out that COUNT OVER is very restricted in SQL Server and doesn't allow for DISTINCT. But there is an amazing solution for this. See here: stackoverflow.com/questions/11202878/…
0

Please try below SQL Query which could help you to achieve the result :

;WITH CTE
     AS (
     SELECT productKey,
            productValue,
            COUNT(*) OVER(PARTITION BY productValue) AS ActualCnt
     FROM products
     GROUP BY productKey,
              productValue)
     SELECT T.productKey,
            T.productValue,
            T.cout
     FROM CTE C
          INNER JOIN products T ON T.productValue = C.productValue
     WHERE C.ActualCnt > 1
     GROUP BY T.productKey,
              T.productValue,
              T.cout;

Desired Output :

productKey  productValue   cout
----------- ------------- -------
16289       Pads           2
16290       Pads           2
16291       Bike           2
16292       Bike           2
16293       Bike           2

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.