I'm trying to return the rows for OrderIds that share a value in the Notes column, i.e. different OrderIds that have the same Notes value.

I am using this query:

SELECT 
    [OrderId],
    [Notes]
FROM
    [ord].[LineItems]
GROUP BY 
    OrderId, Notes
HAVING 
    COUNT(Notes) > 1;

The problem with the above query is that it returns false positives, i.e. rows whose Notes value is duplicated only within the same OrderId. (The same OrderId and Notes pair can appear on several rows, because those rows differ in other fields.)

I only want it to return unique OrderIds which have the duplicated values in the Notes column. How can I remove these false-positives?

Sorry this has been hard to explain, and the database I'm working with isn't well normalized.

2 Answers

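You should aggregate by Notes alone and then check whether each note belongs to more than one order. Comparing MAX(OrderId) with MIN(OrderId) does this, since the two differ only when a note appears under at least two distinct OrderIds: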

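WITH cte AS (
    -- Notes values that appear under more than one OrderId
    SELECT Notes
    FROM [ord].[LineItems]
    GROUP BY Notes
    HAVING MAX(OrderId) <> MIN(OrderId)
)

-- Return every line item whose Notes value is shared across orders
SELECT *
FROM [ord].[LineItems] li
WHERE EXISTS (SELECT 1 FROM cte t WHERE t.Notes = li.Notes);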
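
If the "distinct count of orders" should be spelled out literally, a variant along these lines may read more directly. It is only a sketch: it assumes the [ord].[LineItems] table from the question, the CTE name SharedNotes is made up for illustration, and it returns one row per OrderId and Notes pair rather than every matching line item.

```sql
-- Sketch only: assumes the [ord].[LineItems] table from the question.
-- SharedNotes is a hypothetical name chosen for this example.
WITH SharedNotes AS (
    -- Keep Notes values that occur under more than one distinct OrderId
    SELECT Notes
    FROM [ord].[LineItems]
    GROUP BY Notes
    HAVING COUNT(DISTINCT OrderId) > 1
)
-- DISTINCT collapses repeated (OrderId, Notes) rows into one
SELECT DISTINCT li.OrderId, li.Notes
FROM [ord].[LineItems] AS li
INNER JOIN SharedNotes AS sn
    ON sn.Notes = li.Notes
ORDER BY li.Notes, li.OrderId;
```

Rows where Notes is NULL are not returned, because the join condition never matches NULL to NULL.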

You only need EXISTS to get all the duplicate Notes:

SELECT o.[OrderId], o.[Notes]
FROM [ord].[LineItems] o
WHERE EXISTS (
  SELECT 1 FROM [ord].[LineItems]
  WHERE [Notes] = o.[Notes] AND [OrderId] <> o.[OrderId]
)
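
Because the question mentions that the same OrderId and Notes pair can occur on several rows, the query above may return that pair more than once. A possible adjustment, sketched under the assumption that only the OrderId and Notes columns are needed, is to add DISTINCT. The alias dup is hypothetical and used only for readability.

```sql
-- Sketch only: same EXISTS test as above, with DISTINCT added so that
-- repeated (OrderId, Notes) rows are returned once.
SELECT DISTINCT o.[OrderId], o.[Notes]
FROM [ord].[LineItems] AS o
WHERE EXISTS (
    SELECT 1
    FROM [ord].[LineItems] AS dup   -- dup: hypothetical alias
    WHERE dup.[Notes] = o.[Notes]
      AND dup.[OrderId] <> o.[OrderId]
);
```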
