2

I've spent an inordinate amount of time this morning trying to Google what I thought would be a simple thing. I need to set up an SQL query that selects multiple columns, but only returns one instance if one of the columns (let's call it case_number) returns duplicate rows.

select case_number, name, date_entered from ticket order by date_entered

There are rows in the ticket table that have duplicate case_number, so I want to eliminate those duplicate rows from the results and only show one instance of them. If I use "select distinct case_number, name, date_entered" it applies the distinct operator to all three fields, instead of just the case_number field. I need that logic to apply to only the case_number field and not all three. If I use "group by case_number having count (*)>1" then it returns only the duplicates, which I don't want.

Any ideas on what to do here are appreciated, thank you so much!

7
  • What you are describing doesn't have any logic. To my understanding, keeping case_number distinct is all you need and you don't care what comes in name and date_entered columns, right? Commented Oct 15, 2021 at 16:55
  • I'd like to return all three columns so I can put in a spreadsheet, but eliminate duplicate rows that would be caused by duplicate rows with the case_number in the table. Hope that makes sense. Commented Oct 15, 2021 at 17:04
  • You should research windowing functions (SQL Partition by) in order to eliminate the duplicates. It should sort your duplicates by your preference and you can rank the sortig. Then only pick rank = 1. Commented Oct 15, 2021 at 17:08
  • @mo3bius, you didn't understand what I am saying. When there is a case_number duplicate, from which row the other column's content would come? If they were all same "select distinct ..." would be the answer. IOW, define your logic and add some sample data and expected output. Commented Oct 15, 2021 at 17:39
  • Why don't you have a central ticket table with case_number as the primary key? It seems you are actually dealing with a ticket_contents table, hence the multiple entries? Maybe you should either use another already existing table, or should consider a redesign of your schema. Commented Oct 15, 2021 at 17:57

1 Answer 1

3

You can use ROW_NUMBER(). For example

select *
from (
  select *,
    row_number() over(partition by case_number) as rn
) x
where rn = 1

The query above will pseudo-randomly pick one row for each case_number. If you want a better selection criteria you can add ORDER BY or window frames to the OVER clause.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks so much. I'm trying to get this syntax figured out for my specific application, but I think this is going to be it.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.