SQL Query for multiple columns with one column distinct

Question

I've spent an inordinate amount of time this morning trying to Google what I thought would be a simple thing. I need to set up an SQL query that selects multiple columns, but only returns one instance if one of the columns (let's call it case_number) returns duplicate rows.

select case_number, name, date_entered from ticket order by date_entered

There are rows in the ticket table that have duplicate case_number, so I want to eliminate those duplicate rows from the results and only show one instance of them. If I use "select distinct case_number, name, date_entered" it applies the distinct operator to all three fields, instead of just the case_number field. I need that logic to apply to only the case_number field and not all three. If I use "group by case_number having count (*)>1" then it returns only the duplicates, which I don't want.
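
To make the two behaviors described above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and names are hypothetical and not taken from the real ticket table. Only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data: case 101 appears twice with different names and dates.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticket (case_number INTEGER, name TEXT, date_entered TEXT)"
)
conn.executemany(
    "INSERT INTO ticket VALUES (?, ?, ?)",
    [
        (101, "alice", "2021-10-01"),
        (101, "bob", "2021-10-05"),
        (102, "carol", "2021-10-03"),
    ],
)

# DISTINCT compares whole rows, so both case 101 rows survive
# because their name and date_entered values differ.
distinct_rows = conn.execute(
    "SELECT DISTINCT case_number, name, date_entered "
    "FROM ticket ORDER BY date_entered"
).fetchall()

# GROUP BY ... HAVING COUNT(*) > 1 keeps only the duplicated case numbers.
dup_cases = conn.execute(
    "SELECT case_number FROM ticket "
    "GROUP BY case_number HAVING COUNT(*) > 1"
).fetchall()

print(distinct_rows)
print(dup_cases)
```

Neither query returns one row per case_number, which is the result being asked for.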

Any ideas on what to do here are appreciated, thank you so much!

What you are describing doesn't have any logic. To my understanding, keeping case_number distinct is all you need and you don't care what comes in name and date_entered columns, right?
– Cetin Basoz, Commented Oct 15, 2021 at 16:55
I'd like to return all three columns so I can put them in a spreadsheet, but eliminate the duplicate rows caused by repeated case_number values in the table. Hope that makes sense.
– mo3bius, Commented Oct 15, 2021 at 17:04
You should research window functions (SQL PARTITION BY) to eliminate the duplicates. You can sort the duplicates within each case_number by your preference, rank them, and then pick only rank = 1.
– NonProgrammer, Commented Oct 15, 2021 at 17:08
@mo3bius, you didn't understand what I am saying. When a case_number is duplicated, which row should the other columns' content come from? If the rows were all identical, "select distinct ..." would be the answer. In other words, define your logic and add some sample data and expected output.
– Cetin Basoz, Commented Oct 15, 2021 at 17:39
Why don't you have a central ticket table with case_number as the primary key? It seems you are actually dealing with a ticket_contents table, hence the multiple entries? Maybe you should either use another already existing table, or consider a redesign of your schema.
– Kaii, Commented Oct 15, 2021 at 17:57

The Impaler · Accepted Answer · 2021-10-15 17:24:51Z

3

You can use ROW_NUMBER(). For example:

select *
from (
  select t.*,
    row_number() over(partition by case_number) as rn
  from ticket t
) x
where rn = 1

The query above picks one arbitrary row for each case_number. If you want a specific selection criterion, add an ORDER BY to the OVER clause, for example ORDER BY date_entered DESC to keep the most recent row per case_number.
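
As a rough illustration of the ORDER BY variant, here is a minimal sketch run through Python's built-in sqlite3 module. The sample rows are hypothetical, and choosing the most recent date_entered as the tie-breaker is an assumption, not something the question specifies. It also assumes SQLite 3.25 or later, where window functions are available.

```python
import sqlite3

# Hypothetical sample data: case 101 appears twice.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticket (case_number INTEGER, name TEXT, date_entered TEXT)"
)
conn.executemany(
    "INSERT INTO ticket VALUES (?, ?, ?)",
    [
        (101, "alice", "2021-10-01"),
        (101, "bob", "2021-10-05"),
        (102, "carol", "2021-10-03"),
    ],
)

# Number the rows within each case_number, newest first, then keep row 1.
# The ORDER BY inside OVER makes the choice of row deterministic.
query = """
SELECT case_number, name, date_entered
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY case_number
               ORDER BY date_entered DESC
           ) AS rn
    FROM ticket AS t
) AS x
WHERE rn = 1
ORDER BY date_entered
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

Each case_number now appears exactly once, and the inner ORDER BY decides which of the duplicate rows supplies name and date_entered. Swapping DESC for ASC would keep the earliest row instead.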

edited Oct 15, 2021 at 17:24

answered Oct 15, 2021 at 16:55


1 Comment

mo3bius Over a year ago

Thanks so much. I'm trying to get this syntax figured out for my specific application, but I think this is going to be it.
