Selecting unique rows based on multiple columns but not all

Question

I am trying to query a table so that duplicates don't show up, where a duplicate is a row that matches another row on two columns (MODEL_NUMBER and YEAR_INTRODUCED).

Right now my query looks like this:

cmd = @"Select * From ARCHIVE_DECADE_TBL WHERE DECADE_" + decade + @"=@decade AND PRODUCT_LINE=@Line AND QUANTITY is not null AND QUANTITY <> 0 ORDER BY PRODUCT_NAME;";

Here is the table layout:

ARCHIVE_ID    MODEL_NUMBER    YEAR_INTRODUCED    LOCATION
1001          B10             1989               SKID 43
1002          B10             1989               SKID 48
1003          B10             1989               SKID 73

ARCHIVE_ID is the primary key. Should I use a GROUP BY? If I do, which ARCHIVE_ID would be kept?

Which archive_id stays depends on the aggregate function you use (min, max, count, ...).
– maraca, Commented Jun 20, 2016 at 22:13

John Wu · Accepted Answer · 2016-06-20 23:48:58Z

It depends on the result set you want.

If the result set only needs to contain MODEL_NUMBER and YEAR_INTRODUCED, you can simply use DISTINCT:

SELECT DISTINCT 
     MODEL_NUMBER, 
     YEAR_INTRODUCED 
FROM ARCHIVE_DECADE_TBL

If you want the result set to include other columns, you have to decide which values you want to show up. Since the result has only one row per unique MODEL_NUMBER and YEAR_INTRODUCED pairing, you can only show one value from each of the other columns. Which one do you want to show up? And do the values need to come from the same row?

You could do something like

SELECT   MIN(ARCHIVE_ID), 
         MODEL_NUMBER, 
         YEAR_INTRODUCED, 
         MIN(LOCATION)
FROM     ARCHIVE_DECADE_TBL 
GROUP BY MODEL_NUMBER, 
         YEAR_INTRODUCED

...if you don't care if the values come from the same row.
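To illustrate what "not from the same row" means, here is a minimal sketch using Python's built-in sqlite3 module. The two rows are hypothetical (the locations are deliberately swapped relative to the question's sample data), and the aliases MIN_ARCHIVE_ID and MIN_LOCATION are names chosen for this example only.

```python
import sqlite3

# In-memory database with the table layout from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ARCHIVE_DECADE_TBL ("
    "ARCHIVE_ID INTEGER PRIMARY KEY, MODEL_NUMBER TEXT, "
    "YEAR_INTRODUCED INTEGER, LOCATION TEXT)"
)
# Hypothetical data: the lowest ARCHIVE_ID does not have the lowest LOCATION.
conn.executemany(
    "INSERT INTO ARCHIVE_DECADE_TBL VALUES (?, ?, ?, ?)",
    [(1001, "B10", 1989, "SKID 73"),
     (1002, "B10", 1989, "SKID 43")],
)

# Each MIN() is evaluated independently within the group.
mixed_rows = conn.execute(
    """
    SELECT   MIN(ARCHIVE_ID) AS MIN_ARCHIVE_ID,
             MODEL_NUMBER,
             YEAR_INTRODUCED,
             MIN(LOCATION)   AS MIN_LOCATION
    FROM     ARCHIVE_DECADE_TBL
    GROUP BY MODEL_NUMBER,
             YEAR_INTRODUCED
    """
).fetchall()
print(mixed_rows)  # [(1001, 'B10', 1989, 'SKID 43')]
```

The result pairs ARCHIVE_ID 1001 with SKID 43, a combination that does not exist in any single row of the table.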

If you do care, you have to do something a little more complicated, such as

SELECT A.* 
FROM   ARCHIVE_DECADE_TBL A
JOIN   (SELECT   MIN(ARCHIVE_ID) AS MIN_ARCHIVE_ID, 
                 MODEL_NUMBER, 
                 YEAR_INTRODUCED 
        FROM     ARCHIVE_DECADE_TBL 
        GROUP BY MODEL_NUMBER, 
                 YEAR_INTRODUCED) B
ON     A.ARCHIVE_ID = B.MIN_ARCHIVE_ID
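As a quick check of the join approach, the sketch below runs it against the three sample rows from the question, using Python's built-in sqlite3 module. It assumes the subquery's minimum is given an alias (MIN_ARCHIVE_ID, a name picked for this example) so the ON clause has a column to refer to. Other database engines may need small syntax changes.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ARCHIVE_DECADE_TBL ("
    "ARCHIVE_ID INTEGER PRIMARY KEY, MODEL_NUMBER TEXT, "
    "YEAR_INTRODUCED INTEGER, LOCATION TEXT)"
)
conn.executemany(
    "INSERT INTO ARCHIVE_DECADE_TBL VALUES (?, ?, ?, ?)",
    [(1001, "B10", 1989, "SKID 43"),
     (1002, "B10", 1989, "SKID 48"),
     (1003, "B10", 1989, "SKID 73")],
)

# The subquery picks the lowest ARCHIVE_ID per (MODEL_NUMBER, YEAR_INTRODUCED);
# joining back on that id returns the complete row it belongs to.
kept_rows = conn.execute(
    """
    SELECT A.*
    FROM   ARCHIVE_DECADE_TBL A
    JOIN   (SELECT   MIN(ARCHIVE_ID) AS MIN_ARCHIVE_ID,
                     MODEL_NUMBER,
                     YEAR_INTRODUCED
            FROM     ARCHIVE_DECADE_TBL
            GROUP BY MODEL_NUMBER,
                     YEAR_INTRODUCED) B
    ON     A.ARCHIVE_ID = B.MIN_ARCHIVE_ID
    """
).fetchall()
print(kept_rows)  # [(1001, 'B10', 1989, 'SKID 43')]
```

Because the join is on the primary key, every column in the output comes from one real row.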


2 Comments

Ben Cavenagh Over a year ago

When I try the second option, the query changes the column name from LOCATION to Expr1000. Do you know how I can solve that?

John Wu Over a year ago

You need to add a column alias. For example, if you want to name the column MIN_LOCATION, you would use SELECT MIN(LOCATION) AS MIN_LOCATION FROM etc.
