0

If I have a table like this

fid name   date
---------------------
1   John1  2020-10-08
1   John2  2020-10-08
1   John3  2018-06-04
2   Tom1   2019-10-08

I want to preserve the row for each fid that has the most recent date. However, if there are multiple, keep only 1 (any of them is fine). So the final result should end up like

fid name   date
---------------------
1   John1  2020-10-08
2   Tom1   2019-10-08

Does anyone know how to do this in SQL Server? I use v14 (2017) if that matters.

Problem is if I group by fid and do Max(date), I get 1 record per fid, but then when I left join on it to get the other columns it gives me back 2 records since the most recent date shows up twice.

2
  • First it sounds like you want to remove rows from your table ("remove duplicates"), then it sounds like you merely want to select rows without duplicates ("to get the other columns"). Which do you want? DELETE or SELECT? Commented Oct 22, 2021 at 6:53
  • Not delete original data, just get a query that doesn't include the ones I want gone. So I want to select. Commented Oct 22, 2021 at 13:51

1 Answer 1

9

We can use a deletable CTE along with ROW_NUMBER here:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY fid ORDER BY date DESC, name) rn
    FROM yourTable
)

DELETE
FROM cte
WHERE rn > 1;

The above logic will assign rn = 1 (i.e. spare) the record with the most recent date, per group of fid records. Should two records with the same fid also have the same latest date, then it spares the earlier name.

Sign up to request clarification or add additional context in comments.

3 Comments

Note that I want to just select, not delete any actual data.
@omega: Then replace DELETE with SELECT * (or rather the columns you want to select).
@omega Then use the same CTE but do SELECT * FROM cte WHERE rn = 1

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.