I want to remove duplicate records using Entity Framework.
This is what I've tried:
    var result = _context.History
        .GroupBy(s => new
        {
            s.Date,
            s.EventId
        })
        .SelectMany(grp => grp.Skip(1)).ToList();
    _context.History.RemoveRange(result);
    await _context.SaveChangesAsync();
But I get an error:
System.InvalidOperationException: Processing of the LINQ expression 'grp => grp.Skip(1)' by 'NavigationExpandingExpressionVisitor' failed. This may indicate either a bug or a limitation in EF Core
I understand that this is due to a breaking change in Entity Framework, but I really don't know how to update my code.
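One possible way to keep the original query shape is to do the grouping in memory rather than asking EF Core to translate it. This is only a minimal sketch: `HistoryRow` and `DuplicateFinder` are hypothetical names, the integer `Id` key and the `int` type of `EventId` are assumptions not shown in the question, and it assumes the table is small enough to load.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the History entity; Id is an assumed primary key.
public record HistoryRow(int Id, DateTime Date, int EventId);

public static class DuplicateFinder
{
    // Returns every row except the one with the lowest Id in each (Date, EventId) group.
    // This runs as LINQ to Objects, so Skip(1) inside SelectMany is never translated to SQL.
    public static List<HistoryRow> FindDuplicates(IEnumerable<HistoryRow> rows) =>
        rows.GroupBy(r => new { r.Date, r.EventId })
            .SelectMany(g => g.OrderBy(r => r.Id).Skip(1))
            .ToList();
}
```

Applied to the real entity, the same idea would mean calling `_context.History.AsEnumerable()` before `GroupBy`, so the rest of the query is evaluated on the client. That loads the rows into memory, which may be unacceptable for a large table.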
A query with ROW_NUMBER() would return all multiples, ranked by whatever sort order you want, allowing you to select which row to keep.
with dups as (select *, row_number() over (partition by date,eventid order by id desc) rn from...) delete dups where rn>1 will delete all duplicates except the largest id. The CTE doesn't need to return all columns; the key columns are enough. You can specify a different ORDER BY to select different rows to preserve.
If you replace "database agnostic" with "ANSI standard", then yes: ROW_NUMBER() is ANSI standard and is supported even in MySQL, since MySQL 8. All other major databases already had ROW_NUMBER().
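The ROW_NUMBER() approach from the comments could be run from EF Core as raw SQL, so the delete happens entirely on the server. The following is a sketch under stated assumptions, not a definitive implementation: the table name `History`, the columns `Id`, `Date` and `EventId`, and the SQL Server syntax for deleting through a CTE are all assumed, and `HistoryDeduplicator` is a hypothetical helper.

```csharp
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public static class HistoryDeduplicator
{
    // Assumed names: table History, columns Id, Date, EventId (SQL Server syntax).
    // Within each (Date, EventId) group the row with the largest Id gets rn = 1 and is kept.
    public const string DeleteDuplicatesSql = @"
WITH dups AS (
    SELECT Id,
           ROW_NUMBER() OVER (PARTITION BY [Date], EventId ORDER BY Id DESC) AS rn
    FROM History
)
DELETE FROM dups WHERE rn > 1;";

    // Executes the delete on the server and returns the number of rows affected.
    public static Task<int> DeleteDuplicatesAsync(DbContext context) =>
        context.Database.ExecuteSqlRawAsync(DeleteDuplicatesSql);
}
```

With the context from the question, this might be invoked as `await HistoryDeduplicator.DeleteDuplicatesAsync(_context);` in place of the `RemoveRange` and `SaveChangesAsync` calls. Changing the `ORDER BY` inside `ROW_NUMBER()` changes which row of each group survives.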