0

Working in C# with a custom object that has multiple rows and columns.

I need to remove duplicate records where some (but not all) of the column values match, including positive and negative values of the same number.

This is to identify orders that have been charged and reversed. So the data would be like so:

CustomerId     OrderNo     OrderDt         Qty    ProductNo
1234123        6098        10/17/2025      2      32166
1234123        6098        10/17/2025     -2      32166
1234123        6098        10/17/2025      2      32166
9187324        6457        10/16/2025      10     97846
2087930        9875        10/11/2025      5      65655
2093483        3165        10/13/2025      3      89645
2093483        3165        10/13/2025     -3      89645
9784652        9784        10/14/2025      6      13246

And what I want the final object to contain would be:

CustomerId     OrderNo     OrderDt         Qty    ProductNo
1234123        6098        10/17/2025      2      32166
9187324        6457        10/16/2025      10     97846
2087930        9875        10/11/2025      5      65655
9784652        9784        10/14/2025      6      13246

(There are other fields not relevant to duplicate status - address, phone #, etc.)

Where the four values are identical and one quantity is a positive and one a negative of the same value, I want both rows removed.

If the item is charged, reversed, and then charged again I want to keep one row - rows will only be removed in sets of 2 matches.

I'm grouping and selecting by the customer #, order #, order date, and qty (thanks to Jon Skeet). I assume I need to make a copy of the object for comparison, but I'm at a loss as to how to remove both rows instead of just the one.

        var customerId = orderData.FindField("CustomerID");
        var orderNo = orderData.FindField("OrderNo");
        var orderDate = orderData.FindField("OrderDt");
        var qty = orderData.FindField("Qty");

        var distinctOrders = orderData.Rows
            .GroupBy(x => new
                { CustomerID = x.GetDataById(customerId.Id).ToString(),
                    OrderNo = x.GetDataById(orderNo.Id).ToString(),
                    OrderDt = x.GetDataById(orderDate.Id).ToString(),
                    Qty = Math.Abs(decimal.Parse(x.GetDataById(qty.Id).ToString()))
                })
            .Select(o => o.Select(t => new
                { CustomerID = t.GetDataById(customerId.Id).ToString(),
                    OrderNo = t.GetDataById(orderNo.Id).ToString(),
                    OrderDt = t.GetDataById(orderDate.Id).ToString(),
                    Qty = Math.Abs(decimal.Parse(t.GetDataById(qty.Id).ToString()))
                })
                .FirstOrDefault());

        foreach (var distinctOrder in distinctOrders)
        {
            OrderData duplicateOrderData = orderData;

        }
3
  • Why not just use GroupBy using Math.Abs for the value where you want positive and negative values to be treated the same? (It's unclear why you're converting everything into strings, mind you...) Commented 2 days ago
  • The "why" is probably to update inventory; in which case all you need to do is "sum" / group on ProductNo which will give you the change in inventory. They're not "duplicates"; they're transactions. Ultimately, the OrderNo identifies the transaction; the client and date are redundant. Commented 2 days ago
  • Can we just sum up quantities (2 - 2 + 2 == 2; 3 - 3 == 0)? And then, and if computed quantity is 0 remove such record(s)? Commented yesterday

3 Answers

2

LINQ works well when each item can be examined on its own. When several items have to be compared against each other, it gets unwieldy. That's why I recommend two classic nested loops:

for (int i = data.Count - 1; i >= 0; i--)
{
    for (int j = data.Count - 1; j > i; j--)
    {
        // A pair cancels when the keys match and the quantities are negatives of each other.
        if (data[i].CustomerId == data[j].CustomerId &&
            data[i].OrderNo == data[j].OrderNo &&
            data[i].ProductNo == data[j].ProductNo &&
            data[i].Qty == -1 * data[j].Qty)
        {
            data.RemoveAt(i);
            // Removing index i shifted the later item down by one.
            data.RemoveAt(j - 1);
            // The item at index i is gone, so stop comparing against it.
            break;
        }
    }
}

We loop backwards because removing items shifts the indices of everything after them. The inner loop checks whether the pair matches the condition: the same CustomerId, OrderNo, and ProductNo, with Qty negated. If it does, both items are removed. Because i is always smaller than j, removing the item at index i lowers every later index by 1, so the second item is now at j-1. The break then ends the inner loop: the original item at index i no longer exists, and continuing would compare the wrong element or run past the end of the list.

Online-demo: https://dotnetfiddle.net/glb3Q1
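
For larger inputs, the same pairwise cancellation can be done in a single pass instead of nested loops. The following is only a sketch and is not the code from the demo above: the `Order` record and the `PairCanceller.RemoveReversedPairs` name are hypothetical, and it assumes the rows can be projected into a type with the five columns from the question. Each row is matched against the most recent unmatched row that has the same key and the opposite quantity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; the real object in the question has more fields.
public record Order(int CustomerId, int OrderNo, DateTime OrderDt, decimal Qty, int ProductNo);

public static class PairCanceller
{
    // Returns the rows left after removing charge/reversal pairs.
    // Surviving rows keep their original relative order.
    public static List<Order> RemoveReversedPairs(IReadOnlyList<Order> rows)
    {
        // Key: the four matching columns plus the signed quantity.
        // Value: indices of rows with that key that have not been cancelled yet.
        var pending = new Dictionary<(int, int, DateTime, int, decimal), Stack<int>>();
        var removed = new bool[rows.Count];

        for (int i = 0; i < rows.Count; i++)
        {
            var r = rows[i];
            var opposite = (r.CustomerId, r.OrderNo, r.OrderDt, r.ProductNo, -r.Qty);

            // A zero quantity would match itself, so it is never treated as a reversal.
            if (r.Qty != 0 && pending.TryGetValue(opposite, out var matches) && matches.Count > 0)
            {
                removed[matches.Pop()] = true; // the earlier half of the pair
                removed[i] = true;             // the current row
                continue;
            }

            // No partner yet: remember this row under its own key.
            var own = (r.CustomerId, r.OrderNo, r.OrderDt, r.ProductNo, r.Qty);
            if (!pending.TryGetValue(own, out var stack))
                pending[own] = stack = new Stack<int>();
            stack.Push(i);
        }

        return rows.Where((_, i) => !removed[i]).ToList();
    }
}
```

With the sample data from the question, the charge and reversal for order 6098 cancel and the third row for that order survives, so rows are only ever removed in sets of two.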


2

Something like below should give what you want. Please note that reading all the records in the table in production won't be a good idea. This approach is just to demonstrate how to filter the data so that the returned dataset only includes open (not refunded or reversed) orders.

var repo = new OrderRepository(db);

// Read all data in the table
var all = await repo.GetAllAsync();

// group orderData by CustomerId, OrderNo, OrderDt, ProductNo and sum Qty. Filter to only those with total Qty > 0.
var grouped = all
    .GroupBy(o => new { o.CustomerId, o.OrderNo, o.OrderDt, o.ProductNo })
    .Select(g => new
    {
        g.Key.CustomerId,
        g.Key.OrderNo,
        g.Key.OrderDt,
        g.Key.ProductNo,
        TotalQty = g.Sum(x => x.Qty)
    })
    .Where(x => x.TotalQty > 0)
    .ToList();

foreach (var item in grouped)
{
    Console.WriteLine($"CustomerId={item.CustomerId}, OrderNo={item.OrderNo}, OrderDate={item.OrderDt}, ProductNo={item.ProductNo}, TotalQty={item.TotalQty}");
}

Code block returns the dataset below when run with the given data.

[Screenshot: console output of the code above]
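
The same grouping can be tried without a database. The sketch below is an in-memory variant under stated assumptions: `OrderRow` and `OpenOrders.NetOpenOrders` are hypothetical names, and `Phone` stands in for the extra fields the question mentions. Unlike the projection above, it keeps a full row per group (the first one) and overwrites its `Qty` with the net quantity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the repository entity.
public record OrderRow(int CustomerId, int OrderNo, DateTime OrderDt,
                       decimal Qty, int ProductNo, string Phone);

public static class OpenOrders
{
    // Groups by the four key columns, sums Qty, and keeps one row per
    // group whose net quantity is still positive.
    public static List<OrderRow> NetOpenOrders(IEnumerable<OrderRow> rows) =>
        rows.GroupBy(o => new { o.CustomerId, o.OrderNo, o.OrderDt, o.ProductNo })
            .Select(g => new { First = g.First(), Net = g.Sum(x => x.Qty) })
            .Where(x => x.Net > 0)
            // Copy the first row of the group and replace Qty with the net value.
            .Select(x => x.First with { Qty = x.Net })
            .ToList();
}
```

One difference from strict pairwise removal: two charges with no reversal collapse into a single row with the combined quantity, so this fits best when a net figure per order line is acceptable.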


-2

To remove them, just do:

var distinctOrders = orders
    .GroupBy(x => (x.CustomerID, x.OrderNo, x.OrderDt))
    .Select(g => g.First())
    .ToList();

1 Comment

Please answer questions in English
