I have a DataTable called dtstore with 4 columns called section, department, palletnumber and batchnumber. I am trying to build a new DataTable called dtmulti which has an extra column called multi that shows how many times each duplicated row occurs:

dtstore

section | department | palletnumber | batchnumber
---------------------------------------------------
pipes   | 2012       | 1234         | 21
taps    | 2011       | 5678         | 345
pipes   | 2012       | 1234         | 21
taps    | 2011       | 5678         | 345
taps    | 2011       | 5678         | 345
plugs   | 2009       | 7643         | 63


dtmulti

section | department | palletnumber | batchnumber | multi
----------------------------------------------------------
pipes   | 2012       | 1234         | 21          | 2
taps    | 2011       | 5678         | 345         | 3

I have tried lots of approaches, but my code always feels clumsy and bloated. Is there an efficient way to do this?

Here is the code I am using:

Dim dupmulti = dataTable.AsEnumerable().GroupBy(Function(i) i).Where(Function(g) g.Count() = 2).Select(Function(g) g.Key)

For Each row In dupmulti
    multirow("Section") = dup("Section")
    multirow("Department") = dup("Department")
    multirow("PalletNumber") = dup("PalletNumber")
    multirow("BatchNumber") = dup("BatchNumber")
    multirow("Multi") = 2
Next
  • What code have you tried? Commented Sep 13, 2015 at 16:43
  • Put the code in your question not the comment section. Commented Sep 13, 2015 at 16:49
  • This specific code is plainly wrong. Firstly, it does not compile (and you are not even using LINQ correctly with the DataRow type); secondly, it doesn't even try to deliver what you want (why look only for rows repeated exactly two times?). Anyway, I will post working code. Commented Sep 13, 2015 at 18:19

1 Answer

Assumptions of the code below: the DataTable containing the original information is called dup. It may contain any number of duplicates, and a duplicate is identified by looking at the first column only.

'Creating final table from the columns in the original table
Dim multirow As DataTable = New DataTable

For Each col As DataColumn In dup.Columns
   multirow.Columns.Add(col.ColumnName, col.DataType)
Next
multirow.Columns.Add("multi", GetType(Integer))

'Looping through the grouped rows (= no duplicates), keyed on the first column
For Each groups In dup.AsEnumerable().GroupBy(Function(x) x(0))

    'Rows.Add() with no arguments appends an empty row and returns it
    Dim newRow As DataRow = multirow.Rows.Add()

    'Copying every cell of the group's first row (all columns except "multi")
    For c As Integer = 0 To dup.Columns.Count - 1
        newRow(c) = groups(0)(c)
    Next

    'Filling the last cell with the duplicates count
    newRow(multirow.Columns.Count - 1) = groups.Count()


Next
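
The loop above keys on the first column only, whereas the question's sample treats a row as a duplicate only when every column matches, and leaves out rows that occur just once. A minimal sketch of that variant follows, assuming a reference to System.Data.DataSetExtensions and using DataRowComparer.Default to compare rows value by value. The names DuplicateCounter, CountDuplicates and BuildSampleTable are hypothetical, and the column types in the sample table are assumptions, since the question does not state them.

```vb
Imports System
Imports System.Data
Imports System.Linq

Module DuplicateCounter

    ' Hypothetical helper: returns the source schema plus a "multi" column
    ' holding how many times each fully identical row occurs. Rows that
    ' occur only once are skipped, as in the dtmulti sample.
    Function CountDuplicates(source As DataTable) As DataTable
        Dim result As DataTable = source.Clone()
        result.Columns.Add("multi", GetType(Integer))

        ' DataRowComparer.Default treats two rows as equal when all of
        ' their column values are equal.
        Dim groups = source.AsEnumerable().
            GroupBy(Function(r) r, DataRowComparer.Default).
            Where(Function(g) g.Count() > 1)

        For Each g In groups
            Dim first As DataRow = g.Key
            Dim newRow As DataRow = result.NewRow()
            For c As Integer = 0 To source.Columns.Count - 1
                newRow(c) = first(c)
            Next
            newRow("multi") = g.Count()
            result.Rows.Add(newRow)
        Next

        Return result
    End Function

    ' Hypothetical helper: rebuilds the dtstore sample from the question.
    ' Column types are assumed (String for section, Integer for the rest).
    Function BuildSampleTable() As DataTable
        Dim dtstore As New DataTable()
        dtstore.Columns.Add("section", GetType(String))
        dtstore.Columns.Add("department", GetType(Integer))
        dtstore.Columns.Add("palletnumber", GetType(Integer))
        dtstore.Columns.Add("batchnumber", GetType(Integer))
        dtstore.Rows.Add("pipes", 2012, 1234, 21)
        dtstore.Rows.Add("taps", 2011, 5678, 345)
        dtstore.Rows.Add("pipes", 2012, 1234, 21)
        dtstore.Rows.Add("taps", 2011, 5678, 345)
        dtstore.Rows.Add("taps", 2011, 5678, 345)
        dtstore.Rows.Add("plugs", 2009, 7643, 63)
        Return dtstore
    End Function

    Sub Main()
        Dim dtmulti As DataTable = CountDuplicates(BuildSampleTable())
        For Each row As DataRow In dtmulti.Rows
            Console.WriteLine(String.Join(" | ", row.ItemArray))
        Next
    End Sub

End Module
```

On the sample data this should give two rows, pipes with multi = 2 and taps with multi = 3, while plugs is dropped because it occurs once. Removing the Where clause would keep unique rows as well, with multi = 1.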
2 Comments

Duplicates cannot be identified by looking at just the first column; all columns need to match for a row to be a duplicate.
@ThickAsAPlank I think that writing this code from what you provided (mainly bearing in mind that SO is not a custom code writing service) is already quite a good answer; you should take it as a solid first step towards building your own code, rather than keep requesting more. Additionally, in your sample data/code all the duplicate cells were identical. You can easily find references to multi-column grouping. For example: stackoverflow.com/questions/11121303/… Just update the Function(x) x(0) to include all the columns you want.
