0

Below is my source table

    Name              Sales
---------------------------------
    Thomas             100
    Jay                200
    Thomas             100
    Mathew              50

Output I need is as below

    Name              Sales
---------------------------------
    Thomas             200
    Jay                200
    Mathew              50

Basically, I have 2 columns that can have duplicates and I need to aggregate the second column based on first column.

Current code I have is as below. Its working perfectly fine. It takes around 45 seconds to run for 4500 records. I was wondering if there is a more efficient way to do this... as it seems to be a trivial requirement.

'Combine duplicate rows and sum values

Dim Rng As Range
Dim LngRow As Long, i As Long

LngLastRow = lRow 'The last row is calculated somewhere above...

'Initializing the first row
i = 1

'Looping until blank cell is encountered in first column
While Not Cells(i, 1).Value = ""

    'Initializing range object
    Set Rng = Cells(i, 1)

    'Looping from last row to specified first row
    For LngRow = LngLastRow To (i + 1) Step -1

        'Checking whether value in the cell is equal to specified cell
        If Cells(LngRow, 1).Value = Rng.Value Then
            Rng.Offset(0, 1).Value = Rng.Offset(0, 1).Value + Cells(LngRow, 2).Value
            Rows(LngRow).Delete
        End If

    Next LngRow

    i = i + 1

Wend

Note that this is part of a larger excel app and hence I definitely need the solution to be in Excel VBA.

9
  • 2
    To be honest, you're probably better off asking optimization questions on Code Review, if the code already works. Commented May 21, 2019 at 21:12
  • Loop through all the cells with a dictionary, if the item exists, add the values, if it doesn't add the item and it's value. When it's done just paste the keys and items from the dictionary. That should take barely 5 seconds Commented May 21, 2019 at 21:16
  • @BigBen Thanks for the tip, I was unaware of the same. Will a moderator move it or do I need to create a duplicate question there? Commented May 21, 2019 at 21:18
  • Thanks @Damian. Let me do some research on Dictionary... Never used it. Appreciate if someone can help with the implementation. Commented May 21, 2019 at 21:26
  • Also turn screenupdating off Commented May 21, 2019 at 21:31

2 Answers 2

0

Here you go:

Option Explicit
Sub Consolidate()

    Dim arrData As Variant
    Dim i As Long
    Dim Sales As New Scripting.Dictionary 'You will need the library Microsoft Scripting Runtime

    Application.ScreenUpdating = False 'speed up the code since excel won't show you what is happening

    'First of all, working on arrays always speeds up a lot the code because you are working on memory
    'instead of working with the sheets
    With ThisWorkbook.Sheets("YourSheet") 'change this
        i = .Cells(.Rows.Count, 1).End(xlUp).Row 'last row on column A
        arrData = .Range("A2", .Cells(i, 2)).Value 'here im assuming your row 1 has headers and we are storing the data into an array
    End With

    'Then we create a dictionary with the data
    For i = 1 To UBound(arrData) 'from row 2 to the last on Q1 (the highest)
        If Not Sales.Exists(arrData(i, 1)) Then
            Sales.Add arrData(i, 1), arrData(i, 2) 'We add the worker(Key) with his sales(Item)
        Else
            Sales(arrData(i, 1)) = Sales(arrData(i, 1)) + arrData(i, 2) 'if the worker already exists, sum his sales
        End If
    Next i

    'Now you have all the workers just once
    'If you want to delete column A and B and just leave the consolidate data:
    With ThisWorkbook.Sheets("YourSheet") 'change this
        i = .Cells(.Rows.Count, 1).End(xlUp).Row 'last row on column A
        .Range("A2:B" & i).ClearContents
        .Cells(2, 1).Resize(Sales.Count) = Application.Transpose(Sales.Keys) 'workers
        .Cells(2, 2).Resize(Sales.Count) = Application.Transpose(Sales.Items) 'Their sales
    End With

    Application.ScreenUpdating = True 'return excel to normal

End Sub

To learn everything about dictionaries (and more) check this

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks so much for the code and explanation. It helped a lot!
0

With data in cols A and B like:

enter image description here

Running this short macro:

Sub KopyII()
    Dim cell As Range, N As Long

    Columns("A:A").Copy Range("C1")
    ActiveSheet.Range("C:C").RemoveDuplicates Columns:=1, Header:=xlNo
    N = Cells(Rows.Count, "C").End(xlUp).Row

    Range("B1").Copy Range("D1")

    Range("D2:D" & N).Formula = "=SUMPRODUCT(--(A:A= C2),(B:B))"
End Sub

will produce this in cols C and D:

enter image description here

NOTE:

This relies on Excel's builtin RemoveDuplicates feature.

EDIT#1:

As chris neilsen points out, this function should be a bit quicker to evaluate:

Sub KopyIII()
    Dim cell As Range, N As Long, A As Range, C As Range
    Set A = Range("A:A")
    Set C = Range("C:C")

    A.Copy C
    C.RemoveDuplicates Columns:=1, Header:=xlNo
    N = Cells(Rows.Count, "C").End(xlUp).Row

    Range("B1").Copy Range("D1")  ' the header

    Range("D2:D" & N).Formula = "=SUMIFS(B:B,A:A,C2)"
End Sub

3 Comments

This looks good (and compact)! I already implemented Damian's solution and it brought down my running time from 45 sec to 4 seconds. So, I am currently sticking with that. But this is a good learning. Thanks!
Using SumProduct on whole columns is a bad idea. See Here (from the conclusion: Formulas other than array formulas and SUMPRODUCT handle whole column references efficiently. ) Use the much simpler =SUMIFS(B:B,A:A,C2) instead
@chrisneilsen I agree and I updated the Answer.........................Thank you!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.