1

I have a very large data set (600,000 rows) of data structured in the following format:

1) There are around 60 products. One is a Total US number, while the others are for Manufacturers and are labled as KMFs. There are also some labeled as PCKGs(but aren't relevant for this question)

2) Each product is located in 60 different markets

3) Each market has 20 different locations

4) I have 12 metrics for which I need to calculate data in the following manner: Total US number - sum(KMFs) for each metric

I have written vba code for this but it is taking too long to run(around 20 minutes) I need to run similar code on at least 20 worksheets. I have tried various methods such as setting screenUpdating etc. to false. Here is my code. I am new to vba coding so I may have missed obvious things. Please let me know anything is unclear. Please help!

Sub beforeRunningCode()
    Application.ScreenUpdating = False
    Application.DisplayStatusBar = False
    Application.Calculation = xlCalculationManual
    Application.EnableEvents = False
    ActiveSheet.DisplayPageBreaks = False
End Sub
Sub returnToOriginal()
    Application.ScreenUpdating = screenUpdateState
    Application.DisplayStatusBar = statusBarState
    Application.Calculation = calcState
    Application.EnableEvents = eventsState
    ActiveSheet.DisplayPageBreaks = displayPageBreaksState
End Sub
Function LastRowFunc(Sheet) As Long
    LastRowFunc = ActiveWorkbook.Worksheets(Sheet).Range("A2", Worksheets(Sheet).Range("A2").End(xlDown)).Rows.Count
End Function
Function LastColFunc(Sheet) As Long
    With ActiveSheet
        LastColFunc = ActiveWorkbook.Sheets(Sheet).Cells(1, .Columns.Count).End(xlToLeft).Column
    End With
End Function
Sub AOCalculate()
    Call beforeRunningCode 'Optimize Excel
    Dim LastRow As Long
    Dim LastCol As Long
    Dim Period As String
    Dim Sheet As String
    Dim Arr(1 To 16)
    Dim Count As Integer
    Sheet = "Energy_LS_Bottler"
    Period = "2016 WAVE 1 - 3 W/E 05/07"
    LastRow = LastRowFunc(Sheet) 'Calculate last row for which data exists
    LastCol = LastColFunc(Sheet) 'Calculate last column for which data exists
    For Each Location In ActiveWorkbook.Sheets("Locations").Range("D7:D28").Value
        For Each Market In ActiveWorkbook.Sheets("Markets").Range("A5:A92").Value
            Count = Count + 1
            Arr(1) = Market
            Arr(2) = "AO"
            Arr(3) = Location
            Arr(4) = Period
            With ActiveWorkbook.Sheets(Sheet) 'Filtering for KMF
                .AutoFilterMode = False
                .Range(Cells(1, 1), Cells(LastRow, LastCol)).AutoFilter
                .Range(Cells(1, 1), Cells(LastRow, LastCol)).AutoFilter field:=17, Criteria1:="=KMF"
                .Range(Cells(1, 1), Cells(LastRow, LastCol)).AutoFilter field:=1, Criteria1:=Market
                .Range(Cells(1, 1), Cells(LastRow, LastCol)).AutoFilter field:=3, Criteria1:=Location
            End With
            For k = 5 To 16
                    Arr(k) = Application.WorksheetFunction.Sum(ActiveWorkbook.Sheets(Sheet).Range(Cells(1, k), Cells(LastRow, k)).SpecialCells(xlCellTypeVisible))
            Next k
            With ActiveWorkbook.Sheets(Sheet) ' filtering for Total US
                .AutoFilterMode = False
                .Range(Cells(1, 1), Cells(LastRow, LastCol)).AutoFilter
                .Range(Cells(1, 1), Cells(LastRow, LastCol)).AutoFilter field:=17, Criteria1:="=Total US"
                .Range(Cells(1, 1), Cells(LastRow, LastCol)).AutoFilter field:=1, Criteria1:=Market
                .Range(Cells(1, 1), Cells(LastRow, LastCol)).AutoFilter field:=3, Criteria1:=Location
            End With
            For k = 5 To 16
                Arr(k) = -Arr(k) + Application.WorksheetFunction.Sum(ActiveWorkbook.Sheets(Sheet).Range(Cells(1, k), Cells(LastRow, k)).SpecialCells(xlCellTypeVisible))
            Next k
            For j = 1 To 16
                ActiveWorkbook.Sheets(Sheet).Cells(LastRow + Count, j).Value = Arr(j)
            Next j
            Erase Arr
        Next
    Next
    ActiveWorkbook.Sheets(Sheet).AutoFilterMode = False
    Call returnToOriginal


End Sub

[Edit]: Here is a link to a sample data set https://drive.google.com/file/d/0B3MkGa57h6g_WGl2WWlWekd4NU0/view?usp=sharing

2
  • 1
    Can you post redacted sample data (perhaps ~100 rows) to a public file share? Commented Jul 29, 2016 at 19:14
  • 8
    Stop using visual/excel methods like .autofilter in vba. The methodology you should follow for speed is 1) copy everything you need from excel cells into VBA arrays. 2) do all calculations in VBA without calling excel. 3) copy all results from arrays out to the cells. Commented Jul 29, 2016 at 19:19

1 Answer 1

2

I think that this will work (though I haven't had a chance to test it), and should be a lot faster:

Sub AOCalculate()
    Call beforeRunningCode 'Optimize Excel
    Dim LastRow As Long
    Dim LastCol As Long
    Dim Period As String
    Dim Sheet As String
    Dim Arr()   '1 To 2000, 1 To 16)
    Dim Count As Integer
    Sheet = "Energy_LS_Bottler"
    Period = "2016 WAVE 1 - 3 W/E 05/07"
    LastRow = LastRowFunc(Sheet) 'Calculate last row for which data exists
    LastCol = LastColFunc(Sheet) 'Calculate last column for which data exists

    'copy all of the relevant cells to local arrays for speed
    Dim Locations(), Markets(), data()
    Markets = ActiveWorkbook.Sheets("Markets").Range("A5:A92").Value
    Locations = ActiveWorkbook.Sheets("Locations").Range("D7:D28").Value
    '(pretty sure the following line needs to localize the Cells() to .Cells())
    data = ActiveWorkbook.Sheets(Sheet).Range(Cells(1, 1), Cells(LastRow, LastCol)).Value    '**'

    ReDim Arr(1 To UBound(Markets, 1) * UBound(Locations, 1), 16)

    'make an index of pointers into our accumulation array
    Dim counts As New Collection
    Dim i As Long, l As Long, m As Long
    For l = 1 To UBound(Locations, 1)
        Location = Locations(l, 1)      '**'
        For m = 1 To UBound(Markets, 1)
            Market = Markets(m, 1)      '**'
            i = i + 1
            counts.Add i, CStr(Location) & "~" & CStr(Market)
            'counts.Add NewAccumArray(Location, Market, Period), CStr(Location) & "~" & CStr(Market)
            Arr(i, 1) = Market
            Arr(i, 2) = "AO"
            Arr(i, 3) = Location
            Arr(i, 4) = Period
        Next
    Next

    ' go through each row and add it to the appropiate count in the array
    Dim r As Long
    Dim key As String, idx As Long
    For r = 1 To UBound(data, 1)

        key = CStr(data(r, 3)) & "~" & CStr(data(r, 1))
        If data(r, 17) = "KMF" Then
            idx = counts(key)
            For k = 5 To 16
                    Arr(idx, k) = Arr(idx, k) - data(r, k)
            Next k
        Else
            If data(r, 17) = "Total US" Then
            idx = counts(key)
            For k = 5 To 16
                    Arr(idx, k) = Arr(idx, k) + data(r, k)
            Next k
            End If
        End If

    Next r

    ' output the results
    ActiveWorkbook.Sheets(Sheet).Range(Cells(LastRow + 1, 1), Cells(LastRow + Count, 16)).Value = Arr

    ActiveWorkbook.Sheets(Sheet).AutoFilterMode = False
    Call returnToOriginal
End Sub

Answering the query "What did I mean by this?"

    '(pretty sure the following line needs to localize the Cells() to .Cells())
    data = ActiveWorkbook.Sheets(Sheet).Range(Cells(1, 1), Cells(LastRow, LastCol)).Value    '**'

The use of Cells(..) here is fundamentally unreliable and broken. this is because Cells(..) is really a shortcut for ActiveSheet.Cells(..) and the Active* properties are inherently slow and unreliable because they can change while the code is running. Worse, this code is assuming that ActiveSheet = Energy_LS_Blotter which is far from certain.

The correct way to write this line would be like this:

data = ActiveWorkbook.Sheets(Sheet).Range( _
            ActiveWorkbook.Sheets(Sheet).Cells(1, 1), _
            ActiveWorkbook.Sheets(Sheet).Cells(LastRow, LastCol) _
            ).Value

But that is long, ugly and inconvenient. An easier way would be to use either a Sheet variable, or a With:

With ActiveWorkbook.Sheets(Sheet)
    data = .Range(.Cells(1, 1), .Cells(LastRow, LastCol)).Value
End With
Sign up to request clarification or add additional context in comments.

12 Comments

Thank you for the help. I am currently trying to run the code and have a few questions. I am not sure what you mean by "line needs to localize the Cells() to .Cells())". When I try to run your code for this line, I get a type mismatch but it runs fine if I add a .Value() to it. Additionally, for some reason, I am not being able to access Locations(l). This always gives me subscript out of range but works with a for each loop. Am I missing something here?
@HarshVardhanBansal Yep, couple of bugs there. That line does need the .Value property, and Locations should be accessed as Locations(l, 1) (same with Markets). I will correct these in the code.
Allow me to once again express my gratitude. I ran the code and it now compiles. The only problem is that it is only printing out 2 lines. I think that this may be because you have Arr(idx, k). idx holds the number of instances at which we have the same location and market. So the array will have a lot of blanks in between. Also, everything is shifted by 1 column to the right but I believe that I can fix it.
Thank you everything works now. The only problem with pasting the array was that the upper limit hadn't been specified. I have edited that in your code. Thanks
Running the code on all the worksheets(~30) took about 10 minutes but that is also because the code performs other tasks. So I would say that just this part takes about 6-7 minutes
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.