
I currently have about 10,000 rows by 15 columns of raw data imported into an Excel spreadsheet. A number of fields are cleansed, but the one of interest is a field called "Hazard". Every row in which "Hazard" is encountered needs to be stripped out.

This is the code I use to cleanse (partially) my data set:

Sub dataCleanse()
Dim Last As Long
Dim i As Long

Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual

' Loop from the bottom up so that deleting a row does not shift rows still to be checked
Last = Cells(Rows.Count, "F").End(xlUp).Row
For i = Last To 1 Step -1
    If (Cells(i, "F").Value) = "Hazard" Then
        Cells(i, "A").EntireRow.Delete
    End If
Next i

Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True

End Sub

Processing 10,000 records or so takes 10-15 seconds. I have experimented with AutoFilter, but when I use .EntireRow.Delete it also strips out the rows in between the filtered rows. For example, if rows 1 and 3 contain 'Hazard' and I use AutoFilter, it also deletes row 2, which does not contain 'Hazard'.

I have also set the calculation to Manual first and then Automatic so it doesn't refresh each time.

Are there any suggestions that could be offered to increase the speed of my macro?

Thank you!

6
  • It is really best suited to Auto Filter. If row 2 has "Hazard" then why is it incorrect for this to be deleted? Commented Apr 12, 2016 at 6:10
  • Apologies Brett. I meant that if row 1 has "Hazard" and row 3 has "Hazard", but row 2 does not, it will delete all three rows. Commented Apr 12, 2016 at 6:18
  • See this question regarding a solution to the Autofilter method. Commented Apr 12, 2016 at 6:39
  • The best way to improve the speed of processing is to assign your used range to an array, process the array, clear the range, and assign the array back to the range. There are several examples on this site. Chip Pearson's site has a lot of good stuff on array handling. Commented Apr 12, 2016 at 6:40
  • See here for using an array and deleting rows in one batch: stackoverflow.com/questions/36452158/… Commented Apr 12, 2016 at 7:20

4 Answers

2

You could go with the following AutoFilter approach:

Option Explicit

Sub dataCleanse()

Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Application.DisplayAlerts = False

With ActiveSheet
    ' insert a "dummy" header row for AutoFilter to work
    ' (a whole row rather than a single cell, so column F stays aligned with the other columns)
    .Rows(1).Insert
    .Range("F1").Value = "header"

    With .Range("F1", .Cells(.Rows.Count, "F").End(xlUp))
        .AutoFilter Field:=1, Criteria1:="Hazard"
        ' the header cell is always visible, so a count above 1 means at least one row matched
        If Application.WorksheetFunction.Subtotal(103, .Columns(1)) > 1 Then
            .Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible).EntireRow.Delete
        End If
        .AutoFilter
    End With

    .Rows(1).Delete 'remove the "dummy" header row

End With

Application.DisplayAlerts = True
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True

End Sub

This processes 10,000 records of 250 columns each in much less than a second.

4 Comments

Thanks! Could you please explain what the .SubTotal(103...) part does?
It counts the number of visible cells, so that rows are deleted only if at least two cells are visible (the header cell is always visible). If this answers your question, please mark my answer as accepted. Thank you.
Right, and what does the '103' imply? Is it significant? And once I get to implementing your code I'll let you know!
It's the argument that tells Subtotal which one of its many "flavours" to apply to the range in its last argument. "103" is the argument that activates the "count visible" "flavour".
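
To make the "flavours" concrete, here is a minimal sketch that prints two of them side by side. The procedure name ShowVisibleCounts and the choice of column F on the active sheet are assumptions made for illustration and are not part of the answer above.

```vba
Option Explicit

' Hypothetical helper: prints how many non-empty cells Subtotal counts in column F.
Sub ShowVisibleCounts()
    Dim rng As Range

    ' Restrict column F to the used part of the sheet
    Set rng = Intersect(ActiveSheet.UsedRange, ActiveSheet.Columns("F"))
    If rng Is Nothing Then Exit Sub   ' nothing in column F to count

    ' 3   = COUNTA, still counts rows that were hidden manually
    ' 103 = COUNTA, ignores rows that were hidden manually
    ' Rows hidden by a filter are left out in both cases.
    Debug.Print "Subtotal(3):   " & Application.WorksheetFunction.Subtotal(3, rng)
    Debug.Print "Subtotal(103): " & Application.WorksheetFunction.Subtotal(103, rng)
End Sub
```

Run it with a filter applied to column F and the Immediate window open; both numbers should then reflect only the rows the filter leaves visible.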
0

I am not sure if this will be faster, but my suggestion is to select column F, find an instance of "Hazard", delete that row, and repeat the process until "Hazard" is not found in column F.

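Dim iRow As Long        ' Long rather than Integer, so row numbers above 32,767 do not overflow
Dim RangeObj As Range

Application.ScreenUpdating = False
Columns("F:F").Select

' LookAt:=xlWhole matches only cells whose entire content is "Hazard"
Set RangeObj = Selection.Find(What:="Hazard", LookIn:=xlValues, LookAt:=xlWhole, MatchCase:=True)

Do Until (RangeObj Is Nothing)
    iRow = RangeObj.Row
    Rows(iRow & ":" & iRow).Delete
    Columns("F:F").Select
    Set RangeObj = Selection.Find(What:="Hazard", LookIn:=xlValues, LookAt:=xlWhole, MatchCase:=True)
Loop

Application.ScreenUpdating = True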

Please give it a try.

0

This solution is not faster for small datasets, but it will be for very large datasets. The code looks longer, but handling the arrays is much faster than manipulating the workbook. (I am sure there are more efficient ways to shorten the array). BTW - your code worked for me on the example dataset I put together. If this doesn't work on your data, please post a small example of your input and what the result should look like.

Example input:

enter image description here

Output from macro:

enter image description here

Macro code using arrays:

Option Explicit

Sub dataCleanse2()
Dim nRows As Long, nCols As Long
Dim i As Long, j As Long, k As Long
Dim myRng As Range
Dim myArr() As Variant, myTmpArr() As Variant

Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual

Set myRng = Sheets("Sheet1").UsedRange
myArr = myRng.Value2

nRows = UBound(myArr, 1)
nCols = UBound(myArr, 2)
For i = nRows To 1 Step -1
    If CStr(myArr(i, 6)) = "Hazard" Then
        ReDim Preserve myTmpArr(1 To nRows - 1, 1 To nCols)
        For j = 1 To i - 1
            For k = 1 To nCols
                myTmpArr(j, k) = myArr(j, k)
            Next k
        Next j
        For j = i To nRows - 1
            For k = 1 To nCols
                myTmpArr(j, k) = myArr(j + 1, k)
            Next k
        Next j
        nRows = UBound(myTmpArr, 1)
        Erase myArr
        myArr = myTmpArr
        Erase myTmpArr
    End If
Next i

myRng.Clear
Set myRng = Sheets("Sheet1").Cells(1, 1).Resize(nRows, nCols) ' fully qualified, so it also works when Sheet1 is not the active sheet
myRng.Value2 = myArr

Set myRng = Nothing
Erase myArr

Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub
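
One possible way to shorten the array more efficiently is to copy the rows to keep into a second array in a single pass, instead of rebuilding the array for every match. The following is only a sketch under stated assumptions: the name dataCleanseSinglePass is made up, the used range of "Sheet1" is assumed to start in column A (so that array column 6 is column F), and any formulas in the range are overwritten with their values.

```vba
Option Explicit

' Sketch: keep every row whose 6th column is not "Hazard", in one pass.
Sub dataCleanseSinglePass()
    Dim rng As Range
    Dim src As Variant, dst() As Variant
    Dim nRows As Long, nCols As Long
    Dim i As Long, k As Long, kept As Long

    Set rng = Worksheets("Sheet1").UsedRange
    If rng.Cells.Count = 1 Then Exit Sub   ' a single cell is returned as a scalar, not an array

    src = rng.Value2
    nRows = UBound(src, 1)
    nCols = UBound(src, 2)
    If nCols < 6 Then Exit Sub             ' no 6th column to test

    ' Same shape as the source; rows that are never filled stay Empty
    ReDim dst(1 To nRows, 1 To nCols)

    For i = 1 To nRows
        If CStr(src(i, 6)) <> "Hazard" Then
            kept = kept + 1
            For k = 1 To nCols
                dst(kept, k) = src(i, k)
            Next k
        End If
    Next i

    ' One write back to the sheet; the Empty rows at the bottom blank out the leftovers
    rng.Value2 = dst
End Sub
```

Because each kept row is copied exactly once, the work grows linearly with the number of rows, whereas rebuilding the array on every match grows roughly with rows times matches.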

1 Comment

Thanks for this oldUgly! I actually went for another approach, I created an array which I then loaded up all the values into and looped through them all. It's improved the speed by around 50% - 4-5 seconds now rather than 10-11 so it's good enough for now :)
0

Thanks for the help, everyone! I went for an alternative approach (different from the ones posted here) that makes use of an array. I'm still familiarizing myself with arrays (newish to VBA / programming in general), but found that when I loaded the values into an array, the speed improved by around 50%! I don't know the exact reason why loading into an array is that much faster, but I'm assuming it is because the array is processed as an aggregate rather than as individual cell values.

Sub CleanseAction()
Dim Last As Long
Dim i As Long
Dim prevCalcMode As Variant

Application.ScreenUpdating = False
prevCalcMode = Application.Calculation   ' remember the user's calculation mode
Application.Calculation = xlCalculationManual

Last = Cells(Rows.Count, "H").End(xlUp).Row
For i = Last To 1 Step -1
    If (Cells(i, "H").Value) = "Hazard" Then
        Cells(i, "A").EntireRow.Delete
    End If
Next i

Application.Calculation = prevCalcMode   ' restore whatever mode was active before
Application.ScreenUpdating = True
End Sub

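The listing above still reads the cells one at a time, so as an illustration of what loading the values into an array could look like, here is a hedged sketch. It is not the code the author ran: the name CleanseActionBatch is hypothetical, and collecting the matching rows with Union so they can be deleted in one call is a swapped-in technique taken from the batch-delete suggestion in the comments on the question.

```vba
Option Explicit

' Sketch: scan column H in memory, then delete all matching rows in one call.
Sub CleanseActionBatch()
    Dim vals As Variant
    Dim lastRow As Long, i As Long
    Dim toDelete As Range

    With ActiveSheet
        lastRow = .Cells(.Rows.Count, "H").End(xlUp).Row

        If lastRow = 1 Then
            ' A single cell is returned as a scalar, so wrap it in a 1 x 1 array
            ReDim vals(1 To 1, 1 To 1)
            vals(1, 1) = .Range("H1").Value2
        Else
            vals = .Range("H1:H" & lastRow).Value2   ' 2-D array (1 To lastRow, 1 To 1)
        End If

        For i = 1 To lastRow
            If Not IsError(vals(i, 1)) Then          ' skip error values such as #N/A
                If vals(i, 1) = "Hazard" Then
                    If toDelete Is Nothing Then
                        Set toDelete = .Rows(i)
                    Else
                        Set toDelete = Union(toDelete, .Rows(i))
                    End If
                End If
            End If
        Next i
    End With

    ' One delete for all collected rows instead of one delete per row
    If Not toDelete Is Nothing Then toDelete.Delete
End Sub
```

The single read replaces thousands of individual cell accesses, and the single delete avoids having the sheet shift rows once per match. With a very large number of scattered matches, building the Union itself can become slow, so this is a starting point rather than a guaranteed improvement.
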