0

I have a column of data with unique strings where the first 4 characters in the string may be a repeat of the first 4 characters in another string, in a format similar to:

ABCDEF  
ABCDXY 
ABCDKL
DTYTZF 
DTYTSD

I am attempting to loop through this data to identify which 4 starting characters appear more then three times. If the first 4 digits of the string occur 3 times or more, I would like to remove these from the array entirely, and end up with an array that excludes these values. For example, in my column above, as 3 strings or more begin with 'ABCD', I would like to remove all strings that begin with this code, and have only every other value remain, such that my result would be:

DTYTZF 
DTYTSD

I am currently looping through the array, pushing any value that occurs three times or more into a NEW array, and plan to then use that list to do a second pass on the original array, and remove any matches. This may not be the most efficient way, but I've not been able to determine a better way that is guaranteed not to mess my data up.

I have worked through looping through the strings to identify which strings occur more then once, but when I try to push them to an array, the string successfully is pushed to the array, but is then replaced with the next value as soon as it is pushed to the array. I know the value is pushed correctly, because if I view the array immediately afterwards, I see the value in the array. When the next value is pushed and you view the array again, only the new value is displayed (The older ones are not).

I believe this is due to my limited understanding of ReDim-ing arrays, and me not fully understanding a code snippet for pushing this value into an array. My (condensed) code is as follows:

Sub pickupValues()
    Dim valuesArray()
    Dim i As Long
    Dim y As Long
    Dim sizeCheck As Long
    Dim tempArray() As String

    valuesArray() = Worksheets("Sheet1").Range("A1:A10").Value

    For i = LBound(valuesArray) To UBound(valuesArray)
        sizeCheck = 0
        For y = LBound(valuesArray) To UBound(valuesArray)
            If Left(valuesArray(i, 1), 4) = Left(valuesArray(y, 1), 4) Then
                sizeCheck = sizeCheck + 1
                i = y
                If sizeCheck >= 3 Then
                    ReDim tempArray(1 To 1) As String 'I'm not sure why I need to do this. 
                    tempArray(UBound(tempArray)) = Left(valuesArray(i, 1), 4) 'I believe this is what pushes the value into the array. 
                    ReDim Preserve tempArray(1 To UBound(tempArray) + 1) As String 'Again unsure on what the purpose of this is. 
                    viewArray (tempArray) 
                End If
            End If
        Next y
    Next i

End Sub


Function viewArray(myArray)
    Dim txt As String
    Dim i As Long

    For i = LBound(myArray) To UBound(myArray)
    txt = txt + myArray(i) + vbCrLf
    Next i

    MsgBox txt
End Function

What am I doing wrong?

I would like to re-use the same basic code later in the function to push other values OUT of an array based on if they match the string or not, but it seems VBA does not like to move values out of arrays either. Is there an easy solution that would match both scenarios?

2
  • what is your desired output and what are you going to do with that information? Commented Sep 22, 2017 at 22:22
  • I have updated my post with further clarity on the end goal. Commented Sep 22, 2017 at 22:31

2 Answers 2

2

I've rewritten what you are trying to do. I'm using the filter function to quickly get your results in the array

Option Explicit
Public Sub pickupValues()
    Dim tmp As Variant
    Dim results As Variant
    Dim i As Long
    Dim v

    ' Make sure this matches your range
    With ThisWorkbook.Sheets("Sheet1")
        ' Important to transpose the input here as Filter will only take a 1D array. Even though it's only 1 column, setting an array this way will generate a 2D array
        tmp = Application.Transpose(.Range(.Cells(1, 1), .Cells(.Cells(.Rows.Count, 1).End(xlUp).Row, 1)).Value2)
    End With

    ' ReDiming to the maximum value and slimming down afterwards is much quicker then increasing your array each time you've found a new value
    ReDim results(1 To UBound(tmp))
    For Each v In tmp
        ' Less then 2 as first result is '0'. Will return '-1' if can't be found but as test criteria is in the array it will always be at least 0
        If UBound(Filter(tmp, Left(v, 4))) < 2 Then
            i = i + 1
            results(i) = v
        End If
    Next v
    ' Redim Preserve down to actual array size
    If i > 0 Then
        ReDim Preserve results(1 To i)
        viewArray (results)
    Else
        MsgBox "Nothing Found"
    End If
End Sub
' Should really be a sub as doesn't return anything back to caller
Public Sub viewArray(myArray)
    MsgBox Join(myArray, vbCrLf)
End Sub
Sign up to request clarification or add additional context in comments.

4 Comments

Thank you this has provided some valuable insight into ReDimming. For some reason this code throws a subscript out of range error when using a set of values greater then 5 (The number in my original example), Can you provide an explanation as to why? It looks dynamic to me, so I think I'm missing a crucial part of understanding the solution.
Which line does it throw the error on? In my tests this works fine for more then 5 rows (As you say it is dynamic) The only thing I haven't handled is if there aren't any results returned and then you'll get a Subscript out of range error on the line ReDim Preserve results(1 to i-1). I've updated my answer to handle this. Also, this is build for data in Column A, so make sure your source data is in there (or update the range)
Ah, the updated function that advises 'Nothing found' solves the issue. I believe what's happened is that only the first x values in the string are non-unique (so my sample strings only account for the first four digits being duplicated, after that the remainder of the string IS unique. Now that I have that and a working sample, I'm sure I can reverse engineer what I need once I fully understand the sample, so I've marked the answer as accepted. Thanks!
Ok great - if you have any questions around how the above works feel free to ask
0

Your algorithm is not helping you.

Option 1: Sort your array. Then you can make a single pass to find sequential values with the same first four characters and count them.

Option 2: Use a Dictionary object: first four characters as key, number of occurrences as value.

2 Comments

I'm not sure either of these help me. My array is sorted, but I don't understand how that solves my issue. I'm also not clear on why inserting these values as a dictionary object would improve the functionality either. Is this easier in VBA? Can you provide a quick sample?
You, uh, marked your question as using VBA. And if you are already sorted, then all you have to do is loop through your array and keep count of items that have the same initial value as the last one. If it is more than three when you find an item not like the last one...

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.