0

I'm trying to implement a SQL Group By statement in VBA. I have a simple worksheet with 4 columns (login_year, login_month, login_day, user_name) and I want to produce a count of unique users who logged in over each month.

So the Oracle SQL would be:

SELECT 
login_year, 
login_month, 
COUNT(DISTINCT user_name) 
FROM users
GROUP BY 
login_year, 
login_month;

How would it be best to implement this using VBA?

Many thanks for your help.

3
  • Have you tried using a pivot table? Commented Jul 11, 2016 at 14:15
  • unsure on how to access this but you will need to name that count in order to bring it through - something like: count(Distinct user_name) as usercount Commented Jul 11, 2016 at 14:17
  • I'm using Excel 2010 and i dont think this has the Distinct Count option in pivot tables. Commented Jul 11, 2016 at 14:29

2 Answers 2

0

While the Jet/ACE SQL Engine (used in MS Office/Windows programs including Access, Excel) dialect does not maintain COUNT(DISTINCT ...) compared to other RDMS, you can use a derived table to first return unique users by month and year and then count them in the outer query.

SQL (embedded as string below)

SELECT u.login_year, u.login_month, COUNT(*) As usercount
FROM 
   (SELECT user_name, login_year, login_month 
    FROM users
    GROUP BY user_name, login_year, login_month) As u
GROUP BY u.login_year, u.login_month;

VBA (ADO Driver/Provider connection)

Sub RunSQL()    
    Dim conn As Object, rst As Object
    Dim strConnection As String, strSQL As String
    Dim i As Integer

    Set conn = CreateObject("ADODB.Connection")
    Set rst = CreateObject("ADODB.Recordset")

    ' CONNECTION STRINGS (TWO VERSIONS)
'    strConnection = "DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
'                      & "DBQ=C:\Path\To\Workbook.xlsm;"
    strConnection = "Provider=Microsoft.ACE.OLEDB.12.0;" _
                       & "Data Source='C:\Path\To\Workbook.xlsm';" _
                       & "Extended Properties=""Excel 8.0;HDR=YES;"";"

    strSQL = " SELECT u.login_year, u.login_month, COUNT(*) As userCount " _
                & " FROM " _
                & "   (SELECT user_name, login_year, login_month " _
                & "    FROM users" _
                & "    GROUP BY user_name, login_year, login_month) As u" _
                & " GROUP BY u.login_year, u.login_month;"                    
    ' OPEN DB CONNECTION
    conn.Open strConnection
    rst.Open strSQL, conn

    ' COLUMN HEADERS
    For i = 1 To rst.Fields.Count
        Worksheets("RESULTS").Cells(1, i) = rst.Fields(i - 1).Name
    Next i        
    ' DATA ROWS
    Worksheets("RESULTS").Range("A2").CopyFromRecordset rst

    rst.Close: conn.Close    
    Set rst = Nothing: Set conn = Nothing    
End Sub
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks Parfait this is what I was looking for! Can you recommend a book or blog to give me some background?
Google is your friend. See this Microsoft tutorial on workbook queries.
0

You need to set a reference to the MS Scripting Library to have access to the Dictionary object.

The below code creates a dictionary entry for each year/month. Each entry then holds another dictionary for the names. The count of the child dictionary is the distinct count.

Sub ListDistinctUserCount()

    Dim vaValues As Variant
    Dim dc As Scripting.Dictionary
    Dim dcNames As Scripting.Dictionary
    Dim i As Long
    Dim sAllData As String

    'read range into array
    vaValues = Sheet1.Range("A2:D51").Value
    Set dc = New Scripting.Dictionary

    For i = LBound(vaValues, 1) To UBound(vaValues, 1)
        'concat the year and month for dictionary key
        sAllData = vaValues(i, 1) & "|" & vaValues(i, 2)

        'if it's already added
        If dc.Exists(sAllData) Then
            'add the name if it doesn't exists
            If Not dc.Item(sAllData).Exists(vaValues(i, 4)) Then
                dc.Item(sAllData).Add vaValues(i, 4), vaValues(i, 4)
            End If
        Else
            'otherwise add the year month and a new dict for the names
            Set dcNames = New Scripting.Dictionary
            dcNames.Add vaValues(i, 4), vaValues(i, 4)
            dc.Add sAllData, dcNames
        End If
    Next i

    For i = 0 To dc.Count - 1
        Debug.Print dc.Keys(i), dc.Items(i).Count
    Next i

End Sub

Not very elegant, but I couldn't think of a better way.

Additional Information for VBA Novices

A Variant data type can hold any other data type. It's large and the least efficient, but is necessary in some cases.

The Range.Value property of a multi-cell range returns a two-dimensional, one-based array. You can assign that array to a Variant and loop through it. This is a lot faster than reading individual cells. Reading and writing to the Excel grid are notoriously slow, so it's a common practice to read a whole range into an array and work with the data that way.

VBA has a Collection object for storing data. The Scripting.Dictionary object has a few features that make it superior such as the .Exists property and writing all the keys or items out as an array. The Scripting library is installed with Office by default, but is not included in new projects by default. You have to go to Tools - References ton include it.

Dictionaries are key-item lists, like you find in many other languages. The item can be almost any data type. In this case, the item is another dictionary.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.