[excel] Quicker way to get all unique values of a column in VBA?

Is there a faster way to do this?

Set data = ws.UsedRange

Set unique = CreateObject("Scripting.Dictionary")

On Error Resume Next
For x = 1 To data.Rows.Count
    unique.Add data(x, some_column_number).Value, 1
Next x
On Error GoTo 0

At this point unique.keys gets what I need, but the loop itself seems to be very slow for files that have tens of thousands of records (whereas this wouldn't be a problem at all in a language like Python or C++ especially).

This question is related to excel vba

The answer is


PowerShell is a very powerful and efficient tool. This is cheating a little, but shelling PowerShell via VBA opens up lots of options

The bulk of the code below is simply to save the current sheet as a csv file. The output is another csv file with just the unique values

Sub AnotherWay()
Dim strPath As String
Dim strPath2 As String

Application.DisplayAlerts = False
strPath = "C:\Temp\test.csv"
strPath2 = "C:\Temp\testout.csv"
ActiveWorkbook.SaveAs strPath, xlCSV
x = Shell("powershell.exe $csv = import-csv -Path """ & strPath & """ -Header A | Select-Object -Unique A | Export-Csv """ & strPath2 & """ -NoTypeInformation", 0)
Application.DisplayAlerts = True

End Sub

Use Excel's AdvancedFilter function to do this.

Using Excels inbuilt C++ is the fastest way with smaller datasets, using the dictionary is faster for larger datasets. For example:

Copy values in Column A and insert the unique values in column B:

Range("A1:A6").AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Range("B1"), Unique:=True

It works with multiple columns too:

Range("A1:B4").AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Range("D1:E1"), Unique:=True

Be careful with multiple columns as it doesn't always work as expected. In those cases I resort to removing duplicates which works by choosing a selection of columns to base uniqueness. Ref: MSDN - Find and remove duplicates

enter image description here

Here I remove duplicate columns based on the third column:

Range("A1:C4").RemoveDuplicates Columns:=3, Header:=xlNo

Here I remove duplicate columns based on the second and third column:

Range("A1:C4").RemoveDuplicates Columns:=Array(2, 3), Header:=xlNo

Try this

Option Explicit

Sub UniqueValues()
Dim ws As Worksheet
Dim uniqueRng As Range
Dim myCol As Long

myCol = 5 '<== set it as per your needs
Set ws = ThisWorkbook.Worksheets("unique") '<== set it as per your needs

Set uniqueRng = GetUniqueValues(ws, myCol)

End Sub


Function GetUniqueValues(ws As Worksheet, col As Long) As Range
Dim firstRow As Long

With ws
    .Columns(col).RemoveDuplicates Columns:=Array(1), header:=xlNo

    firstRow = 1
    If IsEmpty(.Cells(1, col)) Then firstRow = .Cells(1, col).End(xlDown).row

    Set GetUniqueValues = Range(.Cells(firstRow, col), .Cells(.Rows.Count, col).End(xlUp))
End With

End Function

it should be quite fast and without the drawback NeepNeepNeep told about