[excel] Ignore Duplicates and Create New List of Unique Values in Excel

I have a column of values that often appear as duplicates. I need to create a new column, of unique values based on the first column, as follows:

Column A   Column B  
a          a
a          b
b          c
c
c

This Column B will actually need to appear on a different sheet, within the same workbook, so I assume it will need to work with the sheet2!A1 style format.

I have not had any luck with the Data/Filter menu options as this only seems to work on command. I need column B to update automatically whenever a new value is entered into column A.

This question is related to excel excel-formula excel-2010

The answer is


Totero's answer is correct. The link is also very helpful.

Basically the formula you need is:

B2=INDEX($A$2:$A$20, MATCH(0, COUNTIF($B$1:B1, $A$2:$A$20), 0))

Then press ctrl+shift+enter (or it will not work using a array formula).

Two important things to keep in mind here: The complete list is in cells A2:A20, then this formula has to be pasted in cell B2 (Not B1 as that will give you circular reference). Secondly this is an array formula, so you need to press ctrl+shift+enter or it will not work correctly.


=SORT(UNIQUE(A:A))

The above formula works best if you want to list unique values in a column.


So for this task First Sort your data in order from A to Z or Z to A then you can just use one simple formula as stated below:

=IF(A2=A3, "Duplicate", "Not Duplicate")

The above formula states that if column A2 data ( A is column and 2 is row number) is similar to A3 (A is Column and 3 is Row number) then it will print Duplicate else will print Not Duplicate.

Lets consider an example, Column A consists Email address in which some are duplicate, so in Column 2, I used the above stated formula which in results displayed me the 2 duplicates cells one is Row 2 and Row 6.

One you got the duplicate data just put filter on your sheet and make visible only the duplicate data and delete all the unnecessary data.


In my case the excel was frozen when using the formula of

B2=INDEX($A$2:$A$20, MATCH(0, COUNTIF($B$1:B1, $A$2:$A$20), 0))

because there was many rows (10000). So I did in another way which I show below.

I have copied my original list to a second column and then with the function of Excel "remove duplicates" I could find the list of unique values.

Copied from Microsoft Office Website:

Select all the rows, including the column headers, in the list 

you want to filter.

Click the top left cell of the range, and then drag to the bottom right cell.

On the Data menu, point to Filter, and then click Advanced Filter.
In the Advanced Filter dialog box, click Filter the list, in place.
Select the Unique records only check box, and then click OK.

The filtered list is displayed and the duplicate rows are hidden.

On the Edit menu, click Office Clipboard.

The Clipboard task pane is displayed.

Make sure the filtered list is still selected, and then click Copy Copy button.

The filtered list is highlighted with bounding outlines and the selection appears as an > > item at the top of the Clipboard.

On the Data menu, point to Filter, and then click Show All.

The original list is re-displayed.

Press the DELETE key.

The original list is deleted.

In the Clipboard, click on the filtered list item.

The filtered list appears in the same location as the original list.

Source: Microsoft Office Website (link removed, cause dead)


There is a good guide of how to do this here.

Basically Something similar to:

=INDEX(Sheet1!$A$1:$A$20, MATCH(0, COUNTIF($B$1:B1,Sheet!$A$1:$A$20), 0))

I have a list of color names in range A2:A8, in column B I want to extract a distinct list of color names.

Follow the below given steps:

  • Select the Cell B2; write the formula to retrieve the unique values from a list.
  • =IF(COUNTIF(A$2:A2,A2)=1,A2,””)
  • Press Enter on your keyboard.
  • The function will return the name of the first color.
  • To return the value for the rest of cells, copy the same formula down. To copy formula in range B3:B8, copy the formula in cell B2 by pressing the key CTRL+C on your keyboard and paste in the range B3:B8 by pressing the key CTRL+V.
  • Here you can see the output where we have the unique list of color names.

To remove duplicates from a column

  1. Sort the values in column A A->Z
  2. Select column B
  3. While column B is still selected, in the formula input box, enter

    =IF(TRIM(A1)=TRIM(A2),"",TRIM(A1))
    
  4. While Column B is still selected, select Edit -> Fill -> Down (in newer versions, simply select cell B1 and pull down the outer box to expand all the way down in the column)

Note: if column B is on another sheet, you may do Sheet1!A1 and Sheet1!A2.


The MODERN approach is to consider cases where column of information come from a web service such as an OData source. If you need to generate a filter select fields off of massive data that has replicated values for the column, consider the code below:

var CatalogURL = getweb(currenturl)
                 +"/_api/web/lists/getbytitle('Site%20Inventory%20and%20Assets')/items?$select=Expense_x005F_x0020_Type&$orderby=Expense_x005F_x0020_Type";

/* the column that is replicated, is ordered by <column_name> */

    OData.read(CatalogURL,
        function(data,request){

            var myhtml ="";
            var myValue ="";

            for(var i = 0; i < data.results.length; i++)
            {
                myValue = data.results[i].Expense_x005F_x0020_Type;

                if(i == 0)
                {
                        myhtml += "<option value='"+myValue+"'>"+myValue+"</option>";
                }
                else
                if(myValue != data.results[i-1].Expense_x005F_x0020_Type)
                {
                        myhtml += "<option value='"+myValue+"'>"+myValue+"</option>";

                }
                else
                {

                }


            }

            $("#mySelect1").append(myhtml);

        });

All you have to do is : Go to Data tab Chose advanced in Sort & Filter In actions select : copy to another location if want a new list - Copy to any location In list range chose the list you want to get the records off . And the most important thing is to check : Unique records only .


Find here mentioned above formula with error control

=IFERROR(INDEX($B$2:$B$9, MATCH(0,COUNTIF($D$1:D1, $B$2:$B$9), 0)),"") 

where: (B2:B9 is the column data which you want to extract the unique values, D1 is the above cell where your formula is located)


On a sorted column, you can also try this idea:

B2=A2
B3=IFERROR(INDEX(A:A,MATCH(B2,A:A,1)+1),"")

B3 can be pasted down. It will result 0, after the last unique match. If this is unwanted, put some IF statement around to exclude this.

Edit:

Easier than an IF statement, at least for text-values:

B3=IFERROR(T(INDEX(A:A,MATCH(B2,A:A,1)+1)),"")

Another approach, since Excel 2016, is to use Power Query.

Howto:

  • select the data (including the field name),
  • use menu Data > From a table or a range,
  • (Excel will change your sheet into an Excel Table, which is very convenient),
  • in the Power Query Editor, right-click on ColumnA (the column header), and Remove duplicates,
  • in the menu, choose Close and load, choose where you want the result, and you're done, like this.
  • Whenever you want the result table to update, right-click it and choose Refresh.

Benefits :

  • it uses the CPU only when manually updated, which is very convenient for long lists,
  • if you're curious, this offers many other powerful options.

Drawbacks :

  • it doesn't update on the fly (you have to right-click and refresh the result table),
  • people with old version of Excel won't be able to refresh the results table.

In the worksheet module for the sheet containing the list:

Private Sub Worksheet_Change(ByVal Target As Range)

    Dim rngDest As Range

    If Not Intersect(Target, Me.Columns(1)) Is Nothing Then

        Set rngDest = ThisWorkbook.Sheets("Sheet2").Range("A1")

        Me.Range(Me.Range("A2"), Me.Cells(Rows.Count, 1).End(xlUp)).AdvancedFilter _
            Action:=xlFilterCopy, CopyToRange:=rngDest, Unique:=True

    End If

End Sub

Honestly I followed these examples to a tee and they simply didn't work. What I ended up doing after struggling pointlessly trying to get Excel to work was to just copy the entire contents of my column to NotePad++ where I was able to find an easy solution within minutes. Take a look at this: Removing duplicate rows in Notepad++

Edit: Here is a brief overview of how to do it in TextFX:

Plugins -> Plugin Manager -> Show Plugin Manager -> Available tab -> TextFX -> Install

After TextFX is installed in NotePad++, then you select all your text you want to remove duplicates from, then make sure to check: TextFX -> TextFX Tools -> Sort outputs only UNIQUE lines

Then click "sort lines case sensitive" or "sort lines case insensitive" and it will perform the unique sort.


Examples related to excel

Python: Pandas pd.read_excel giving ImportError: Install xlrd >= 0.9.0 for Excel support Converting unix time into date-time via excel How to increment a letter N times per iteration and store in an array? 'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data) How to import an Excel file into SQL Server? Copy filtered data to another sheet using VBA Better way to find last used row Could pandas use column as index? Check if a value is in an array or not with Excel VBA How to sort dates from Oldest to Newest in Excel?

Examples related to excel-formula

Excel doesn't update value unless I hit Enter Referencing value in a closed Excel workbook using INDIRECT? Conditionally formatting cells if their value equals any value of another column Sum values from multiple rows using vlookup or index/match functions What does an exclamation mark before a cell reference mean? If two cells match, return value from third Excel - programm cells to change colour based on another cell Format numbers in thousands (K) in Excel Excel Formula which places date/time in cell when data is entered in another cell in the same row Excel formula to display ONLY month and year?

Examples related to excel-2010

Get list of Excel files in a folder using VBA filter out multiple criteria using excel vba Swap x and y axis without manually swapping values Excel 2010 VBA - Close file No Save without prompt Removing special characters VBA Excel Creating a list/array in excel using VBA to get a list of unique names in a column Excel formula to get week number in month (having Monday) Excel VBA Run-time Error '32809' - Trying to Understand it IF statement: how to leave cell blank if condition is false ("" does not work) Excel how to find values in 1 column exist in the range of values in another