[excel] Simple Pivot Table to Count Unique Values

This seems like a simple Pivot Table to learn with. I would like to do a count of unique values for a particular value I'm grouping on.

For instance, I have this:

ABC   123
ABC   123
ABC   123
DEF   456
DEF   567
DEF   456
DEF   456

What I want is a pivot table that shows me this:

ABC   1
DEF   2

The simple pivot table that I create just gives me this (a count of how many rows):

ABC   3
DEF   4  

But I want the number of unique values instead.

What I'm really trying to do is find out which values in the first column don't have the same value in the second column for all rows. In other words, "ABC" is "good" and "DEF" is "bad".

I'm sure there is an easier way to do it but thought I'd give pivot table a try...

This question is related to excel excel-formula pivot-table

The answer is


Excel 2013 can do a distinct count in pivot tables. If I don't have access to 2013 and it's a smaller amount of data, I make two copies of the raw data and, in copy B, select both columns and remove duplicates. Then build the pivot table on copy B and count column B.
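
For readers who want to check the expected result outside Excel, here is a minimal Python sketch of the remove-duplicates-then-count idea. The function name `distinct_count_by_group` is made up for illustration, and the rows are the sample data from the question.

```python
from collections import Counter

# Sample rows from the question: (group, value)
rows = [
    ("ABC", 123), ("ABC", 123), ("ABC", 123),
    ("DEF", 456), ("DEF", 567), ("DEF", 456), ("DEF", 456),
]

def distinct_count_by_group(rows):
    """Drop duplicate (group, value) pairs, then count the survivors per group."""
    unique_pairs = set(rows)  # plays the role of Remove Duplicates on both columns
    return dict(Counter(group for group, _ in unique_pairs))

counts = distinct_count_by_group(rows)
```

Here `counts` maps each group to the number of different values it has, which is the figure the pivot table on the de-duplicated copy should show.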


It is not necessary for the table to be sorted for the following formula to return a 1 for each unique value present.

Assuming the table range for the data presented in the question is A1:B7, enter the following formula in cell C1:

=IF(COUNTIF($B$1:$B1,B1)>1,0,COUNTIF($B$1:$B1,B1))

Copy that formula to all rows and the last row will contain:

=IF(COUNTIF($B$1:$B7,B7)>1,0,COUNTIF($B$1:$B7,B7))

This results in a 1 being returned the first time a record is found and 0 for all times afterwards.

Simply sum the column in your pivot table.
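
A small Python sketch of what the filled-down formula computes, as I read it. The helper name `first_occurrence_flags` is hypothetical, and the lists hold the sample data from the question.

```python
def first_occurrence_flags(values):
    """Mimic =IF(COUNTIF($B$1:$B1,B1)>1,0,COUNTIF($B$1:$B1,B1)) filled down."""
    flags = []
    for i, v in enumerate(values):
        seen_so_far = values[: i + 1].count(v)  # COUNTIF over the expanding range
        flags.append(0 if seen_so_far > 1 else seen_so_far)
    return flags

groups = ["ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "DEF"]
values = [123, 123, 123, 456, 567, 456, 456]
flags = first_occurrence_flags(values)

# Sum the flags per group, which is what the pivot table's Sum would report.
totals = {}
for g, f in zip(groups, flags):
    totals[g] = totals.get(g, 0) + f
```

Because the formula only inspects column B, a value that appears under two different groups is flagged once, for the group it appears under first.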


I usually sort the data by the field I need the distinct count of, then use =IF(A2=A1,0,1); you then get a 1 in the top row of each group of IDs. It's simple and takes very little time to calculate on large datasets.
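
To illustrate why the sort matters, here is a hedged Python sketch of the compare-with-the-row-above idea. The name `change_flags` and the sample IDs are assumptions for illustration only.

```python
def change_flags(keys):
    """Mimic =IF(A2=A1,0,1): 1 where the key differs from the row above."""
    flags = []
    previous = object()  # sentinel standing in for the header row above the data
    for key in keys:
        flags.append(0 if key == previous else 1)
        previous = key
    return flags

ids = [456, 123, 456, 123, 567]

sorted_total = sum(change_flags(sorted(ids)))  # each ID is flagged exactly once
unsorted_total = sum(change_flags(ids))        # repeats that are not adjacent get flagged again
```

With the data sorted, the sum of the flags equals the number of distinct IDs; without the sort it overcounts.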


I'd like to throw an additional option into the mix that doesn't require a formula but might be helpful if you need to count unique values within the set across two different columns. Using the original example, I didn't have:

ABC   123  
ABC   123  
ABC   123   
DEF   456  
DEF   567  
DEF   456  
DEF   456

and want it to appear as:

ABC   1  
DEF   2

But something more like:

ABC   123  
ABC   123  
ABC   123  
ABC   456  
DEF   123  
DEF   456  
DEF   567  
DEF   456  
DEF   456

and wanted it to appear as:

ABC  
   123    3  
   456    1  
DEF  
   123    1  
   456    3  
   567    1

I found the best way to get my data into this format, and then be able to manipulate it further, was to show the values as a running total:

[Screenshot: selecting 'Running total in']

Once you select 'Running total in', choose the header for the secondary data set (in this case, the header or column title of the data set that includes 123, 456 and 567). This will give you a max value with the total count of items in that set, within your primary data set.

I then copied this data, pasted it as values, then put it in another pivot table to manipulate it more easily.

FYI, I had about a quarter million rows of data, so this worked a lot better than some of the formula approaches, especially ones that try to compare across two columns/data sets, because those kept crashing the application.


If you have the data sorted, I suggest using the following formula:

=IF(OR(A2<>A3,B2<>B3),1,0)

This is faster because each formula looks at fewer cells.


You can also use VLOOKUP for the helper column. In my tests it looked a little faster than COUNTIF.

If you have a header row and the data starts in cell A2, enter this formula in any cell of row 2 and copy it to all the other cells in the same column (it uses ; as the argument separator; some regional settings use , instead):

=IFERROR(IF(VLOOKUP(A2;$A$1:A1;1;0)=A2;0;1);1)

Step 1. Add a column

Step 2. Use the formula =IF(COUNTIF(C2:$C$2410,C2)>1,0,1) in the first record (adjust $C$2410 to the last row of your data)

Step 3. Drag it to all the records

Step 4. Filter '1' in the column with formula
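
As a rough Python sketch, the formula in Step 2 looks from the current row down to the end of the data, so it marks the last occurrence of each value rather than the first. The helper name `last_occurrence_flags` is hypothetical.

```python
def last_occurrence_flags(values):
    """Mimic =IF(COUNTIF(C2:$C$2410,C2)>1,0,1) filled down the column."""
    flags = []
    for i, v in enumerate(values):
        remaining = values[i:].count(v)  # COUNTIF from this row to the last row
        flags.append(0 if remaining > 1 else 1)
    return flags

values = [123, 123, 123, 456, 567, 456, 456]
flags = last_occurrence_flags(values)
unique_rows = [v for v, f in zip(values, flags) if f == 1]  # what the filter on 1 keeps
```

Filtering on 1 leaves one row per distinct value.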


I found an easier way of doing this. Referring to Siddharth Rout's example, if I want to count unique values in column A:

  • add a new column C and fill C2 with the formula =1/COUNTIF($A:$A,A2)
  • drag the formula down to the rest of the column
  • pivot with column A as row label, and Sum(column C) in values to get the number of unique values in column A
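
The reciprocal trick works because every value that occurs n times contributes n copies of 1/n, which add up to 1. A minimal Python sketch, with the made-up helper name `reciprocal_weights`:

```python
from collections import Counter
from fractions import Fraction

def reciprocal_weights(keys):
    """Mimic =1/COUNTIF($A:$A,A2) for every row of the column."""
    counts = Counter(keys)
    return [Fraction(1, counts[k]) for k in keys]

col_a = ["ABC"] * 3 + ["DEF"] * 4
weights = reciprocal_weights(col_a)
distinct_total = sum(weights)  # one whole unit per distinct key
```

Excel works in floating point, so in the spreadsheet the summed fractions may not land exactly on a whole number; the sketch uses exact fractions only to keep the arithmetic obvious.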

My approach to this problem was a little different than what I see here, so I'll share.

  1. (Make a copy of your data first)
  2. Concatenate the columns
  3. Remove duplicates on the concatenated column
  4. Last - pivot on the resulting set

Note: I would like to include images to make this even easier to understand, but I can't because this is my first post ;)
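
One thing to watch when concatenating, sketched in Python below: without a separator, different pairs can produce the same text. The `sep` parameter and the sample rows are my own assumptions, not part of the steps above.

```python
def concat_keys(rows, sep=""):
    """Join the two columns of each row into one text key."""
    return [f"{a}{sep}{b}" for a, b in rows]

rows = [("AB", "C1"), ("ABC", "1"), ("ABC", "1")]

plain = set(concat_keys(rows))           # both pairs collapse to the same text
delimited = set(concat_keys(rows, "|"))  # the separator keeps them apart
```

In the worksheet the equivalent would be something like =A2&"|"&B2, using a character that does not occur in the data.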


UPDATE: You can do this now automatically with Excel 2013. I've created this as a new answer because my previous answer actually solves a slightly different problem.

If you have that version, select your data to create a pivot table and, when you create the table, make sure the 'Add this data to the Data Model' box is ticked.

Then, when your pivot table opens, create your rows, columns and values normally. Click the field you want the distinct count of and edit its Value Field Settings.

Finally, scroll down to the very last option and choose 'Distinct Count'.

This should update your pivot table values to show the data you're looking for.


I found the easiest approach is to use the Distinct Count option under Value Field Settings (left click the field in the Values pane). The option for Distinct Count is at the very bottom of the list.

[Screenshots: where to click, and the pivot table before (normal Count) and after (Distinct Count)]


Siddharth's answer is terrific.

However, this technique can hit trouble when working with a large set of data (my computer froze up on 50,000 rows). Some less processor-intensive methods:

Single uniqueness check

  1. Sort by the two columns (A, B in this example)
  2. Use a formula that looks at less data

    =IF(SUMPRODUCT(($A2:$A3=A2)*($B2:$B3=B2))>1,0,1) 
    

Multiple uniqueness checks

If you need to check uniqueness in different columns, you can't rely on two sorts.

Instead,

  1. Sort single column (A)
  2. Add formula covering the maximum number of records for each grouping. If ABC might have 50 rows, the formula will be

    =IF(SUMPRODUCT(($A2:$A51=A2)*($B2:$B51=B2))>1,0,1)
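
A rough Python sketch of the windowed check, under my reading of the SUMPRODUCT formulas above: each row looks at itself plus the rows just below it and is flagged 1 only if no later row in that window repeats the same pair. The names `window_flags` and `window` are hypothetical.

```python
def window_flags(rows, window):
    """1 if no later row inside the window repeats this (a, b) pair, else 0."""
    flags = []
    for i, (a, b) in enumerate(rows):
        block = rows[i : i + window]  # current row plus the rows below it
        matches = sum(1 for x, y in block if x == a and y == b)
        flags.append(0 if matches > 1 else 1)
    return flags

def sum_by_group(rows, flags):
    """Add up the flags per value of the first column, like the pivot table's Sum."""
    totals = {}
    for (a, _), f in zip(rows, flags):
        totals[a] = totals.get(a, 0) + f
    return totals

# Sorted by the first column only, as in the multiple-checks case.
rows = [
    ("ABC", 123), ("ABC", 123), ("ABC", 123),
    ("DEF", 456), ("DEF", 567), ("DEF", 456), ("DEF", 456),
]

wide = sum_by_group(rows, window_flags(rows, 4))    # window covers the largest group
narrow = sum_by_group(rows, window_flags(rows, 2))  # two-row window, needs both columns sorted
```

With the data sorted on one column only, the two-row window overcounts DEF, which is why the window has to cover the largest group in that case.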
    

You can make an additional column to store the uniqueness, then sum that up in your pivot table.

What I mean is, cell C1 should always be 1. Cell C2 should contain the formula =IF(COUNTIF($A$1:$A1,$A2)*COUNTIF($B$1:$B1,$B2)>0,0,1). Copy this formula down so cell C3 would contain =IF(COUNTIF($A$1:$A2,$A3)*COUNTIF($B$1:$B2,$B3)>0,0,1) and so on.

If you have a header cell, you'll want to move these all down a row and your C3 formula should be =IF(COUNTIF($A$2:$A2,$A3)*COUNTIF($B$2:$B2,$B3)>0,0,1).
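
One caveat, shown as a hedged Python sketch: multiplying two separate COUNTIFs treats a row as a repeat whenever its A value and its B value have each appeared somewhere above, even if never on the same row. A COUNTIFS-style test on the pair avoids that. Both function names are made up for illustration.

```python
def product_flags(rows):
    """Mimic =IF(COUNTIF(A above,A)*COUNTIF(B above,B)>0,0,1); the first row gets 1."""
    flags = []
    for i, (a, b) in enumerate(rows):
        above = rows[:i]
        a_seen = sum(1 for x, _ in above if x == a)
        b_seen = sum(1 for _, y in above if y == b)
        flags.append(0 if a_seen * b_seen > 0 else 1)
    return flags

def pair_flags(rows):
    """Flag 1 unless the same (a, b) pair already appeared together above."""
    return [0 if pair in rows[:i] else 1 for i, pair in enumerate(rows)]

question_rows = [
    ("ABC", 123), ("ABC", 123), ("ABC", 123),
    ("DEF", 456), ("DEF", 567), ("DEF", 456), ("DEF", 456),
]
tricky_rows = [("ABC", 123), ("DEF", 456), ("ABC", 456)]
```

On the question's data the two agree; on `tricky_rows` the product version misses the new pair ("ABC", 456) because "ABC" and 456 were both seen earlier on different rows.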


You can use COUNTIFS for multiple criteria:

=1/COUNTIFS(A:A,A2,B:B,B2) and then drag down. You can put as many criteria as you want in there, but it tends to take a lot of time to process.
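
A minimal Python sketch of how the COUNTIFS weights add up per group once the helper column is summed in a pivot table. The name `distinct_per_group` is hypothetical; the rows are the question's sample data.

```python
from collections import Counter, defaultdict
from fractions import Fraction

def distinct_per_group(rows):
    """Sum 1/COUNTIFS(A:A,A2,B:B,B2) over the rows of each group."""
    pair_counts = Counter(rows)  # how often each (a, b) pair occurs in the data
    totals = defaultdict(Fraction)
    for a, b in rows:
        totals[a] += Fraction(1, pair_counts[(a, b)])
    return {group: int(total) for group, total in totals.items()}

rows = [
    ("ABC", 123), ("ABC", 123), ("ABC", 123),
    ("DEF", 456), ("DEF", 567), ("DEF", 456), ("DEF", 456),
]
result = distinct_per_group(rows)
```

Each distinct pair contributes exactly 1 to its group, so the per-group sums are the distinct counts the question asks for.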


See Debra Dalgleish's Count Unique Items


