[sql] How to select only the first rows for each unique value of a column?

Let's say I have a table of customer addresses:

+-----------------------+------------------------+
|         CName         |      AddressLine       |
+-----------------------+------------------------+
|  John Smith           |  123 Nowheresville     |
|  Jane Doe             |  456 Evergreen Terrace |
|  John Smith           |  999 Somewhereelse     |
|  Joe Bloggs           |  1 Second Ave          |
+-----------------------+------------------------+

In the table, one customer like John Smith can have multiple addresses. I need the SELECT query for this table to return only first row found where there are duplicates in 'CName'. For this table it should return all rows except the 3rd (or 1st - any of those two addresses are okay but only one can be returned).

Is there a keyword I can add to the SELECT query to filter based on whether the server has already seen the column value before?

This question is related to sql sql-server tsql select unique

The answer is


In SQL 2k5+, you can do something like:

;with cte as (
  select CName, AddressLine,
  rank() over (partition by CName order by AddressLine) as [r]
  from MyTable
)
select CName, AddressLine
from cte
where [r] = 1

You can use the row_numer() over(partition by ...) syntax like so:

select * from
(
select *
, ROW_NUMBER() OVER(PARTITION BY CName ORDER BY AddressLine) AS row
from myTable
) as a
where row = 1

What this does is that it creates a column called row, which is a counter that increments every time it sees the same CName, and indexes those occurrences by AddressLine. By imposing where row = 1, one can select the CName whose AddressLine comes first alphabetically. If the order by was desc, then it would pick the CName whose AddressLine comes last alphabetically.


This will give you one row of each duplicate row. It will also give you the bit-type columns, and it works at least in MS Sql Server.

(select cname, address 
from (
  select cname,address, rn=row_number() over (partition by cname order by cname) 
  from customeraddresses  
) x 
where rn = 1) order by cname

If you want to find all the duplicates instead, just change the rn= 1 to rn > 1. Hope this helps


You can use row_number() to get the row number of the row. It uses the over command - the partition by clause specifies when to restart the numbering and the order by selects what to order the row number on. Even if you added an order by to the end of your query, it would preserve the ordering in the over command when numbering.

select *
from mytable
where row_number() over(partition by Name order by AddressLine) = 1

to get every unique value from your customer table, use

SELECT DISTINCT CName FROM customertable;

more in-depth of w3schools: https://www.w3schools.com/sql/sql_distinct.asp


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to tsql

Passing multiple values for same variable in stored procedure Count the Number of Tables in a SQL Server Database Change Date Format(DD/MM/YYYY) in SQL SELECT Statement Stored procedure with default parameters Format number as percent in MS SQL Server EXEC sp_executesql with multiple parameters SQL Server after update trigger How to compare datetime with only date in SQL Server Text was truncated or one or more characters had no match in the target code page including the primary key in an unpivot Printing integer variable and string on same line in SQL

Examples related to select

Warning: Use the 'defaultValue' or 'value' props on <select> instead of setting 'selected' on <option> SQL query to check if a name begins and ends with a vowel Angular2 *ngFor in select list, set active based on string from object SQL: Two select statements in one query How to get selected value of a dropdown menu in ReactJS DATEDIFF function in Oracle How to filter an array of objects based on values in an inner array with jq? Select unique values with 'select' function in 'dplyr' library how to set select element as readonly ('disabled' doesnt pass select value on server) Trying to use INNER JOIN and GROUP BY SQL with SUM Function, Not Working

Examples related to unique

Count unique values with pandas per groups Find the unique values in a column and then sort them How can I check if the array of objects have duplicate property values? Firebase: how to generate a unique numeric ID for key? pandas unique values multiple columns Select unique values with 'select' function in 'dplyr' library Generate 'n' unique random numbers within a range SQL - select distinct only on one column Can I use VARCHAR as the PRIMARY KEY? Count unique values in a column in Excel