[sql] SELECT DISTINCT on one column

Using SQL Server, I have...

ID  SKU     PRODUCT
=======================
1   FOO-23  Orange
2   BAR-23  Orange
3   FOO-24  Apple
4   FOO-25  Orange

I want

1   FOO-23  Orange
3   FOO-24  Apple

This query isn't getting me there. How can I SELECT DISTINCT on just one column?

SELECT 
[ID],[SKU],[PRODUCT]
FROM [TestData] 
WHERE ([PRODUCT] = 
(SELECT DISTINCT [PRODUCT] FROM [TestData] WHERE ([SKU] LIKE 'FOO-%')) 
ORDER BY [ID]

This question is related to sql sql-server tsql distinct

The answer is


Assuming that you're on SQL Server 2005 or greater, you can use a derived table (or a CTE) with ROW_NUMBER():

SELECT  *
FROM    (SELECT ID, SKU, Product,
                ROW_NUMBER() OVER (PARTITION BY PRODUCT ORDER BY ID) AS RowNumber
         FROM   TestData
         WHERE  SKU LIKE 'FOO-%') AS a
WHERE   a.RowNumber = 1
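
The same filter can also be written as a common table expression, which some people find easier to read. This is only a sketch: the table and column names come from the question, and the CTE name Ranked is made up for illustration.

```sql
-- Number the rows within each product by ascending ID, then keep row 1.
-- "Ranked" is a hypothetical CTE name; TestData is the table from the question.
WITH Ranked AS
(
    SELECT ID, SKU, Product,
           ROW_NUMBER() OVER (PARTITION BY Product ORDER BY ID) AS RowNumber
    FROM   TestData
    WHERE  SKU LIKE 'FOO-%'
)
SELECT   ID, SKU, Product
FROM     Ranked
WHERE    RowNumber = 1
ORDER BY ID;
```

If this is not the first statement in the batch, the statement before WITH has to end with a semicolon.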

Try this:

SELECT * FROM [TestData] WHERE [ID] IN (SELECT MIN([ID]) FROM [TestData] GROUP BY [PRODUCT])


I know this was asked over 6 years ago, but knowledge is still knowledge. This is a different solution from all of the above, as I had to run it under SQL Server 2000:

DECLARE @TestData TABLE([ID] int, [SKU] char(6), [Product] varchar(15))
INSERT INTO @TestData values (1 ,'FOO-23', 'Orange')
INSERT INTO @TestData values (2 ,'BAR-23', 'Orange')
INSERT INTO @TestData values (3 ,'FOO-24', 'Apple')
INSERT INTO @TestData values (4 ,'FOO-25', 'Orange')

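-- ORDER BY makes TOP 1 deterministic: without it, any row of the product may be returned
SELECT DISTINCT  [ID] = ( SELECT TOP 1 [ID]  FROM @TestData Y WHERE Y.[Product] = X.[Product] ORDER BY Y.[ID])
                ,[SKU]= ( SELECT TOP 1 [SKU] FROM @TestData Y WHERE Y.[Product] = X.[Product] ORDER BY Y.[ID])
                ,[PRODUCT] 
            FROM @TestData X  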
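
Another option that needs neither ROW_NUMBER() nor TOP is an anti-join with NOT EXISTS: keep a row only when no other row of the same product has a lower ID. The sketch below assumes the permanent TestData table from the question; the aliases t and o are illustrative.

```sql
-- Keep each FOO- row for which no FOO- row of the same product has a lower ID.
-- ID and SKU always come from the same row because no aggregate is involved.
SELECT   t.ID, t.SKU, t.Product
FROM     TestData t
WHERE    t.SKU LIKE 'FOO-%'
  AND    NOT EXISTS (SELECT 1
                     FROM   TestData o
                     WHERE  o.Product = t.Product
                       AND  o.SKU LIKE 'FOO-%'
                       AND  o.ID < t.ID)
ORDER BY t.ID;
```

This relies on ID being unique; with duplicate IDs inside one product, more than one row per product could be returned.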

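-- Note: MIN(id) and MIN(sku) are computed independently, so they can come
-- from different rows when the lowest SKU does not belong to the lowest ID.
SELECT MIN(id) AS ID, MIN(sku) AS SKU, Product
    FROM TestData
    WHERE sku LIKE 'FOO%' -- if you want only the SKUs that match FOO%
    GROUP BY Product
    ORDER BY ID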

Here is a version that is basically the same as a couple of the other answers, but that you can copy and paste into SQL Server Management Studio to test without creating any unwanted tables, thanks to some inline values.

WITH [TestData]([ID],[SKU],[PRODUCT]) AS
(
    SELECT *
    FROM (
        VALUES
        (1,   'FOO-23',  'Orange'),
        (2,   'BAR-23',  'Orange'),
        (3,   'FOO-24',  'Apple'),
        (4,   'FOO-25',  'Orange')
    )
    AS [TestData]([ID],[SKU],[PRODUCT])
)

SELECT * FROM [TestData] WHERE [ID] IN 
(
    SELECT MIN([ID]) 
    FROM [TestData] 
    GROUP BY [PRODUCT]
)

Result

ID  SKU     PRODUCT
1   FOO-23  Orange
3   FOO-24  Apple

I have ignored the following ...

WHERE ([SKU] LIKE 'FOO-%')

as it's only part of the author's faulty code and not part of the question. It's unlikely to be helpful to people looking here.


The simplest solution would be to use a subquery for finding the minimum ID matching your query. In the subquery you use GROUP BY instead of DISTINCT:

SELECT * FROM [TestData] WHERE [ID] IN (
   SELECT MIN([ID]) FROM [TestData]
   WHERE [SKU] LIKE 'FOO-%'
   GROUP BY [PRODUCT]
)
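
To see what the WHERE filter inside the subquery changes, here is a small sketch that can be run as one batch. It uses the sample rows from the question plus one extra row (ID 5, 'BAR-24', 'Pear') that is not in the original data and is added here only for illustration.

```sql
-- Sample rows from the question, plus one hypothetical row (ID 5) added
-- only to show the effect of the filter.
DECLARE @TestData TABLE (ID int, SKU char(6), Product varchar(15));
INSERT INTO @TestData VALUES (1, 'FOO-23', 'Orange');
INSERT INTO @TestData VALUES (2, 'BAR-23', 'Orange');
INSERT INTO @TestData VALUES (3, 'FOO-24', 'Apple');
INSERT INTO @TestData VALUES (4, 'FOO-25', 'Orange');
INSERT INTO @TestData VALUES (5, 'BAR-24', 'Pear');   -- hypothetical

-- With the filter: lowest ID per product among FOO- SKUs (IDs 1 and 3).
SELECT   ID, SKU, Product
FROM     @TestData
WHERE    ID IN (SELECT   MIN(ID)
                FROM     @TestData
                WHERE    SKU LIKE 'FOO-%'
                GROUP BY Product)
ORDER BY ID;

-- Without the filter: lowest ID per product overall, so Pear also
-- appears (IDs 1, 3 and 5).
SELECT   ID, SKU, Product
FROM     @TestData
WHERE    ID IN (SELECT   MIN(ID)
                FROM     @TestData
                GROUP BY Product)
ORDER BY ID;
```

On the question's four rows alone, both queries return the same two rows, which is why the filter is easy to overlook.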

Try this:

SELECT 
    t.*
    FROM TestData t
        INNER JOIN (SELECT
                        MIN(ID) as MinID
                        FROM TestData
                        WHERE SKU LIKE 'FOO-%'
                   ) dt ON t.ID=dt.MinID

EDIT
Once the OP corrected the sample output (it previously showed only ONE result row and now shows all expected rows), this is the correct query:

declare @TestData table (ID int, sku char(6), product varchar(15))
insert into @TestData values (1 ,  'FOO-23'      ,'Orange')
insert into @TestData values (2 ,  'BAR-23'      ,'Orange')
insert into @TestData values (3 ,  'FOO-24'      ,'Apple')
insert into @TestData values (4 ,  'FOO-25'      ,'Orange')

--basically the same as @Aaron Alton's answer:
SELECT
    dt.ID, dt.SKU, dt.Product
    FROM (SELECT
              ID, SKU, Product, ROW_NUMBER() OVER (PARTITION BY PRODUCT ORDER BY ID) AS RowID
              FROM @TestData
              WHERE  SKU LIKE 'FOO-%'
         ) AS dt
    WHERE dt.RowID=1
    ORDER BY dt.ID
