[sql] How to use DISTINCT and ORDER BY in same SELECT statement?

After executing the following statement:

SELECT  Category  FROM MonitoringJob ORDER BY CreationDate DESC

I am getting the following values from the database:

test3
test3
bildung
test4
test3
test2
test1

but I want the duplicates removed, like this:

bildung
test4
test3
test2
test1

I tried to use DISTINCT but it doesn't work with ORDER BY in one statement. Please help.

Important:

  1. I tried it with:

    SELECT DISTINCT Category FROM MonitoringJob ORDER BY CreationDate DESC
    

    it doesn't work.

  2. Order by CreationDate is very important.

This question is related to sql sql-order-by distinct

The answer is


If the output of MAX(CreationDate) is not wanted - like in the example of the original question - the only answer is the second statement of Prashant Gupta's answer:

SELECT [Category] FROM [MonitoringJob] 
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC

Explanation: you can't use the ORDER BY clause in an inline function, so the statement in the answer of Prutswonder is not useable in this case, you can't put an outer select around it and discard the MAX(CreationDate) part.


if object_id ('tempdb..#tempreport') is not null
begin  
drop table #tempreport
end 
create table #tempreport (
Category  nvarchar(510),
CreationDate smallint )
insert into #tempreport 
select distinct Category from MonitoringJob (nolock) 
select * from #tempreport  ORDER BY CreationDate DESC

Extended sort key columns

The reason why what you want to do doesn't work is because of the logical order of operations in SQL, which, for your first query, is (simplified):

  • FROM MonitoringJob
  • SELECT Category, CreationDate i.e. add a so called extended sort key column
  • ORDER BY CreationDate DESC
  • SELECT Category i.e. remove the extended sort key column again from the result.

So, thanks to the SQL standard extended sort key column feature, it is totally possible to order by something that is not in the SELECT clause, because it is being temporarily added to it behind the scenes.

So, why doesn't this work with DISTINCT?

If we add the DISTINCT operation, it would be added between SELECT and ORDER BY:

  • FROM MonitoringJob
  • SELECT Category, CreationDate
  • DISTINCT
  • ORDER BY CreationDate DESC
  • SELECT Category

But now, with the extended sort key column CreationDate, the semantics of the DISTINCT operation has been changed, so the result will no longer be the same. This is not what we want, so both the SQL standard, and all reasonable databases forbid this usage.

Workarounds

It can be emulated with standard syntax as follows

SELECT Category
FROM (
  SELECT Category, MAX(CreationDate) AS CreationDate
  FROM MonitoringJob
  GROUP BY Category
) t
ORDER BY CreationDate DESC

Or, just simply (in this case), as shown also by Prutswonder

SELECT Category, MAX(CreationDate) AS CreationDate
FROM MonitoringJob
GROUP BY Category
ORDER BY CreationDate DESC

I have blogged about SQL DISTINCT and ORDER BY more in detail here.


It can be done using inner query Like this

$query = "SELECT * 
            FROM (SELECT Category  
                FROM currency_rates                 
                ORDER BY id DESC) as rows               
            GROUP BY currency";

Try next, but it's not useful for huge data...

SELECT DISTINCT Cat FROM (
  SELECT Category as Cat FROM MonitoringJob ORDER BY CreationDate DESC
);

SELECT DISTINCT Category FROM MonitoringJob ORDER BY Category ASC

2) Order by CreationDate is very important

The original results indicated that "test3" had multiple results...

It's very easy to start using MAX all the time to remove duplicates in Group By's... and forget or ignore what the underlying question is...

The OP presumably realised that using MAX was giving him the last "created" and using MIN would give the first "created"...


You can use CTE:

WITH DistinctMonitoringJob AS (
    SELECT DISTINCT Category Distinct_Category FROM MonitoringJob 
)

SELECT Distinct_Category 
FROM DistinctMonitoringJob 
ORDER BY Distinct_Category DESC

Just use this code, If you want values of [Category] and [CreationDate] columns

SELECT [Category], MAX([CreationDate]) FROM [MonitoringJob] 
             GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC

Or use this code, If you want only values of [Category] column.

SELECT [Category] FROM [MonitoringJob] 
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC

You'll have all the distinct records what ever you want.


By subquery, it should work:

    SELECT distinct(Category) from MonitoringJob  where Category in(select Category from MonitoringJob order by CreationDate desc);

Distinct will sort records in ascending order. If you want to sort in desc order use:

SELECT DISTINCT Category
FROM MonitoringJob
ORDER BY Category DESC

If you want to sort records based on CreationDate field then this field must be in the select statement:

SELECT DISTINCT Category, creationDate
FROM MonitoringJob
ORDER BY CreationDate DESC

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to sql-order-by

Laravel Eloquent: Ordering results of all() SQL for ordering by number - 1,2,3,4 etc instead of 1,10,11,12 SQL ORDER BY multiple columns How to use Oracle ORDER BY and ROWNUM correctly? MySQL order by before group by Ordering by specific field value first MySQL ORDER BY multiple column ASC and DESC How can I get just the first row in a result set AFTER ordering? SQL order string as number Order by multiple columns with Doctrine

Examples related to distinct

Using DISTINCT along with GROUP BY in SQL Server How to "select distinct" across multiple data frame columns in pandas? Laravel Eloquent - distinct() and count() not working properly together SQL - select distinct only on one column SQL: Group by minimum value in one field while selecting distinct rows Count distinct value pairs in multiple columns in SQL sql query distinct with Row_Number Eliminating duplicate values based on only one column of the table MongoDB distinct aggregation Pandas count(distinct) equivalent