DISTINCT
to remove duplicate GROUPING SETS
from the GROUP BY
clauseIn a completely silly example using GROUPING SETS()
in general (or the special grouping sets ROLLUP()
or CUBE()
in particular), you could use DISTINCT
in order to remove the duplicate values produced by the grouping sets again:
SELECT DISTINCT actors
FROM (VALUES('a'), ('a'), ('b'), ('b')) t(actors)
GROUP BY CUBE(actors, actors)
With DISTINCT
:
actors
------
NULL
a
b
Without DISTINCT
:
actors
------
a
b
NULL
a
b
a
b
But why, apart from making an academic point, would you do that?
DISTINCT
to find unique aggregate function valuesIn a less far-fetched example, you might be interested in the DISTINCT
aggregated values, such as, how many different duplicate numbers of actors are there?
SELECT DISTINCT COUNT(*)
FROM (VALUES('a'), ('a'), ('b'), ('b')) t(actors)
GROUP BY actors
Answer:
count
-----
2
DISTINCT
to remove duplicates with more than one GROUP BY
columnAnother case, of course, is this one:
SELECT DISTINCT actors, COUNT(*)
FROM (VALUES('a', 1), ('a', 1), ('b', 1), ('b', 2)) t(actors, id)
GROUP BY actors, id
With DISTINCT
:
actors count
-------------
a 2
b 1
Without DISTINCT
:
actors count
-------------
a 2
b 1
b 1
For more details, I've written some blog posts, e.g. about GROUPING SETS
and how they influence the GROUP BY
operation, or about the logical order of SQL operations (as opposed to the lexical order of operations).