When I perform "SELECT * FROM table" I got results like below:
1 item1 data1
2 item1 data2
3 item2 data3
4 item3 data4
As you can see, there are dup records from column2 (item1 are dupped). So how could I just get result like this:
1 item1 data1
2 item2 data3
3 item3 data4
Only one record are returned from the duplicate, along with the rest of the unique records.
Select Eff_st from ( select EFF_ST,ROW_NUMBER() over(PARTITION BY eff_st) XYZ - from ABC.CODE_DIM
) where XYZ= 1 order by EFF_ST fetch first 5 row only
just use inner join because group by won't work with multiple columns saying not contained in either an aggregate function.
SELECT a.*
FROM yourtable a
INNER JOIN
(SELECT yourcolumn,
MIN(id) as id
FROM yourtable
GROUP BY yourcolumn
) AS b
ON a.yourcolumn= b.yourcolumn
AND a.id = b.id;
If you only need to remove duplicates then use DISTINCT
. GROUP BY
should be used to apply aggregate operators to each group
To get all the columns in your result you need to place something as:
SELECT distinct a, Table.* FROM Table
it will place a as the first column and the rest will be ALL of the columns in the same order as your definition. This is, column a will be repeated.
I find that if I can't use DISTINCT for any reason, then GROUP BY will work.
There are 4 methods you can use:
Consider the following sample TABLE
with test data:
/** Create test table */
CREATE TEMPORARY TABLE dupes(word text, num int, id int);
/** Add test data with duplicates */
INSERT INTO dupes(word, num, id)
VALUES ('aaa', 100, 1)
,('bbb', 200, 2)
,('ccc', 300, 3)
,('bbb', 400, 4)
,('bbb', 200, 5) -- duplicate
,('ccc', 300, 6) -- duplicate
,('ddd', 400, 7)
,('bbb', 400, 8) -- duplicate
,('aaa', 100, 9) -- duplicate
,('ccc', 300, 10); -- duplicate
This is the most simple and straight forward, but also the most limited way:
SELECT DISTINCT word, num
FROM dupes
ORDER BY word, num;
/*
word|num|
----|---|
aaa |100|
bbb |200|
bbb |400|
ccc |300|
ddd |400|
*/
Grouping allows you to add aggregated data, like the min(id)
, max(id)
, count(*)
, etc:
SELECT word, num, min(id), max(id), count(*)
FROM dupes
GROUP BY word, num
ORDER BY word, num;
/*
word|num|min|max|count|
----|---|---|---|-----|
aaa |100| 1| 9| 2|
bbb |200| 2| 5| 2|
bbb |400| 4| 8| 2|
ccc |300| 3| 10| 3|
ddd |400| 7| 7| 1|
*/
Using a subquery, you can first identify the duplicate rows to ignore, and then filter them out in the outer query with the WHERE NOT IN (subquery)
construct:
/** Find the higher id values of duplicates, distinct only added for clarity */
SELECT distinct d2.id
FROM dupes d1
INNER JOIN dupes d2 ON d2.word=d1.word AND d2.num=d1.num
WHERE d2.id > d1.id
/*
id|
--|
5|
6|
8|
9|
10|
*/
/** Use the previous query in a subquery to exclude the dupliates with higher id values */
SELECT *
FROM dupes
WHERE id NOT IN (
SELECT d2.id
FROM dupes d1
INNER JOIN dupes d2 ON d2.word=d1.word AND d2.num=d1.num
WHERE d2.id > d1.id
)
ORDER BY word, num;
/*
word|num|id|
----|---|--|
aaa |100| 1|
bbb |200| 2|
bbb |400| 4|
ccc |300| 3|
ddd |400| 7|
*/
In the Common Table Expression (CTE), select the ROW_NUMBER(), partitioned by the group column and ordered in the desired order. Then SELECT only the records that have ROW_NUMBER() = 1
:
WITH CTE AS (
SELECT *
,row_number() OVER(PARTITION BY word, num ORDER BY id) AS row_num
FROM dupes
)
SELECT word, num, id
FROM cte
WHERE row_num = 1
ORDER BY word, num;
/*
word|num|id|
----|---|--|
aaa |100| 1|
bbb |200| 2|
bbb |400| 4|
ccc |300| 3|
ddd |400| 7|
*/
It depends on which rown you want to return for each unique item. Your data seems to indicate the minimum data value so in this instance for SQL Server.
SELECT item, min(data)
FROM table
GROUP BY item
Source: Stackoverflow.com