Find duplicate records in a table using SQL Server

Question

I am validating a table that holds transaction-level data for an eCommerce site, and I need to find the exact errors. I want your help finding duplicate records in a 50-column table on SQL Server.

Suppose my data is:

    OrderNo shoppername amountpayed city Item
    1       Sam         10          A    Iphone
    1       Sam         10          A    Iphone      --->> Duplication to be detected
    1       Sam         5           A    Ipod
    2       John        20          B    Macbook
    3       John        25          B    Macbookair
    4       Jack        5           A    Ipod

Suppose I use the query below:

    SELECT shoppername, COUNT(*) AS cnt
    FROM dbo.sales
    GROUP BY shoppername
    HAVING COUNT(*) > 1

It returns:

    Sam  2
    John 2

But I don't want to find duplicates over just 1 or 2 columns. I want to find duplicates over all the columns of my data together. I want the result to be:

    1       Sam         10          A    Iphone

User · Answer

    SELECT OrderNo, shoppername, amountPayed, city, item, COUNT(*) AS cnt
    FROM dbo.sales
    GROUP BY OrderNo, shoppername, amountPayed, city, item
    HAVING COUNT(*) > 1
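To try the query above without touching a real table, here is a minimal sketch that loads the question's six sample rows into a table variable. The name @sales and the column types are assumptions made for illustration; only the column names come from the question.

```sql
-- Hypothetical stand-in for dbo.sales; column types are assumed.
DECLARE @sales TABLE (
    OrderNo     INT,
    shoppername VARCHAR(50),
    amountPayed INT,
    city        VARCHAR(10),
    item        VARCHAR(50)
);

-- Sample rows copied from the question (single-row INSERTs also work on SQL Server 2005).
INSERT INTO @sales VALUES (1, 'Sam',  10, 'A', 'Iphone');
INSERT INTO @sales VALUES (1, 'Sam',  10, 'A', 'Iphone');
INSERT INTO @sales VALUES (1, 'Sam',  5,  'A', 'Ipod');
INSERT INTO @sales VALUES (2, 'John', 20, 'B', 'Macbook');
INSERT INTO @sales VALUES (3, 'John', 25, 'B', 'Macbookair');
INSERT INTO @sales VALUES (4, 'Jack', 5,  'A', 'Ipod');

-- Group on every column: only rows identical in all columns share a group.
SELECT OrderNo, shoppername, amountPayed, city, item, COUNT(*) AS cnt
FROM @sales
GROUP BY OrderNo, shoppername, amountPayed, city, item
HAVING COUNT(*) > 1;
```

With this sample data the only group containing more than one row is (1, Sam, 10, A, Iphone), which has cnt = 2. For the real 50-column table, every column would need to appear in both the SELECT list and the GROUP BY clause.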

User · Answer

First of all, I doubt that the result shown in the question is accurate: there seem to be three 'Sam' rows in the original table. But that is not critical to the question.

Now for the question itself. Based on your table, the best way to show duplicate values is to use COUNT(*) with a GROUP BY clause. The query would look like this:

    SELECT OrderNo, shoppername, amountPayed, city, item, COUNT(*) AS RepeatTimes
    FROM dbo.sales
    GROUP BY OrderNo, shoppername, amountPayed, city, item
    HAVING COUNT(*) > 1

The reason is that all the columns of your table together uniquely identify each record, which means records are considered duplicates only when the values in every column are exactly the same. You also want to show all fields for the duplicate records, so the GROUP BY must not leave out any column; a column that is left out cannot be shown, because you can only select columns that participate in the GROUP BY clause.

Now I would like to give you an example of WITH ... ROW_NUMBER() OVER(...), which combines a common table expression with the ROW_NUMBER function.

Suppose you have nearly the same table, but with one extra column called Shipping Date whose value may differ even when the rest are the same. Here it is:

    OrderNo shoppername amountpayed city Item        Shipping Date
    1       Sam         10          A    Iphone      2016-01-01
    1       Sam         10          A    Iphone      2016-02-02
    1       Sam         5           A    Ipod        2016-03-03
    2       John        20          B    Macbook     2016-04-04
    3       John        25          B    Macbookair  2016-05-05
    4       Jack        5           A    Ipod        2016-06-06

Notice that row #2 is not a duplicate if you still take all columns as a unit. But what if you want to treat such rows as duplicates as well? In that case you should use WITH ... ROW_NUMBER() OVER(...), and the query would look like this:

    WITH TABLEEXPRESSION
    AS
    (SELECT *, ROW_NUMBER() OVER (PARTITION BY OrderNo, shoppername, amountPayed, city, item
                                  ORDER BY [Shipping Date]) AS Identifier --if you consider the one with the later shipping date as the duplicate
    FROM dbo.sales)
    SELECT * FROM TABLEEXPRESSION
    WHERE Identifier <> 1 --or use > 1

The above query returns the result together with Shipping Date, for example:

    OrderNo shoppername amountpayed city Item        Shipping Date   Identifier
    1       Sam         10          A    Iphone      2016-02-02      2

Note that this row is different from the one with 2016-01-01. The 2016-02-02 row is the one returned because of PARTITION BY OrderNo, shoppername, amountPayed, city, item ORDER BY [Shipping Date]: Shipping Date is NOT one of the columns used to decide whether records are duplicates, it only orders the rows inside each group, so the row with 2016-02-02 is numbered 2. That row can still be a perfectly good answer to your question.

To summarize: using COUNT(*) together with a GROUP BY clause is the best choice when the columns in the GROUP BY clause are all you need in the result; any column that does not participate in the GROUP BY is lost. WITH ... ROW_NUMBER() OVER(...) is suitable in every scenario where you want to find duplicate records; however, the query is a little more complicated to write and somewhat over-engineered compared with the former.

If your purpose is to delete duplicate records from the table, you have to use the latter: WITH ... ROW_NUMBER() OVER(...) followed by DELETE FROM ... WHERE.

Hope this helps.
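As a sketch of the delete variant mentioned at the end of the answer above: the CTE numbers the rows within each group of otherwise identical records, and the DELETE removes every row numbered above 1. The temp table #sales_demo, the CTE name Ranked, and the column types are assumptions made for illustration, not part of the original answer.

```sql
-- Hypothetical temp table standing in for dbo.sales; column types are assumed.
CREATE TABLE #sales_demo (
    OrderNo         INT,
    shoppername     VARCHAR(50),
    amountPayed     INT,
    city            VARCHAR(10),
    item            VARCHAR(50),
    [Shipping Date] DATETIME
);

-- Sample rows from the answer; 'yyyymmdd' literals avoid language-dependent date parsing.
INSERT INTO #sales_demo VALUES (1, 'Sam',  10, 'A', 'Iphone',     '20160101');
INSERT INTO #sales_demo VALUES (1, 'Sam',  10, 'A', 'Iphone',     '20160202');
INSERT INTO #sales_demo VALUES (1, 'Sam',  5,  'A', 'Ipod',       '20160303');
INSERT INTO #sales_demo VALUES (2, 'John', 20, 'B', 'Macbook',    '20160404');
INSERT INTO #sales_demo VALUES (3, 'John', 25, 'B', 'Macbookair', '20160505');
INSERT INTO #sales_demo VALUES (4, 'Jack', 5,  'A', 'Ipod',       '20160606');

-- Number the rows inside each group; the earliest shipping date gets 1.
WITH Ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY OrderNo, shoppername, amountPayed, city, item
               ORDER BY [Shipping Date]
           ) AS Identifier
    FROM #sales_demo
)
-- Deleting through the CTE removes the matching rows from the underlying table.
DELETE FROM Ranked
WHERE Identifier > 1;

-- Inspect what is left, then clean up.
SELECT * FROM #sales_demo ORDER BY OrderNo, [Shipping Date];
DROP TABLE #sales_demo;
```

With this sample data only the Sam / Iphone row dated 2016-02-02 is deleted, leaving five rows. It is worth running the CTE with a plain SELECT first, or wrapping the DELETE in a transaction, before applying it to real data.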

User · Answer

    WITH x AS
    (SELECT shoppername, COUNT(shoppername) AS cnt
     FROM sales
     GROUP BY shoppername
     HAVING COUNT(shoppername) > 1)
    SELECT t.*
    FROM x, win_gp_pin1510 t
    WHERE x.shoppername = t.shoppername
    ORDER BY t.shoppername

User · Answer

You can use either of the methods below to get the output.

    WITH Ctec AS
    (SELECT *, ROW_NUMBER() OVER (PARTITION BY name ORDER BY Name) Rnk
     FROM Table_A)
    SELECT Name FROM ctec
    WHERE rnk > 1

    SELECT name FROM Table_A
    GROUP BY name
    HAVING COUNT(*) > 1

User · Answer

    SELECT shoppername, COUNT(Item) AS cnt
    FROM dbo.sales
    GROUP BY shoppername
    HAVING COUNT(Item) > 1

User · Answer

    WITH x AS
    (SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY OrderNo, item ORDER BY OrderNo)
     FROM #temp1)
    SELECT * FROM x
    WHERE rn > 1

You can remove the duplicates by replacing the SELECT statement with:

    DELETE x WHERE rn > 1

User · Answer

    SQL> SELECT JOB, COUNT(JOB) FROM EMP GROUP BY JOB;

    JOB       COUNT(JOB)
    --------- ----------
    ANALYST            2
    CLERK              4
    MANAGER            3
    PRESIDENT          1
    SALESMAN           4

User · Answer

Just add all the fields to the query, and remember to add them to the GROUP BY as well.

    SELECT shoppername, a, b, amountpayed, item, COUNT(*) AS cnt
    FROM dbo.sales
    GROUP BY shoppername, a, b, amountpayed, item
    HAVING COUNT(*) > 1

User · Answer

    SELECT EventID, COUNT(*) AS cnt
    FROM dbo.EventInstances
    GROUP BY EventID
    HAVING COUNT(*) > 1

User · Answer

Try this instead:

    SELECT MAX(shoppername), COUNT(*) AS cnt
    FROM dbo.sales
    GROUP BY CHECKSUM(*)
    HAVING COUNT(*) > 1

Read about the CHECKSUM function first: different rows can produce the same checksum, so rows that are not identical may end up in the same group.

User · Answer

To get the list of records that occur more than once, use the following command:

    SELECT field1, field2, field3, COUNT(*)
    FROM table_name
    GROUP BY field1, field2, field3
    HAVING COUNT(*) > 1

User · Answer

Try this:

    WITH T1 AS
    (SELECT LastName, COUNT(1) AS [COUNT]
     FROM Employees
     GROUP BY LastName
     HAVING COUNT(1) > 1)
    SELECT E.*, T1.[COUNT]
    FROM Employees E
    INNER JOIN T1 ON T1.LastName = E.LastName

User · Answer

The following code runs as written:

    SELECT abnno, COUNT(abnno)
    FROM tbl_Name
    GROUP BY abnno
    HAVING COUNT(abnno) > 1
