Finding duplicate rows in SQL Server

Question

I have a SQL Server database of organizations, and there are many duplicate rows. I want to run a select statement to grab all of these and the number of dupes, but also return the ids that are associated with each organization.

A statement like:

    SELECT     orgName, COUNT(*) AS dupes
    FROM       organizations
    GROUP BY   orgName
    HAVING     (COUNT(*) > 1)

will return something like:

    orgName        | dupes
    ABC Corp       | 7
    Foo Federation | 5
    Widget Company | 2

But I'd also like to grab the IDs of them. Is there any way to do this? Maybe like:

    orgName        | dupeCount | id
    ABC Corp       | 1         | 34
    ABC Corp       | 2         | 5
    ...
    Widget Company | 1         | 10
    Widget Company | 2         | 2

The reason is that there is also a separate table of users that link to these organizations, and I would like to unify them (that is, remove dupes so the users link to the same organization instead of dupe orgs). I would like to do part of this manually so I don't screw anything up, but I would still need a statement returning the IDs of all the dupe orgs so I can go through the list of users.

User · Accepted Answer

    select o.orgName, oc.dupeCount, o.id
    from organizations o
    inner join (
        SELECT orgName, COUNT(*) AS dupeCount
        FROM organizations
        GROUP BY orgName
        HAVING COUNT(*) > 1
    ) oc on o.orgName = oc.orgName
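To see what the join above returns, here is a minimal sketch run against made-up rows. The table variable `@organizations` and its five sample rows are assumptions for illustration only (the ids are borrowed from the question's example output); the multi-row `VALUES` syntax needs SQL Server 2008 or later.

```sql
-- Hypothetical sample data standing in for the real organizations table.
DECLARE @organizations TABLE (id INT PRIMARY KEY, orgName NVARCHAR(100));

INSERT INTO @organizations (id, orgName) VALUES
    (2,  N'Widget Company'),
    (5,  N'ABC Corp'),
    (10, N'Widget Company'),
    (34, N'ABC Corp'),
    (41, N'Solo Org');          -- not duplicated, so it should be filtered out

-- Same shape as the query above: count per name, then join back to get each id.
SELECT o.orgName, oc.dupeCount, o.id
FROM @organizations AS o
INNER JOIN (
    SELECT orgName, COUNT(*) AS dupeCount
    FROM @organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) AS oc ON o.orgName = oc.orgName
ORDER BY o.orgName, o.id;
```

With this sample data the result has four rows: ids 5 and 34 for ABC Corp and ids 2 and 10 for Widget Company, each with a dupeCount of 2. Note that dupeCount here is the total per name, not a running number per row.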

User · Answer

To get duplicate data in a table:

    SELECT COUNT(EmpCode), EmpCode
    FROM tbl_Employees
    WHERE Status = 1
    GROUP BY EmpCode
    HAVING COUNT(EmpCode) > 1

User · Answer

    select * from Employees

For finding duplicate records:

1) Using a CTE

    with mycte as (
        select Name, EmailId,
               ROW_NUMBER() over (partition by Name, EmailId order by id) as Duplicate
        from Employees
    )
    select * from mycte

2) Using GROUP BY

    select Name, EmailId, COUNT(name) as Duplicate
    from Employees
    group by Name, EmailId

User · Answer

    Select * from (
        Select orgName, id,
               ROW_NUMBER() OVER (Partition By OrgName ORDER by id DESC) Rownum
        From organizations
    ) tbl
    Where Rownum > 1

So the records with Rownum > 1 will be the duplicate records in your table. "Partition by" first groups the records and then numbers them serially within each group. Rows with Rownum > 1 are therefore the duplicates, which could be deleted as such.
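The same windowing idea can produce the layout the question asked for: a running number per duplicate next to each id. This is only a sketch: the table variable and its rows are invented for illustration, and the column names `dupeNumber` and `dupeTotal` are my own. `COUNT(*) OVER (PARTITION BY ...)` is used here to get the group size without a separate `GROUP BY`.

```sql
-- Hypothetical sample data standing in for the real organizations table.
DECLARE @organizations TABLE (id INT PRIMARY KEY, orgName NVARCHAR(100));

INSERT INTO @organizations (id, orgName) VALUES
    (2,  N'Widget Company'),
    (5,  N'ABC Corp'),
    (10, N'Widget Company'),
    (34, N'ABC Corp'),
    (41, N'Solo Org');

SELECT orgName, dupeNumber, dupeTotal, id
FROM (
    SELECT orgName,
           id,
           -- 1, 2, 3, ... restarting for every distinct orgName
           ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS dupeNumber,
           -- how many rows share this orgName
           COUNT(*) OVER (PARTITION BY orgName) AS dupeTotal
    FROM @organizations
) AS numbered
WHERE dupeTotal > 1            -- window functions cannot appear in WHERE directly
ORDER BY orgName, dupeNumber;
```

Filtering on `dupeTotal > 1` keeps every row of a duplicated name, including the first one, whereas filtering on `dupeNumber > 1` would keep only the rows that are candidates for removal.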

User · Answer

You can do it like this:

    SELECT
        o.id, o.orgName, d.intCount
    FROM (
        SELECT orgName, COUNT(*) as intCount
        FROM organizations
        GROUP BY orgName
        HAVING COUNT(*) > 1
    ) AS d
        INNER JOIN organizations o ON o.orgName = d.orgName

If you want to return just the records that can be deleted (leaving one of each), you can use:

    SELECT
        id, orgName
    FROM (
        SELECT
            orgName, id,
            ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS intRow
        FROM organizations
    ) AS d
    WHERE intRow != 1

Edit: SQL Server 2000 doesn't have the ROW_NUMBER() function. Instead, you can use:

    SELECT
        o.id, o.orgName, d.intCount
    FROM (
        SELECT orgName, COUNT(*) as intCount, MIN(id) AS minId
        FROM organizations
        GROUP BY orgName
        HAVING COUNT(*) > 1
    ) AS d
        INNER JOIN organizations o ON o.orgName = d.orgName
    WHERE d.minId != o.id

User · Answer

The solution marked as correct didn't work for me, but I found this answer that worked just great: Get list of duplicate rows in MySql

    SELECT n1.*
    FROM myTable n1
    INNER JOIN myTable n2
        ON n2.repeatedCol = n1.repeatedCol
    WHERE n1.id <> n2.id

User · Answer

    select column_name, count(column_name)
    from table_name
    group by column_name
    having count(column_name) > 1;

Src: https://stackoverflow.com/a/59242/1465252

User · Answer

I think I know what you need. I had to mix the other answers together, and I think I got the solution he wanted:

    select o.id, o.orgName, oc.dupeCount, oc.id, oc.orgName
    from organizations o
    inner join (
        SELECT MAX(id) as id, orgName, COUNT(*) AS dupeCount
        FROM organizations
        GROUP BY orgName
        HAVING COUNT(*) > 1
    ) oc on o.orgName = oc.orgName

Having the max id gives you the id of the duplicate alongside the id of the original, which is what he asked for:

    id, org name, duplicate count (left out in this case)
    id, duplicate org name, duplicate count (left out again because it does not help in this case)

The only sad thing is that the output comes in this form:

    id, name, dupid, name

Hope it still helps.

User · Answer

You can try this:

    WITH CTE AS
    (
        SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY orgName DESC)
        FROM organizations
    )
    select * from CTE where RN > 1
    go

User · Answer

There are several ways to select duplicate rows. For my solutions, first consider this example table:

    CREATE TABLE #Employee
    (
        ID          INT,
        FIRST_NAME  NVARCHAR(100),
        LAST_NAME   NVARCHAR(300)
    )

    INSERT INTO #Employee VALUES (1, 'Ardalan', 'Shahgholi');
    INSERT INTO #Employee VALUES (2, 'name1', 'lname1');
    INSERT INTO #Employee VALUES (3, 'name2', 'lname2');
    INSERT INTO #Employee VALUES (2, 'name1', 'lname1');
    INSERT INTO #Employee VALUES (3, 'name2', 'lname2');
    INSERT INTO #Employee VALUES (4, 'name3', 'lname3');

First solution:

    SELECT DISTINCT *
    FROM   #Employee;

    WITH DeleteEmployee AS (
        SELECT ROW_NUMBER()
               OVER (PARTITION BY ID, First_Name, Last_Name ORDER BY ID) AS RNUM
        FROM   #Employee
    )
    SELECT *
    FROM   DeleteEmployee
    WHERE  RNUM > 1

    SELECT DISTINCT *
    FROM   #Employee

Second solution: use an identity field.

    SELECT DISTINCT *
    FROM   #Employee;

    ALTER TABLE #Employee ADD UNIQ_ID INT IDENTITY(1, 1)

    SELECT *
    FROM   #Employee
    WHERE  UNIQ_ID < (
        SELECT MAX(UNIQ_ID)
        FROM   #Employee a2
        WHERE  #Employee.ID = a2.ID
               AND #Employee.FIRST_NAME = a2.FIRST_NAME
               AND #Employee.LAST_NAME = a2.LAST_NAME
    )

    ALTER TABLE #Employee DROP COLUMN UNIQ_ID

    SELECT DISTINCT *
    FROM   #Employee

At the end of either solution, clean up with this command:

    DROP TABLE #Employee

User · Answer

Try:

    SELECT orgName, id, count(*) as dupes
    FROM organizations
    GROUP BY orgName, id
    HAVING count(*) > 1

User · Answer

Suppose we have the table 'Student' with 2 columns:

    student_id int
    student_name varchar

Records:

    +------------+---------------------+
    | student_id | student_name        |
    +------------+---------------------+
    |        101 | usman               |
    |        101 | usman               |
    |        101 | usman               |
    |        102 | usmanyaqoob         |
    |        103 | muhammadusmanyaqoob |
    |        103 | muhammadusmanyaqoob |
    +------------+---------------------+

Now we want to see the duplicate records. Use this query:

    select student_name, student_id, count(*) c
    from student
    group by student_id, student_name
    having c > 1;

    +---------------------+------------+---+
    | student_name        | student_id | c |
    +---------------------+------------+---+
    | usman               |        101 | 3 |
    | muhammadusmanyaqoob |        103 | 2 |
    +---------------------+------------+---+

User · Answer

    select a.orgName, b.duplicate, a.id
    from organizations a
    inner join (
        SELECT orgName, COUNT(*) AS duplicate
        FROM organizations
        GROUP BY orgName
        HAVING COUNT(*) > 1
    ) b on a.orgName = b.orgName
    group by a.orgName, b.duplicate, a.id

User · Answer

If you want to delete duplicates:

    WITH CTE AS
    (
        SELECT orgName, id,
               RN = ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY Id)
        FROM organizations
    )
    DELETE FROM CTE WHERE RN > 1
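Since the question mentions a users table that points at these organizations, it is probably worth re-pointing users before deleting anything. The following is a rough sketch under several assumptions: the temp tables `#organizations` and `#users`, the column `orgId`, and all sample rows are hypothetical stand-ins, and the lowest id per name is arbitrarily chosen as the row to keep. Everything runs inside a transaction that is rolled back, so the effect can be inspected first.

```sql
-- Hypothetical stand-ins for the real tables; users.orgId is an assumed column name.
CREATE TABLE #organizations (id INT PRIMARY KEY, orgName NVARCHAR(100));
CREATE TABLE #users (userId INT PRIMARY KEY, orgId INT);

INSERT INTO #organizations (id, orgName) VALUES
    (2, N'Widget Company'), (5, N'ABC Corp'),
    (10, N'Widget Company'), (34, N'ABC Corp');
INSERT INTO #users (userId, orgId) VALUES (1, 34), (2, 5), (3, 10);

BEGIN TRANSACTION;

-- Step 1: point each user at the lowest id that shares the organization's name.
WITH keepers AS (
    SELECT id, MIN(id) OVER (PARTITION BY orgName) AS keepId
    FROM #organizations
)
UPDATE u
SET u.orgId = k.keepId
FROM #users AS u
INNER JOIN keepers AS k ON k.id = u.orgId
WHERE k.id <> k.keepId;

-- Step 2: remove every row except the first (lowest id) per name.
WITH CTE AS (
    SELECT id, RN = ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id)
    FROM #organizations
)
DELETE FROM CTE WHERE RN > 1;

-- Step 3: inspect the outcome before deciding.
SELECT * FROM #users ORDER BY userId;
SELECT * FROM #organizations ORDER BY id;

ROLLBACK TRANSACTION;   -- replace with COMMIT TRANSACTION once the result looks right

DROP TABLE #users;
DROP TABLE #organizations;
```

Ordering by id in step 2 matches the `MIN(id)` choice in step 1, so the row that users are re-pointed to is the same row that survives the delete.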

User · Answer

I found a better option to get the duplicate records in a table:

    SELECT x.studid, y.stdname, y.dupecount
    FROM student AS x
    INNER JOIN (
        SELECT a.stdname, COUNT(*) AS dupecount
        FROM student AS a
        INNER JOIN studmisc AS b ON a.studid = b.studid
        WHERE (a.studid LIKE '2018%') AND (b.studstatus = 4)
        GROUP BY a.stdname
        HAVING (COUNT(*) > 1)
    ) AS y ON x.stdname = y.stdname
    INNER JOIN studmisc AS z ON x.studid = z.studid
    WHERE (x.studid LIKE '2018%') AND (z.studstatus = 4)
    ORDER BY x.stdname

The result of the above query shows all the duplicate names with their unique student ids and the number of duplicate occurrences.

User · Answer

You can run the following query to find the duplicates with the max id, and delete those rows:

    SELECT orgName, COUNT(*), Max(ID) AS dupes
    FROM organizations
    GROUP BY orgName
    HAVING (COUNT(*) > 1)

But you'll have to run this query a few times.

User · Answer

    select orgname, count(*) as dupes, id
    from organizations
    where orgname in (
        select orgname
        from organizations
        group by orgname
        having (count(*) > 1)
    )
    group by orgname, id

User · Answer

I use two methods to find duplicate rows. The first is the best-known one, using GROUP BY and HAVING. The second uses a CTE (Common Table Expression).

As mentioned by @RedFilter, this way is also right. I often find the CTE method useful as well.

    WITH TempOrg (orgName, RepeatCount)
    AS
    (
        SELECT orgName,
               ROW_NUMBER() OVER (PARTITION by orgName ORDER BY orgName) AS RepeatCount
        FROM dbo.organizations
    )
    select t.*, e.id
    from organizations e
    inner join TempOrg t on t.orgName = e.orgName
    where t.RepeatCount > 1

In the example above, we find repeat occurrences using ROW_NUMBER and PARTITION BY, then apply a WHERE clause to select only the rows whose repeat count is greater than 1. The result is collected in the CTE and joined with the organizations table.

Source: CodoBee
