Finding duplicate values in a SQL table

Question

It's easy to find duplicates with one field:

    SELECT name, COUNT(email)
    FROM users
    GROUP BY email
    HAVING COUNT(email) > 1

So if we have a table

    ID   NAME   EMAIL
    1    John   asd@asd.com
    2    Sam    asd@asd.com
    3    Tom    asd@asd.com
    4    Bob    bob@asd.com
    5    Tom    asd@asd.com

This query will give us John, Sam, Tom, Tom because they all have the same email.

However, what I want is to get duplicates with the same email and name. That is, I want to get "Tom", "Tom".

The reason I need this: I made a mistake and allowed duplicate name and email values to be inserted. Now I need to remove/change the duplicates, so I need to find them first.

User · Accepted Answer

    SELECT
        name, email, COUNT(*)
    FROM
        users
    GROUP BY
        name, email
    HAVING
        COUNT(*) > 1

Simply group on both of the columns.

Note: the older ANSI standard is to have all non-aggregated columns in the GROUP BY, but this has changed with the idea of "functional dependency":

"In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation."

Support is not consistent:

Recent PostgreSQL supports it.
SQL Server (as at SQL Server 2017) still requires all non-aggregated columns in the GROUP BY.
MySQL is unpredictable and you need sql_mode=only_full_group_by; see "GROUP BY lname ORDER BY showing wrong results" and "Which is the least expensive aggregate function in the absence of ANY()" (see comments in accepted answer).
Oracle isn't mainstream enough (warning: humour, I don't know about Oracle).
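The grouping above can be tried on the sample rows from the question. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in engine (an assumption, the thread does not name a database), and the function name find_duplicate_pairs is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "John", "asd@asd.com"),
    (2, "Sam", "asd@asd.com"),
    (3, "Tom", "asd@asd.com"),
    (4, "Bob", "bob@asd.com"),
    (5, "Tom", "asd@asd.com"),
]


def find_duplicate_pairs(rows):
    """Return (name, email, count) for every pair that occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT name, email, COUNT(*)
        FROM users
        GROUP BY name, email
        HAVING COUNT(*) > 1
        """
    ).fetchall()
    conn.close()
    return result


print(find_duplicate_pairs(ROWS))  # [('Tom', 'asd@asd.com', 2)]
```

Because every selected column is also in the GROUP BY, this form does not rely on functional-dependency support and should behave the same in stricter engines.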

User · Answer

To delete records whose names are duplicate:

    WITH CTE AS
    (
        SELECT ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) AS T
        FROM YourTable
    )
    DELETE FROM CTE WHERE T > 1

User · Answer

Try the following:

    SELECT * FROM
    (
        SELECT Id, Name, Age, Comments,
               Row_Number() OVER (PARTITION BY Name, Age ORDER BY Name) AS Rank
        FROM Customers
    ) AS B
    WHERE Rank > 1

User · Answer

    select id, name, COUNT(*)
    from user
    group by Id, Name
    having COUNT(*) > 1

User · Answer

    SELECT name, email
    FROM users
    WHERE email in
        (SELECT email FROM users
         GROUP BY email
         HAVING COUNT(*) > 1)

User · Answer

If you want to find duplicate data (by one or several criteria) and select the actual rows:

    with MYCTE as (
        SELECT DuplicateKey1
              ,DuplicateKey2 --optional
              ,count(*) X
        FROM MyTable
        group by DuplicateKey1, DuplicateKey2
        having count(*) > 1
    )
    SELECT E.*
    FROM MyTable E
    JOIN MYCTE cte
      ON E.DuplicateKey1 = cte.DuplicateKey1
     AND E.DuplicateKey2 = cte.DuplicateKey2
    ORDER BY E.DuplicateKey1, E.DuplicateKey2, CreatedAt

http://developer.azurewebsites.net/2014/09/better-sql-group-by-find-duplicate-data

User · Answer

In case you work with Microsoft Access, this way works:

    CREATE TABLE users (id int, name varchar(10), email varchar(50));

    INSERT INTO users VALUES (1, 'John', 'asd@asd.com');
    INSERT INTO users VALUES (2, 'Sam', 'asd@asd.com');
    INSERT INTO users VALUES (3, 'Tom', 'asd@asd.com');
    INSERT INTO users VALUES (4, 'Bob', 'bob@asd.com');
    INSERT INTO users VALUES (5, 'Tom', 'asd@asd.com');

    SELECT name, email, COUNT(*) AS CountOf
    FROM users
    GROUP BY name, email
    HAVING COUNT(*) > 1;

    DELETE *
    FROM users
    WHERE id IN (
        SELECT u1.id
        FROM users u1, users u2
        WHERE u1.name = u2.name AND u1.email = u2.email AND u1.id > u2.id
    );

Thanks to Tancrede Chazallet for the delete code.

User · Answer

How to get duplicate records in a table:

    SELECT COUNT(EmpCode), EmpCode
    FROM tbl_Employees
    WHERE Status = 1
    GROUP BY EmpCode
    HAVING COUNT(EmpCode) > 1

User · Answer

    select name, email,
        case
            when ROW_NUMBER() over (partition by name, email order by name) > 1 then 'Yes'
            else 'No'
        end as duplicated
    from users

User · Answer

    SELECT name, email, COUNT(email)
    FROM users
    WHERE email IN (
        SELECT email
        FROM users
        GROUP BY email
        HAVING COUNT(email) > 1
    )

User · Answer

You can use the SELECT DISTINCT keyword to get rid of duplicates in the result set. You can also filter by name to get everyone with that name in a table.
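A small sketch of both suggestions, run through Python's sqlite3 module on the question's sample rows (the engine choice and the helper name load_users are assumptions for illustration). Note that DISTINCT only collapses repeats in the output; it does not say which rows were repeated.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "John", "asd@asd.com"),
    (2, "Sam", "asd@asd.com"),
    (3, "Tom", "asd@asd.com"),
    (4, "Bob", "bob@asd.com"),
    (5, "Tom", "asd@asd.com"),
]


def load_users(rows):
    """Create an in-memory users table holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    return conn


conn = load_users(ROWS)

# Each (name, email) pair appears once, however often it occurs in the table.
unique_pairs = conn.execute(
    "SELECT DISTINCT name, email FROM users ORDER BY name"
).fetchall()

# Filtering by name lists every row carrying that name.
toms = [row[0] for row in conn.execute(
    "SELECT id FROM users WHERE name = ? ORDER BY id", ("Tom",)
)]
conn.close()

print(unique_pairs)
print(toms)  # [3, 5]
```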

User · Answer

You may want to try this:

    SELECT NAME, EMAIL, COUNT(*)
    FROM USERS
    GROUP BY 1, 2
    HAVING COUNT(*) > 1

User · Answer

To check for duplicate records in a table:

    select * from users s
    where rowid < any
        (select rowid from users k where s.name = k.name and s.email = k.email);

or

    select * from users s
    where rowid not in
        (select max(rowid) from users k where s.name = k.name and s.email = k.email);

To delete the duplicate records in a table:

    delete from users s
    where rowid < any
        (select rowid from users k where s.name = k.name and s.email = k.email);

or

    delete from users s
    where rowid not in
        (select max(rowid) from users k where s.name = k.name and s.email = k.email);

User · Answer

try this:

    declare @YourTable table (id int, name varchar(10), email varchar(50))

    INSERT @YourTable VALUES (1,'John','John-email')
    INSERT @YourTable VALUES (2,'John','John-email')
    INSERT @YourTable VALUES (3,'fred','John-email')
    INSERT @YourTable VALUES (4,'fred','fred-email')
    INSERT @YourTable VALUES (5,'sam','sam-email')
    INSERT @YourTable VALUES (6,'sam','sam-email')

    SELECT
        name,email, COUNT(*) AS CountOf
        FROM @YourTable
        GROUP BY name,email
        HAVING COUNT(*)>1

OUTPUT:

    name       email       CountOf
    ---------- ----------- -----------
    John       John-email  2
    sam        sam-email   2

    (2 row(s) affected)

if you want the IDs of the dups use this:

    SELECT
        y.id,y.name,y.email
        FROM @YourTable y
            INNER JOIN (SELECT
                            name,email, COUNT(*) AS CountOf
                            FROM @YourTable
                            GROUP BY name,email
                            HAVING COUNT(*)>1
                        ) dt ON y.name=dt.name AND y.email=dt.email

OUTPUT:

    id          name       email
    ----------- ---------- ------------
    1           John       John-email
    2           John       John-email
    5           sam        sam-email
    6           sam        sam-email

    (4 row(s) affected)

to delete the duplicates try:

    DELETE d
        FROM @YourTable d
            INNER JOIN (SELECT
                            y.id,y.name,y.email,
                            ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank
                            FROM @YourTable y
                                INNER JOIN (SELECT
                                                name,email, COUNT(*) AS CountOf
                                                FROM @YourTable
                                                GROUP BY name,email
                                                HAVING COUNT(*)>1
                                            ) dt ON y.name=dt.name AND y.email=dt.email
                       ) dt2 ON d.id=dt2.id
        WHERE dt2.RowRank!=1
    SELECT * FROM @YourTable

OUTPUT:

    id          name       email
    ----------- ---------- --------------
    1           John       John-email
    3           fred       John-email
    4           fred       fred-email
    5           sam        sam-email

    (4 row(s) affected)

User · Answer

How can we count the duplicated values? Whether a value is repeated 2 times or more than 2 times, just count them, not group-wise. As simple as:

    select COUNT(distinct col_01) from Table_01
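It may help to see what the different counts return on the question's sample rows. The sketch below uses Python's sqlite3 module as a stand-in engine and the email column in place of col_01 (both are assumptions). On its own, COUNT(DISTINCT ...) gives the number of distinct values; subtracting it from COUNT(*) gives the number of surplus rows, and a grouped subquery gives the number of values that occur more than once.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "John", "asd@asd.com"),
    (2, "Sam", "asd@asd.com"),
    (3, "Tom", "asd@asd.com"),
    (4, "Bob", "bob@asd.com"),
    (5, "Tom", "asd@asd.com"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", ROWS)

# Total rows, distinct emails, and rows beyond the first of each email.
total, distinct, surplus = conn.execute(
    """
    SELECT COUNT(*),
           COUNT(DISTINCT email),
           COUNT(*) - COUNT(DISTINCT email)
    FROM users
    """
).fetchone()

# Number of email values that occur more than once.
duplicated_values = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT email FROM users GROUP BY email HAVING COUNT(*) > 1)
    """
).fetchone()[0]
conn.close()

print(total, distinct, surplus, duplicated_values)  # 5 2 3 1
```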

User · Answer

A little late to the party, but I found a really cool workaround for finding all duplicate IDs:

    SELECT GROUP_CONCAT(id)
    FROM users
    GROUP BY email
    HAVING COUNT(email) > 1

User · Answer

In case you work with Oracle, this way would be preferable:

    create table my_users(id number, name varchar2(100), email varchar2(100));

    insert into my_users values (1, 'John', 'asd@asd.com');
    insert into my_users values (2, 'Sam', 'asd@asd.com');
    insert into my_users values (3, 'Tom', 'asd@asd.com');
    insert into my_users values (4, 'Bob', 'bob@asd.com');
    insert into my_users values (5, 'Tom', 'asd@asd.com');

    commit;

    select *
      from my_users
     where rowid not in (select min(rowid) from my_users group by name, email);

User · Answer

If you want to delete the duplicates, here's a much simpler way to do it than having to find even/odd rows in a triple sub-select:

    SELECT u.id, u.name, u.email
    FROM users u, users u2
    WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id

And so to delete:

    DELETE FROM users
    WHERE id IN (
        SELECT u.id
        FROM users u, users u2
        WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id
    )

Much easier to read and understand IMHO.

Note: if id is unique, one pass should be enough, because every row that has a lower-id twin matches the condition and only the lowest id of each group survives. If in doubt, execute the request again until no rows are deleted.
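How many passes the delete needs can be checked on a small table. The sketch below is an assumption-laden experiment rather than a general guarantee: it uses Python's sqlite3 module, a made-up table with a group of three identical name/email rows, and a hypothetical helper delete_pass that runs the self-join delete once and reports how many rows went away.

```python
import sqlite3

# Hypothetical rows: ids 1-3 share name and email, id 4 is unique.
ROWS = [
    (1, "Tom", "tom@example.com"),
    (2, "Tom", "tom@example.com"),
    (3, "Tom", "tom@example.com"),
    (4, "Bob", "bob@example.com"),
]


def delete_pass(conn):
    """Run the self-join delete once and return the number of rows removed."""
    cur = conn.execute(
        """
        DELETE FROM users
        WHERE id IN (
            SELECT u.id
            FROM users u, users u2
            WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id
        )
        """
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", ROWS)

first_pass = delete_pass(conn)
second_pass = delete_pass(conn)
remaining = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
conn.close()

print(first_pass, second_pass, remaining)  # 2 0 [1, 4]
```

With unique ids, the first pass removes ids 2 and 3 together and the second pass finds nothing left to delete. Other engines may restrict deleting from a table that is also read in the subquery, so treat this as a check of the logic only.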

User · Answer

This should also work, maybe give it a try:

    Select * from Users a
    where EXISTS (Select * from Users b
                  where (a.name = b.name
                         OR a.email = b.email)
                    and a.ID != b.id)

This is especially good in your case if you search for duplicates that differ by some kind of prefix or general change, e.g. a new domain in the mail address; then you can use replace() on these columns.

User · Answer

This is the easy thing I've come up with. It uses a common table expression (CTE) and a partition window (I think these features are in SQL 2008 and later).

This example finds all students with duplicate name and dob. The fields you want to check for duplication go in the OVER clause. You can include any other fields you want in the projection.

    with cte (StudentId, Fname, LName, DOB, RowCnt)
    as (
        SELECT StudentId, FirstName, LastName, DateOfBirth as DOB,
               SUM(1) OVER (Partition By FirstName, LastName, DateOfBirth) as RowCnt
        FROM tblStudent
    )
    SELECT * from CTE where RowCnt > 1
    ORDER BY DOB, LName

User · Answer

The exact code would differ depending on whether you want to find duplicate rows as well or only different ids with the same email and name. If id is a primary key or otherwise has a unique constraint, this distinction does not exist, but the question does not specify this. In the former case you can use code given in several other answers:

    SELECT name, email, COUNT(*)
    FROM users
    GROUP BY name, email
    HAVING COUNT(*) > 1

In the latter case you would use:

    SELECT name, email, COUNT(DISTINCT id)
    FROM users
    GROUP BY name, email
    HAVING COUNT(DISTINCT id) > 1
    ORDER BY COUNT(DISTINCT id) DESC
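The difference between the two cases only shows up when id is not unique. A sketch with made-up rows, using Python's sqlite3 module as a stand-in engine (an assumption): Tom appears twice with the same id (a fully duplicated row), while Sam appears under two different ids.

```python
import sqlite3

# Hypothetical rows with no unique constraint on id.
ROWS = [
    (1, "Tom", "tom@example.com"),
    (1, "Tom", "tom@example.com"),  # exact duplicate row, same id
    (2, "Sam", "sam@example.com"),
    (3, "Sam", "sam@example.com"),  # same name and email, different id
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", ROWS)

# Former case: counts rows, so exact duplicate rows are reported too.
by_rows = conn.execute(
    """
    SELECT name, email, COUNT(*)
    FROM users
    GROUP BY name, email
    HAVING COUNT(*) > 1
    ORDER BY name
    """
).fetchall()

# Latter case: counts distinct ids, so only pairs spread over several ids remain.
by_ids = conn.execute(
    """
    SELECT name, email, COUNT(DISTINCT id)
    FROM users
    GROUP BY name, email
    HAVING COUNT(DISTINCT id) > 1
    ORDER BY COUNT(DISTINCT id) DESC
    """
).fetchall()
conn.close()

print(by_rows)  # [('Sam', 'sam@example.com', 2), ('Tom', 'tom@example.com', 2)]
print(by_ids)   # [('Sam', 'sam@example.com', 2)]
```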

User · Answer

Another easy way: you can try this using an analytic function as well:

    SELECT * from
    (
        SELECT name, email,
               COUNT(name) OVER (PARTITION BY name, email) cnt
        FROM users
    )
    WHERE cnt > 1

User · Answer

    SELECT id, COUNT(id)
    FROM table1
    GROUP BY id
    HAVING COUNT(id) > 1;

I think this will work properly to search for repeated values in a particular column.

User · Answer

This selects/deletes all duplicate records except one record from each group of duplicates. So, the delete leaves all unique records plus one record from each group of duplicates.

Select duplicates:

    SELECT *
    FROM table
    WHERE
        id NOT IN (
            SELECT MIN(id)
            FROM table
            GROUP BY column1, column2
        );

Delete duplicates:

    DELETE FROM table
    WHERE
        id NOT IN (
            SELECT MIN(id)
            FROM table
            GROUP BY column1, column2
        );

Be aware of larger amounts of records; it can cause performance problems.
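The keep-the-lowest-id approach can be tried on the sample rows from the question. This is a sketch only: it runs through Python's sqlite3 module (an assumption about the engine), uses the users table with name and email standing in for column1 and column2, and the function name remove_duplicates is hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "John", "asd@asd.com"),
    (2, "Sam", "asd@asd.com"),
    (3, "Tom", "asd@asd.com"),
    (4, "Bob", "bob@asd.com"),
    (5, "Tom", "asd@asd.com"),
]


def remove_duplicates(rows):
    """Delete all but the lowest id per (name, email) and return surviving ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    conn.execute(
        """
        DELETE FROM users
        WHERE id NOT IN (
            SELECT MIN(id)
            FROM users
            GROUP BY name, email
        )
        """
    )
    survivors = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
    conn.close()
    return survivors


print(remove_duplicates(ROWS))  # [1, 2, 3, 4]
```

Only id 5, the second Tom, is removed; rows that merely share an email keep their place because the grouping is on both columns.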

User · Answer

We can use HAVING here, which works on aggregate functions, as shown below:

    create table #TableB (id_account int, data int, [date] date)

    insert into #TableB values (1, -50, '10/20/2018'),
    (1, 20, '10/09/2018'),
    (2, -900, '10/01/2018'),
    (1, 20, '09/25/2018'),
    (1, -100, '08/01/2018')

    SELECT id_account, data, COUNT(*)
    FROM #TableB
    GROUP BY id_account, data
    HAVING COUNT(id_account) > 1

    drop table #TableB

Here the two fields id_account and data are used with COUNT(*), so it will return all the records that have the same values in both columns more than once.

If for some reason we mistakenly forgot to add constraints to the SQL Server table and the front-end application inserted records that are duplicated in all columns, we can use the query below to delete the duplicates from the table:

    SELECT DISTINCT * INTO #TemNewTable FROM #OriginalTable
    TRUNCATE TABLE #OriginalTable
    INSERT INTO #OriginalTable SELECT * FROM #TemNewTable
    DROP TABLE #TemNewTable

Here we have taken all the distinct records of the original table and deleted the records of the original table. Then we inserted all the distinct values from the new table back into the original table and dropped the new table.

User · Answer

Table structure:

    ID   NAME   EMAIL
    1    John   asd@asd.com
    2    Sam    asd@asd.com
    3    Tom    asd@asd.com
    4    Bob    bob@asd.com
    5    Tom    asd@asd.com

Solution 1:

    SELECT *,
           COUNT(*)
    FROM users t1
    INNER JOIN users t2
    WHERE t1.id > t2.id
      AND t1.name = t2.name
      AND t1.email = t2.email

Solution 2:

    SELECT name,
           email,
           COUNT(*)
    FROM users
    GROUP BY name,
             email
    HAVING COUNT(*) > 1

User · Answer

If you wish to see whether there are any duplicate rows in your table, I used the query below:

    create table my_table(id int, name varchar(100), email varchar(100));

    insert into my_table values (1, 'shekh', 'shekh@rms.com');
    insert into my_table values (1, 'shekh', 'shekh@rms.com');
    insert into my_table values (2, 'Aman', 'aman@rms.com');
    insert into my_table values (3, 'Tom', 'tom@rms.com');
    insert into my_table values (4, 'Raj', 'raj@rms.com');

    Select COUNT(1) As Total_Rows from my_table
    Select Count(1) As Distinct_Rows from ( Select Distinct * from my_table) abc

User · Answer

    SELECT * FROM users u
    where rowid = (select max(rowid) from users u1 where u.email = u1.email);

User · Answer

Try this:

    SELECT name, email
    FROM users
    GROUP BY name, email
    HAVING (COUNT(*) > 1)

User · Answer

    SELECT column_name, COUNT(*)
    FROM TABLE_NAME
    GROUP BY column_name
    HAVING COUNT(*) > 1;

User · Answer

By using a CTE we can also find duplicate values, like this:

    with MyCTE
    as
    (
        select Name, EmailId,
               ROW_NUMBER() over (PARTITION BY EmailId order by id) as Duplicate
        from [Employees]
    )
    select * from MyCTE where Duplicate > 1

User · Answer

    select emp.ename, emp.empno, dept.loc
    from emp
    inner join dept
        on dept.deptno = emp.deptno
    inner join
        (select ename, count(*)
         from emp
         group by ename, deptno
         having count(*) > 1) t
        on emp.ename = t.ename
    order by emp.ename

User · Answer

try this code:

    WITH CTE AS
    (
        SELECT Id, Name, Age, Comments,
               RN = ROW_NUMBER() OVER (PARTITION BY Name, Age ORDER BY ccn)
        FROM ccnmaster
    )
    select * from CTE

User · Answer

The most important thing here is to have the fastest function. Also, the indices of duplicates should be identified. A self join is a good option, but for a faster function it is better to first find the rows that have duplicates and then join with the original table to find the ids of the duplicated rows. Finally, order by any column except id to keep duplicated rows near each other.

    SELECT u.*
    FROM users AS u
    JOIN (SELECT username, email
          FROM users
          GROUP BY username, email
          HAVING COUNT(*) > 1) AS w
      ON u.username = w.username AND u.email = w.email
    ORDER BY u.email;
