Delete duplicate records from a SQL table without a primary key

Question

I have the table below, with the following records in it:

    create table employee
    (
        EmpId number,
        EmpName varchar2(10),
        EmpSSN varchar2(11)
    );

    insert into employee values(1, 'Jack', '555-55-5555');
    insert into employee values(2, 'Joe', '555-56-5555');
    insert into employee values(3, 'Fred', '555-57-5555');
    insert into employee values(4, 'Mike', '555-58-5555');
    insert into employee values(5, 'Cathy', '555-59-5555');
    insert into employee values(6, 'Lisa', '555-70-5555');
    insert into employee values(1, 'Jack', '555-55-5555');
    insert into employee values(4, 'Mike', '555-58-5555');
    insert into employee values(5, 'Cathy', '555-59-5555');
    insert into employee values(6, 'Lisa', '555-70-5555');
    insert into employee values(5, 'Cathy', '555-59-5555');
    insert into employee values(6, 'Lisa', '555-70-5555');

I don't have a primary key on this table, but the records above are already in it. I want to remove the duplicate records that have the same value in the EmpId and EmpSSN fields (for example, EmpId 5). Can anyone help me frame a query to delete those duplicate records?

Thanks in advance.

User · Answer

Suppose a table has two columns, Id and Name, where names repeat with different Ids. In that case you can use this query, which keeps the lowest Id for each Name:

    DELETE FROM dbo.tbl1
    WHERE Id NOT IN (
        SELECT MIN(Id) AS namecount
        FROM dbo.tbl1
        GROUP BY Name
    )

User · Answer

1. Add a primary key (code below).
2. Run the correct delete (code below).
3. Consider WHY you wouldn't want to keep that primary key.

    -- Assuming MSSQL or compatible
    ALTER TABLE Employee ADD EmployeeID int identity(1,1) PRIMARY KEY;

    WHILE EXISTS (SELECT COUNT(*) FROM Employee GROUP BY EmpID, EmpSSN HAVING COUNT(*) > 1)
    BEGIN
        DELETE FROM Employee WHERE EmployeeID IN
        (
            SELECT MIN(EmployeeID) AS [DeleteID]
            FROM Employee
            GROUP BY EmpID, EmpSSN
            HAVING COUNT(*) > 1
        )
    END
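A single statement can do the same job as the loop above by keeping the lowest key in each (EmpId, EmpSSN) group and deleting everything else. The sketch below is only an illustration, not the T-SQL from this answer: it assumes Python's built-in sqlite3 module, an in-memory copy of the question's data, and SQLite's implicit rowid as a stand-in for the added identity column. The function name is hypothetical.

```python
import sqlite3

# Sample rows copied from the question (12 rows, 6 distinct employees).
ROWS = [
    (1, "Jack", "555-55-5555"), (2, "Joe", "555-56-5555"),
    (3, "Fred", "555-57-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (1, "Jack", "555-55-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
]


def dedupe_keep_lowest_key(conn):
    """Delete every row except the lowest-rowid row per (EmpId, EmpSSN).

    rowid plays the role of the surrogate key here (an assumption of this
    sketch; in SQL Server it would be the added identity column).
    Returns the number of rows deleted.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            """
            DELETE FROM employee
            WHERE rowid NOT IN (
                SELECT MIN(rowid)
                FROM employee
                GROUP BY EmpId, EmpSSN
            )
            """
        )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", ROWS)

deleted = dedupe_keep_lowest_key(conn)
remaining = conn.execute(
    "SELECT EmpId, EmpName FROM employee ORDER BY EmpId"
).fetchall()
print(deleted, remaining)
```

Because the filter is NOT IN rather than IN, rows that were never duplicated survive too, since each of them is the MIN of its own group.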

User · Answer

I'm not an SQL expert, so bear with me. I'm sure you'll get a better answer soon enough. Here's how you can find the duplicate records:

    select t1.empid, t1.empssn, count(*)
    from employee as t1
    inner join employee as t2 on (t1.empid = t2.empid and t1.empssn = t2.empssn)
    group by t1.empid, t1.empssn
    having count(*) > 1

Deleting them will be trickier, because nothing in the data lets a delete statement tell the duplicates apart. I suspect the answer will involve row_number() or adding an identity column.
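A plain GROUP BY with HAVING is enough to list the duplicated keys. The self-join is not required, and it makes count(*) report n*n for a key that has n copies. Here is a minimal sketch of the simpler form, assuming Python's built-in sqlite3 module and an in-memory copy of the question's data. The function name is made up for the illustration.

```python
import sqlite3

# Sample rows copied from the question (12 rows, 6 distinct employees).
ROWS = [
    (1, "Jack", "555-55-5555"), (2, "Joe", "555-56-5555"),
    (3, "Fred", "555-57-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (1, "Jack", "555-55-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
]


def find_duplicates(conn):
    """Return (EmpId, EmpSSN, copies) for every key stored more than once."""
    return conn.execute(
        """
        SELECT EmpId, EmpSSN, COUNT(*) AS copies
        FROM employee
        GROUP BY EmpId, EmpSSN
        HAVING COUNT(*) > 1
        ORDER BY EmpId
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", ROWS)

duplicates = find_duplicates(conn)
for row in duplicates:
    print(row)
```

With the question's data this reports EmpId 1 and 4 with two copies each, and EmpId 5 and 6 with three copies each.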

User · Answer

    select distinct * into newtablename from oldtablename

Now newtablename will have no duplicate records. Then simply rename newtablename by pressing F2 in Object Explorer in SQL Server.

User · Answer

Use the row number to differentiate between duplicate records. Keep the first row number for an EmpID/EmpSSN and delete the rest:

    DELETE FROM Employee a
     WHERE ROW_NUMBER() <> ( SELECT MIN( ROW_NUMBER() )
                               FROM Employee b
                              WHERE a.EmpID  = b.EmpID
                                AND a.EmpSSN = b.EmpSSN )

User · Answer

    create unique clustered index Employee_idx
    on Employee ( EmpId, EmpSSN )
    with ignore_dup_key

You can drop the index if you don't need it.

User · Answer

    ALTER IGNORE TABLE test
        ADD UNIQUE INDEX `test` (`b`);

Here `b` is the column that must be unique, and `test` is the index name.

User · Answer

Code:

    DELETE DUP
    FROM
    (
        SELECT ROW_NUMBER() OVER (PARTITION BY Clientid ORDER BY Clientid) AS Val
        FROM ClientMaster
    ) DUP
    WHERE DUP.Val > 1

Explanation:

Use an inner query to construct a view over the table that includes a field based on ROW_NUMBER(), partitioned by the columns you want to be unique.

Delete from the results of this inner query, selecting anything that does not have a row number of 1, i.e. the duplicates, not the original.

The ORDER BY clause of the ROW_NUMBER() window function is required for valid syntax; you can put any column name there. If you want to control which row is treated as the original (e.g. keep the earliest or the most recent), then the column(s) used here do matter: specify the order so that the record you want to keep comes first in its partition.
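The same idea can be tried end to end on the question's data. The sketch below makes some assumptions: it uses Python's built-in sqlite3 module with an SQLite library of version 3.25 or later (the first with window functions). SQLite cannot delete through a derived table, so the sketch numbers the rows in a subquery and deletes by the implicit rowid instead. The function name and the rid, rn and numbered aliases are made up for the example.

```python
import sqlite3

# Sample rows copied from the question (12 rows, 6 distinct employees).
ROWS = [
    (1, "Jack", "555-55-5555"), (2, "Joe", "555-56-5555"),
    (3, "Fred", "555-57-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (1, "Jack", "555-55-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
]


def dedupe_with_row_number(conn):
    """Number the rows inside each (EmpId, EmpSSN) partition and delete
    every row whose number is greater than 1."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            """
            DELETE FROM employee
            WHERE rowid IN (
                SELECT rid
                FROM (
                    SELECT rowid AS rid,
                           ROW_NUMBER() OVER (
                               PARTITION BY EmpId, EmpSSN
                               ORDER BY rowid  -- keeps the first-inserted copy
                           ) AS rn
                    FROM employee
                ) AS numbered
                WHERE rn > 1
            )
            """
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", ROWS)

dedupe_with_row_number(conn)
survivors = conn.execute(
    "SELECT EmpId, EmpSSN FROM employee ORDER BY EmpId"
).fetchall()
print(survivors)
```

Changing the ORDER BY inside OVER (for example to rowid DESC) changes which copy gets row number 1 and is therefore kept.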

User · Answer

    delete from employee
    where rowid in (
        select rowid from (
            select rowid, name_count from (
                select rowid, count(emp_name) as name_count
                from employee
                group by emp_id, emp_name
            )
            where name_count > 1
        )
    )

User · Answer

You could create a temporary table #tempemployee containing a select distinct of your employee table. Then delete from employee. Then insert into employee select from #tempemployee.

Like Josh said, even if you know which records are duplicated, deleting them will be impossible, since you cannot actually refer to a specific record if it is an exact duplicate of another record.
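The three steps above can be sketched as follows. This is an illustration under stated assumptions, not SQL Server code: it uses Python's built-in sqlite3 module, an in-memory copy of the question's data, and a temporary table named tempemployee (SQLite has no # prefix for temp tables). The function name is hypothetical.

```python
import sqlite3

# Sample rows copied from the question (12 rows, 6 distinct employees).
ROWS = [
    (1, "Jack", "555-55-5555"), (2, "Joe", "555-56-5555"),
    (3, "Fred", "555-57-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (1, "Jack", "555-55-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
]


def dedupe_via_distinct_copy(conn):
    """Copy the distinct rows aside, empty the table, then copy them back."""
    with conn:  # commits on success, rolls back if an exception is raised
        # 1. Temporary table holding one copy of each distinct row.
        conn.execute(
            "CREATE TEMP TABLE tempemployee AS SELECT DISTINCT * FROM employee"
        )
        # 2. Empty the original table.
        conn.execute("DELETE FROM employee")
        # 3. Put the distinct rows back and discard the temporary table.
        conn.execute("INSERT INTO employee SELECT * FROM tempemployee")
        conn.execute("DROP TABLE tempemployee")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", ROWS)

dedupe_via_distinct_copy(conn)
rows_after = conn.execute("SELECT * FROM employee ORDER BY EmpId").fetchall()
print(rows_after)
```

Note that SELECT DISTINCT * compares every column, so this only removes rows that are exact duplicates. Two rows that share EmpId and EmpSSN but differ in EmpName would both survive.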

User · Answer

If you don't want to create a new primary key, you can use the TOP command in SQL Server:

    declare @ID int
    while EXISTS(select count(*) from Employee group by EmpId having count(*) > 1)
    begin
        select top 1 @ID = EmpId
        from Employee
        group by EmpId
        having count(*) > 1

        DELETE TOP(1) FROM Employee WHERE EmpId = @ID
    end

User · Answer

It's easy; use the query below:

    WITH Dups AS
    (
        SELECT col1, col2, col3,
               ROW_NUMBER() OVER (PARTITION BY col1, col2, col3 ORDER BY (SELECT 0)) AS rn
        FROM mytable
    )
    DELETE FROM Dups WHERE rn > 1

User · Answer

It is very simple. I tried it in SQL Server 2008:

    DELETE SUB FROM
    (
        SELECT ROW_NUMBER() OVER (PARTITION BY EmpId, EmpName, EmpSSN ORDER BY EmpId) cnt
        FROM Employee
    ) SUB
    WHERE SUB.cnt > 1

User · Answer

    select t1.*
    from employee t1, employee t2
    where t1.empid = t2.empid
      and t1.empname = t2.empname
      and t1.salary = t2.salary
    group by t1.empid, t1.empname, t1.salary
    having count(*) > 1

User · Answer

    delete sub from
    (
        select ROW_NUMBER() over (partition by empid order by empid) cnt
        from employee
    ) sub
    where sub.cnt > 1

User · Answer

No ID, no rowcount, and no temp table needed:

    WHILE
    (
        SELECT COUNT(*)
        FROM TBLEMP
        WHERE EMPNO IN (SELECT empno FROM tblemp GROUP BY empno HAVING COUNT(empno) > 1)
    ) > 1
        DELETE TOP(1)
        FROM TBLEMP
        WHERE EMPNO IN (SELECT empno FROM tblemp GROUP BY empno HAVING COUNT(empno) > 1)

User · Answer

    DELETE FROM `test`
    USING `test`, `test` AS vtable
    WHERE test.id > vtable.id
      AND test.common_column = vtable.common_column;

Using this, we can remove duplicate records.

User · Answer

Having a database table without a primary key is really, and I will say extremely, BAD PRACTICE. So after you add one (ALTER TABLE), run this until you don't see any more duplicated records (that is the purpose of HAVING COUNT):

    DELETE FROM [TABLE_NAME] WHERE [Id] IN
    (
        SELECT MAX([Id])
        FROM [TABLE_NAME]
        GROUP BY [TARGET_COLUMN]
        HAVING COUNT(*) > 1
    )

    SELECT MAX([Id]), [TARGET_COLUMN], COUNT(*) AS dupeCount
    FROM [TABLE_NAME]
    GROUP BY [TARGET_COLUMN]
    HAVING COUNT(*) > 1

MAX([Id]) deletes the latest records (the ones added after the first was created). If you want the opposite, that is, to delete the first records and keep the last one inserted, use MIN([Id]).

User · Answer

    With duplicates
    As
    (
        Select *,
               ROW_NUMBER() Over (PARTITION by EmpID, EmpSSN Order by EmpID, EmpSSN) as Duplicate
        From Employee
    )
    delete From duplicates
    Where Duplicate > 1;

This will update the table and remove all duplicates from it.
