Monday, June 14, 2010

How will you delete duplicate rows from a table?

DELETE FROM our_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM our_table
GROUP BY column1, column2, column3... ;Here column1, column2, column3 constitute the identifying key for each record.

Be sure to replace our_table with the table name from which you want to remove the duplicate rows. The GROUP BY is used on the columns that make the primary key for the table. This script deletes each row in the group after the first row.

e.g.

DELETE FROM duptest WHERE rowid NOT IN (SELECT max(rowid) FROM duptest GROUP BY empid);

Explanation :- In the subquery I Select a unique rowid from all the rowid's It can be MAX, MIN any one and the group by clause should include all the UNIQUE columns I desire. Say I want a composite primary key then the group by should be group by id1,id2 ...

The the subquery returns one record for the dupliacte ID's and I delete all of them that are not in the subquery and that deletes all the dupliacte rows from the database.

No comments:

Post a Comment