Skip to main content

Posts

Showing posts from 2011

Delete Duplicate Rows in SQL Server 2005

A new addition to the DELETE command in SQL Server 2005 is the TOP statement. The DELETE TOP does the same thing as a SELECT TOP WHERE only the TOP number of rows are deleted. This can be very helpful when there are duplicate rows of data present.
DELETE TOP (1) FROM Sales.Customer WHERE CustomerID = 1


This would delete one of the duplicate rows for Customer number 1 Suppose somehow the whole customer table got duplicated. I duplicated the Sales.Customer table into a tmpCustomer table.


SELECT Top 1 CustomerID, COUNT(CustomerID) AS Cnt FROM tmpCustomer GROUP BY CustomerID HAVING COUNT(CustomerID) > 1 WHILE @@RowCount > 0 BEGIN DELETE Top (1) FROM tmpCustomer WHERE CustomerID = (SELECT Top (1) CustomerID FROM tmpCustomer GROUP BY CustomerID HAVING COUNT(CustomerID) > 1) END


While this worked just fine, it ran about 4 minutes for 38K rows. Let's try the dreaded CURSOR. Notic…