A new addition to the DELETE command in SQL Server 2005 is the TOP statement. The DELETE TOP does the same thing as a SELECT TOP WHERE only the TOP number of rows are deleted. This can be very helpful when there are duplicate rows of data present.
This would delete one of the duplicate rows for Customer number 1 Suppose somehow the whole customer table got duplicated. I duplicated the Sales.Customer table into a tmpCustomer table.
While this worked just fine, it ran about 4 minutes for 38K rows. Let's try the dreaded CURSOR. Notice I can stick a variable in where the TOP () statement is. I subtracted -1 because we don't want to delete every row.
This ran much better at 18 seconds. Enjoy.
DELETE TOP (1) FROM Sales.Customer WHERE CustomerID = 1
This would delete one of the duplicate rows for Customer number 1 Suppose somehow the whole customer table got duplicated. I duplicated the Sales.Customer table into a tmpCustomer table.
SELECT Top 1 CustomerID, COUNT(CustomerID) AS Cnt FROM tmpCustomer GROUP BY CustomerID HAVING COUNT(CustomerID) > 1 WHILE @@RowCount > 0 BEGIN DELETE Top (1) FROM tmpCustomer WHERE CustomerID = (SELECT Top (1) CustomerID FROM tmpCustomer GROUP BY CustomerID HAVING COUNT(CustomerID) > 1) END
While this worked just fine, it ran about 4 minutes for 38K rows. Let's try the dreaded CURSOR. Notice I can stick a variable in where the TOP () statement is. I subtracted -1 because we don't want to delete every row.
DECLARE @cnt int, @custID as int DECLARE dupCursor CURSOR FAST_FORWARD FOR SELECT CustomerID, COUNT(CustomerID) AS Cnt FROM tmpCustomer GROUP BY CustomerID HAVING COUNT(CustomerID) > 1 OPEN dupCursor FETCH NEXT FROM dupCursor INTO @custID, @cnt WHILE @@FETCH_STATUS = 0 BEGIN DELETE Top (@cnt-1) FROM tmpCustomer WHERE CustomerID = @custID FETCH NEXT FROM dupCursor INTO @custID, @cnt END CLOSE dupCursor DEALLOCATE dupCursor
This ran much better at 18 seconds. Enjoy.
Comments
Post a Comment