A new addition to the DELETE command in SQL Server 2005 is the TOP statement. The DELETE TOP does the same thing as a SELECT TOP WHERE only the TOP number of rows are deleted. This can be very helpful when there are duplicate rows of data present.
This would delete one of the duplicate rows for Customer number 1 Suppose somehow the whole customer table got duplicated. I duplicated the Sales.Customer table into a tmpCustomer table.
While this worked just fine, it ran about 4 minutes for 38K rows. Let's try the dreaded CURSOR. Notice I can stick a variable in where the TOP () statement is. I subtracted -1 because we don't want to delete every row.
This ran much better at 18 seconds. Enjoy.
DELETE TOP (1) FROM Sales.Customer WHERE CustomerID = 1
This would delete one of the duplicate rows for Customer number 1 Suppose somehow the whole customer table got duplicated. I duplicated the Sales.Customer table into a tmpCustomer table.
SELECT Top 1 CustomerID, COUNT(CustomerID) AS Cnt
FROM tmpCustomer
GROUP BY CustomerID
HAVING COUNT(CustomerID) > 1
WHILE @@RowCount > 0
BEGIN
DELETE Top (1)
FROM tmpCustomer
WHERE CustomerID = (SELECT Top (1) CustomerID
FROM tmpCustomer
GROUP BY CustomerID
HAVING COUNT(CustomerID) > 1)
END
While this worked just fine, it ran about 4 minutes for 38K rows. Let's try the dreaded CURSOR. Notice I can stick a variable in where the TOP () statement is. I subtracted -1 because we don't want to delete every row.
DECLARE @cnt int, @custID as int
DECLARE dupCursor CURSOR FAST_FORWARD
FOR SELECT CustomerID, COUNT(CustomerID) AS Cnt
FROM tmpCustomer
GROUP BY CustomerID
HAVING COUNT(CustomerID) > 1
OPEN dupCursor
FETCH NEXT FROM dupCursor
INTO @custID, @cnt
WHILE @@FETCH_STATUS = 0
BEGIN
DELETE Top (@cnt-1)
FROM tmpCustomer
WHERE CustomerID = @custID
FETCH NEXT FROM dupCursor
INTO @custID, @cnt
END
CLOSE dupCursor
DEALLOCATE dupCursor
This ran much better at 18 seconds. Enjoy.
Comments
Post a Comment