Delete Duplicate Rows in Table without Primary Key

In the previous article, I showed a way how to delete duplicate rows when Primary Key exists in the table. Now I want to demonstrate how to do the same thing but in case the table does not contain a primary key.
I prepared an MS SQL statement for those who want quickly copy and paste sample and use it on their own database:

DELETE deleteSub FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY FirstName, LastName ORDER BY FirstName, LastName) RowId, FirstName, LastName FROM .) deleteSub
WHERE deleteSub.RowId > 1

And for those who want to understand in more detail, please study the following example.

Step 1 - Create a Test Table

Here is the SQL for our test table. Note that this table does not have a primary key. It is a bad practice to construct such kind of table but anyway let's do it.

SET ANSI_NULLS ON
GO SET QUOTED_IDENTIFIER ON
GO IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'.') AND type in (N'U'))
BEGIN
CREATE TABLE .( (50) NOT NULL, (50) NOT NULL, (7) NOT NULL
) ON END
GO

Step 2 - Insert Test Data

INSERT INTO . (, , ) VALUES ('Krista', 'Montes', '1992-06-18') INSERT INTO . (, , ) VALUES ('Krista', 'Montes', '1992-06-18') INSERT INTO . (, , ) VALUES ('Christian', 'Reyes', '1998-05-03') INSERT INTO . (, , ) VALUES ('Christian', 'Reyes', '1998-05-03') INSERT INTO . (, , ) VALUES ('Christian', 'Reyes', '1998-05-03') INSERT INTO . (, , ) VALUES ('Christian', 'Volan', '1993-04-09') INSERT INTO . (, , ) VALUES ('Marvin', 'Lynn', '2001-07-08')

You can see that 'Krista Montes' was inserted 2 times and 'Christian Reyes' 3 times. Also, I specially added 'Christian Volan' to highlight that we want to remove duplication by the combination of the 'FirstName' and 'LastName' columns.

check duplication in table

Step 3 - Prepare Data for Deletion Using ROW_NUMBER and PARTITION BY

SELECT ROW_NUMBER() OVER (PARTITION BY FirstName, LastName ORDER BY FirstName, LastName) RowId, FirstName, LastName
FROM .

I think it is important to understand how ROW_NUMBER and PARTITION BY work with our data. The 'PARTITION BY FirstName, LastName' divided our data into blocks where the same FirstName and LastName. And ROW_NUMBER() numbered each data block.

Step 3 - Delete Duplicate Rows Except One

Now if you wrap this statement into a SUB query you can apply the 'DELETE' command with 'WHERE RowId > 1'

DELETE deleteSub FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY FirstName, LastName ORDER BY FirstName, LastName) RowId, FirstName, LastName FROM .) deleteSub
WHERE deleteSub.RowId > 1

My Video Demonstration: