If you are dealing with duplicate rows in a SQL server database, and you’re not sure how to find them, don’t panic. I had to deal with a similar situation recently.
Two weeks ago, I was working on a project where I had to import some data from a legacy system into a whole new table. After the data had been loaded and a report was generated with that data, we’ve noticed there were some duplicate entries. Therefore, I had to quickly figure out how to put together a quick query to find duplicate rows and removed them from the table.
In this post I will show simple steps to take in case you are facing the same issue. In order to get this post, I am going to create a new table and populate it with some seed data.
DROP TABLE IF EXISTS Students;
CREATE TABLE Students (
Id INT IDENTITY(1, 1),
Populate the table with data
Now that we have our table added, I will proceed to populate it with some data for our demonstration.
INSERT INTO dbo.Students
Ensure the data was inserted properly
Okay, we have the data we need in the table, let’s run a quick query to see if it match our criteria, in order words have some duplicates. Let’s run a simple select statement to retrieve the full data-set in the the table.
SELECT Id, First_Name, Last_Name,Phone
ORDER BY First_Name, Last_Name desc;
Find duplicate rows in the table
As a matter of fact, there are many ways to find and return the duplicates. I ended up using group by to get my result, If you end up using some other method, please share it in the comment below.
COUNT(*) > 1;
Last, below you can see the result of the duplicated rows found and the number of time.