12 Dec 2025

Best SQL Query to Find Duplicates – Step by Step Guide

inding duplicate rows is one of the most common tasks in SQL, especially when working with real-world data like customer records, invoices, transactions, or log data.

In this article, you will learn the best SQL queries to identify duplicates, how to remove them, and how to prevent them in the future.

This is a beginner-friendly and practical tutorial you can use in SQL Server, MySQL, Oracle, PostgreSQL, and even BigQuery.

🔍 Why Finding Duplicates Is Important

Duplicate rows can create multiple problems:

Wrong totals in reports
Incorrect customer counts
Issues in billing
Slow performance
Errors in downstream systems (Power BI, ETL, APIs)

Cleaning duplicates is one of the first tasks in data quality improvement.

✅ 1. Find Duplicates Using GROUP BY (Best & Easiest Method)

This is the most common and universal method.

Query:


SELECT 
    column1,
    column2,
    COUNT(*) AS duplicate_count
FROM table_name
GROUP BY column1, column2
HAVING COUNT(*) > 1;

✔ When to use:

When you want to know which values are duplicated
Works in SQL Server, MySQL, Oracle, PostgreSQL, BigQuery

📌 Example: Find Duplicate Email IDs

Suppose you have a table:

id	name	email
1	Raj	raj@gmail.com
2	Priya	priya@gmail.com
3	Rajesh	raj@gmail.com
4	Anita	anita@gmail.com

The email raj@gmail.com is repeated.

Query:


SELECT 
    email,
    COUNT(*) AS duplicate_count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

Output:
email duplicate_count
 
        raj@gmail.com

email	duplicate_count
raj@gmail.com

🔥 3. Find Duplicates Using ROW_NUMBER() (Best for Deleting)

This is the most powerful method.
It uses window functions and gives a row number to each duplicate group.

Query:


WITH cte AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
    FROM users
)
SELECT *
FROM cte
WHERE rn > 1;

✔ When to use:

When you need to delete only duplicates and keep 1 record
When you want to mark duplicates

🗑 4. Delete Duplicate Rows (Keep Only One)

Using the same CTE:


WITH cte AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
    FROM users
)
DELETE FROM cte
WHERE rn > 1;

This deletes all duplicate rows but keeps the first row.

🧠 5. Find Duplicates by Date or Conditional Columns

Example: Find duplicates only for rows created today.


SELECT name, email, created_date, COUNT(*)
FROM users
WHERE created_date = CURRENT_DATE
GROUP BY name, email, created_date
HAVING COUNT(*) > 1;

This is useful for ETL loads and daily imports.

🛑 6. Prevent Duplicates Using Constraints

Best long-term solution.

In SQL Server / MySQL / PostgreSQL:


ALTER TABLE users
ADD CONSTRAINT uq_email UNIQUE (email);

In Oracle:


ALTER TABLE users
ADD CONSTRAINT uq_email UNIQUE (email);

After this, the database will automatically block duplicate values.

📸 Screenshots You Should Add (To Increase SEO + AdSense Earnings)

You can add these screenshots:

SSMS window showing query
Duplicate rows in a sample table
Execution result window

(If you want, I can generate clean screenshots for you.)

❓ FAQ – Frequently Asked Questions

1. What is the fastest way to find duplicates in SQL?

Using GROUP BY + HAVING COUNT(*) > 1 is the simplest and fastest method.

2. Does ROW_NUMBER() find duplicates?

Yes. ROW_NUMBER assigns sequence numbers to rows allowing you to identify duplicate rows easily.

3. Will deleting duplicates improve performance?

Yes.
Removing duplicates improves:

Query performance
Joins
Aggregations
ETL workflows

4. How to avoid duplicates permanently?

Use a UNIQUE constraint or PRIMARY KEY on important columns like email, mobile, employee code, etc.

5. Does this work in all SQL databases?

Yes — all examples work in:

SQL Server
MySQL
PostgreSQL
Oracle
BigQuery

⭐ Final Thoughts

Finding and removing duplicates is a critical part of database maintenance.
By using the queries shown above—GROUP BY, ROW_NUMBER, and unique constraints—you can quickly identify and eliminate duplicate data efficiently.

If you want more SQL tutorials, feel free to browse my other posts or ask in the comments!

👉 Want more articles like this for your blog?

I can write complete SEO-optimized posts on:

SQL Server
MySQL
BigQuery
Oracle
DBeaver
Power BI
ASP.NET
Data analysis topics

Just tell me your next topic!

7 Dec 2025

FULL PYTHON + MS SQL SCRIPT (BEST PRACTICE)

import pyodbc

# -----------------------------------------

# SQL CONNECTION (EDIT THIS PART)

# -----------------------------------------

def get_connection():

try:

conn = pyodbc.connect(

"DRIVER={ODBC Driver 17 for SQL Server};"

"SERVER=YOUR_SERVER_NAME;" # e.g. DESKTOP-123\SQLEXPRESS

"DATABASE=YOUR_DATABASE_NAME;"

"UID=YOUR_USERNAME;"

"PWD=YOUR_PASSWORD;",

autocommit=False # Manual commit = safer

)

return conn

except Exception as ex:

print("Database connection failed:", ex)

return None

# -----------------------------------------

# INSERT RECORD

# -----------------------------------------

def insert_employee(emp_id, name, salary):

conn = get_connection()

if not conn:

return

try:

cursor = conn.cursor()

query = """

INSERT INTO Employees (EmpID, Name, Salary)

VALUES (?, ?, ?)

"""

cursor.execute(query, (emp_id, name, salary))

conn.commit()

print("Record inserted successfully!")

except Exception as ex:

conn.rollback()

print("Insert failed:", ex)

finally:

conn.close()

# -----------------------------------------

# UPDATE RECORD

# -----------------------------------------

def update_employee(emp_id, new_salary):

conn = get_connection()

if not conn:

return

try:

cursor = conn.cursor()

query = """

UPDATE Employees

SET Salary = ?

WHERE EmpID = ?

"""

cursor.execute(query, (new_salary, emp_id))

conn.commit()

print("Record updated successfully!")

except Exception as ex:

conn.rollback()

print("Update failed:", ex)

finally:

conn.close()

# -----------------------------------------

# DELETE RECORD

# -----------------------------------------

def delete_employee(emp_id):

conn = get_connection()

if not conn:

return

try:

cursor = conn.cursor()

query = "DELETE FROM Employees WHERE EmpID = ?"

cursor.execute(query, (emp_id,))

conn.commit()

print("Record deleted successfully!")

except Exception as ex:

conn.rollback()

print("Delete failed:", ex)

finally:

conn.close()

# -----------------------------------------

# SELECT ALL RECORDS

# -----------------------------------------

def get_all_employees():

conn = get_connection()

if not conn:

return

try:

cursor = conn.cursor()

cursor.execute("SELECT EmpID, Name, Salary FROM Employees")

print("\n--- Employee List ---")

for row in cursor.fetchall():

print(row.EmpID, row.Name, row.Salary)

except Exception as ex:

print("Select failed:", ex)

finally:

conn.close()

# -----------------------------------------

# CALL STORED PROCEDURE

# -----------------------------------------

def call_stored_procedure():

"""

Example Stored Procedure:

CREATE PROCEDURE GetEmployees

BEGIN

SELECT * FROM Employees

END

"""

conn = get_connection()

if not conn:

return

try:

cursor = conn.cursor()

cursor.execute("{CALL GetEmployees}") # or EXEC GetEmployees

rows = cursor.fetchall()

print("\n--- Stored Procedure Result ---")

for row in rows:

print(row.EmpID, row.Name, row.Salary)

except Exception as ex:

print("Stored procedure failed:", ex)

finally:

conn.close()

# -----------------------------------------

# MAIN TESTING

# -----------------------------------------

if __name__ == "__main__":

insert_employee(1, "Vijay", 50000)

update_employee(1, 60000)

get_all_employees()

delete_employee(1)

get_all_employees()

call_stored_procedure()

🚀 How to Use

Update:


SERVER=
DATABASE=
UID=
PWD=

Create sample table in SQL:


CREATE TABLE Employees (
    EmpID INT PRIMARY KEY,
    Name VARCHAR(100),
    Salary DECIMAL(18,2)
);

Run the Python file:


python mssql_python_demo.py

MS SQL For Beginners

12 Dec 2025

Best SQL Query to Find Duplicates – Step by Step Guide

🔍 Why Finding Duplicates Is Important

✅ 1. Find Duplicates Using GROUP BY (Best & Easiest Method)

Query:

✔ When to use:

📌 Example: Find Duplicate Email IDs

Query:

Output:

🔥 3. Find Duplicates Using ROW_NUMBER() (Best for Deleting)

Query:

✔ When to use:

🗑 4. Delete Duplicate Rows (Keep Only One)

🧠 5. Find Duplicates by Date or Conditional Columns

🛑 6. Prevent Duplicates Using Constraints

In SQL Server / MySQL / PostgreSQL:

In Oracle:

📸 Screenshots You Should Add (To Increase SEO + AdSense Earnings)

❓ FAQ – Frequently Asked Questions

1. What is the fastest way to find duplicates in SQL?

2. Does ROW_NUMBER() find duplicates?

3. Will deleting duplicates improve performance?

4. How to avoid duplicates permanently?

5. Does this work in all SQL databases?

⭐ Final Thoughts

👉 Want more articles like this for your blog?

7 Dec 2025

FULL PYTHON + MS SQL SCRIPT (BEST PRACTICE)

🚀 How to Use

SQL Server 2025 Express Edition Download, Install and Configure

Followers

12 Dec 2025

Best SQL Query to Find Duplicates – Step by Step Guide

🔍 Why Finding Duplicates Is Important

✅ 1. Find Duplicates Using GROUP BY (Best & Easiest Method)

Query:

✔ When to use:

📌 Example: Find Duplicate Email IDs

Query:

Output:

🔥 3. Find Duplicates Using ROW_NUMBER() (Best for Deleting)

Query:

✔ When to use:

🗑 4. Delete Duplicate Rows (Keep Only One)

🧠 5. Find Duplicates by Date or Conditional Columns

🛑 6. Prevent Duplicates Using Constraints

In SQL Server / MySQL / PostgreSQL:

In Oracle:

📸 Screenshots You Should Add (To Increase SEO + AdSense Earnings)

❓ FAQ – Frequently Asked Questions

1. What is the fastest way to find duplicates in SQL?

2. Does ROW_NUMBER() find duplicates?

3. Will deleting duplicates improve performance?

4. How to avoid duplicates permanently?

5. Does this work in all SQL databases?

⭐ Final Thoughts

👉 Want more articles like this for your blog?

7 Dec 2025

FULL PYTHON + MS SQL SCRIPT (BEST PRACTICE)

🚀 How to Use

SQL Server 2025 Express Edition Download, Install and Configure

Subscribe To

Followers