Visualization and Reporting tools Notes - By Swati Verma: October 2020

How to use the GROUP BY clause or ROW_NUMBER() function to find duplicate values in SQL Server.
The query for finding the duplicate values in multiple columns using the GROUP BY clause :

SELECT col1,col2,...
COUNT(*)
FROM table_name
GROUP BY col1,col2,...
HAVING COUNT(*) > 1;

Using ROW_NUMBER() function to find duplicates in a table

The following statement uses the ROW_NUMBER() function to find duplicate rows based on both a and b columns:WITH cte AS (SELECTa,b,ROW_NUMBER() OVER (PARTITION BY a,bORDER BY a

,b) rownumFROMt1)SELECT*FROMcteWHERErownum > 1; WITH cte AS (SELECTcol,ROW_NUMBER() OVER (PARTITION BY colORDER BY col) row_numFROMt1)SELECT * FROM cteWHERE row_num > 1;

Here is the result:

How it works:

First, the ROW_NUMBER() distributes rows of the t1 table into partitions by values in the a and b columns. The duplicate rows will have repeated values in the a and b columns, but different row numbers as shown in the following picture:

Second, the outer query removes the first row in each group.

Generally, This statement uses the ROW_NUMBER() function to find the duplicate values in one column of a table:

Visualization and Reporting tools Notes - By Swati Verma

Pages

Thursday, October 22, 2020

SQL

SQL

Report Abuse