Thursday, October 22, 2020

SQL

How to use the GROUP BY clause or ROW_NUMBER() function to find duplicate values in SQL Server. 
The query for finding the duplicate values in multiple columns using the 
GROUP BY clause :

SELECT  col1,col2,...
COUNT(*)
FROM table_name
GROUP BY col1,col2,...
HAVING COUNT(*) > 1;

 
Using ROW_NUMBER() function to find duplicates in a table

The following statement uses the ROW_NUMBER() function to find duplicate rows based on both a and b columns:WITH cte AS (SELECTa,b,ROW_NUMBER() OVER (PARTITION BY a,bORDER BY a
,b) rownumFROMt1)SELECT*FROMcteWHERErownum > 1;  WITH cte AS (SELECTcol,ROW_NUMBER() OVER (PARTITION BY colORDER BY col) row_numFROMt1)SELECT * FROM cteWHERE row_num > 1;

Here is the result:

How it works:

First, the ROW_NUMBER() distributes rows of the t1 table into partitions by values in the a and b columns. The duplicate rows will have repeated values in the a and b columns, but different row numbers as shown in the following picture:

Second, the outer query removes the first row in each group.

Generally, This statement uses the ROW_NUMBER() function to find the duplicate values in one column of a table:

Wednesday, September 16, 2020

Sql Function

 

FUNCTIONS:

It is a predefined program which returns one value, returned value can be of numeric, string data,

data type functions can be used for calculations

SINGLE ROW FUNCTION: This function will process one row at a time and returns one value.

Mathematical Functions

String Functions

Date & Time Functions

Data Conversion {Type Casting Functions}

Ranking Functions {Windows Functions}

Meta Data Functions

Other Functions

MULTIPLE ROW FUNCTION: This function will process multiple rows at a time and returns one value.

Aggregate Functions

SQL SERVER supports to work with 12 categories of Functions and they are found at following paths

Click on Object Explorer – Data Bases

Select the Data Base - TESTDB

Check under Programmability Functions.

Check under System Functions.

MATHEMATICAL FUNCTIONS:

This function will take input as numbers and will return output as numbers

ABS (NUMBER) - returns unsigned value of a given number (Number - Argument or Parameter.

SELECT ABS(-890), ABS(17)

SQRT (NUMBER) - Returns square root of a given positive number.

SQUARE (NUMBER) - Returns square of a given value

POWER ( NUMBER(Base), NUMBER(Exponent)) - It will find the power of a given number.

SELECT POWER (2,5)

SIGN ( NUMBER )

Returns 1 if a number is positive

Returns -1 if a number is negative

Returns 0 if a number is zero

PI() - Returns PI value

SIN (NUMBER) - By default this function will take input given in radians, hence radians should be

converted to degrees by a standard formula is PI ()/180

SELECT SIN (30* PI()/180)

Built In Functions

Or

Predefined Functions

Table Valued

Functions

Scalar

Functions

Multiple Row Functions

Or

Group Functions

Single Row Functions

Or

Individual Functions

User Defined Functions

Or

Stored Functions

**Functions used with in other function refers to FUNCTION CASCADING

ROUND ( NUMBER, NUMBER , [NUMBER])

Argument1, Argument2, Argument3 (optional for TRUNCATE)

By default Argument3 is ZERO

ROUND(1234.5678,2,1) --- 1234.5600

ROUND(1234.5678,2) ----- 1234.5700

ROUND(1234.5638,2) ----- 1234.5600

ROUND(1234.5678,1) ----- 1234.6000

ROUND(1234.5678,0) ----- 1235.0000

ROUND(1234.2678,0) ----- 1234.0000

ROUND(1234.5678,-2)----- 1200.0000

ROUND(5678.123,-2) ----- 5700.0000

ROUND(1234.5678,2,1) --- 1234.5600

ROUND(1234.5678,1,1) --- 1234.5000

ROUND(1234.5678,0,1) --- 1234.0000

ROUND(5/2.0,0) --- 3.00

ROUND(5/2.0,0,1) --- 2

This function is used to round a given number with respect to Integer value or Decimal value based

on user required number of digits. If 3rd argument is specified other than zero it works as TRUNCATE.

CEILING(NUMBER) – This function will increment a given number to its nearest integer. Based on

any digit in decimal points is greater than zero.

CEILING (123.000) ---- 123

CEILING (123.010) ---- 124

CEILING (123.456) ---- 124

FLOOR (NUMBER) – It decreases a nearest integer

FLOOR (-123.456) ---- {-124}

STRING FUNCTIONS: This are used to perform different operations on STRING DATA TYPE.

LEN(TEXT) - Returns number of characters.

SELECT ENAME, LEN(ENAME) FROM EMP WHERE LEN(ENAME) > 5

LOWER(STRING) – Converts characters to Lower case.

UPPER(STRING) – Converts characters to Upper case.

ASCII(STRING) – returns ASCII code of the given character

ASCII('A')----65

ASCII('P')----80

UNICODE( STRING ) – Returns UNICODE of a given string.

CHAR ( NUMBER ) – Returns character of a given ASCII code

CHAR (65) - A { Only 0-255 characters are available}

NCHAR(NUMBER)

NCHAR(256) {1-65535 Characters are available}

LEFT(STRING,NUMBER) – Extracts number of characters from left side of a string

LEFT('COMPUTER',3) ---- COM

RIGHT( STRING, NUMBER) – Extracts number of characters from right side of a string

RIGHT('COMPUTER',3)-----TER

SUBSTRING (STRING, STARTPOSITION, NUMBER OF CHAR’S) – It extracts required string

from a given string based on start position and number of characters.

SUBSTRING ('COMPUTER',4,3)—PUT

LTRIM(STRING) - Removes blanks spaces from left side of a string.

RTRIM(STRING)- Removes blank spaces from right side of a string.

REPLICATE(STRING, NUMBER) - Displays a given string for n times.

REPLICATE('COMM',2)

REVERSE(STRING) - It will reverse a given string.

SELECT REVERSE('COMM')--- MMOC

REPLACE (STRING1, STRING2, STRING3)

String1 ---- given string, String2 ---- search string, String3 ---- replace string

REPLACE ('COMPUTER','UT','IR')

It will search for String2 in String1 and replaces with String3 at all occurrences of String 2 in String1

STUFF(STRING1, NUMBER1, NUMBER2, STRING2)

String1 --- Given String, Number 1--- Start Position, Number2 --- Number of Characters

String2 --- Replace

STUFF('COMPUTER',5,2,'IL') -- COMPILER

CHARINDEX(STRING1, STRING2, [NUMBER])

String1 --- Search String, String2 --- Given String, Number --- Position

It will search for string1 in string2 and returns position of string1, if it exists else returns zero, by

default it will search from first character. It also supports to search for character/string after a

specific position

SELECT CHARINDEX('R','COMPUTER',7)

SELECT REPLICATE ('*', GRADE) FROM SALGRADE

Q) Display all the employees from the EMP table with half of the string in upper case and next string

in lower case

SELECT LEFT (ENAME, CEILING (LEN (ENAME)/2.0)) +

LOWER(RIGHT (ENAME, CEILING (LEN (ENAME)/2))) FROM EMP

Q) Display all Employees whose name contains 3rd Character as ‘L’.

SELECT ENAME FROM EMP WHERE SUBSTRING (ENAME, 3,1)= 'L'

SELECT ENAME FROM EMP WHERE CHARINDEX ('L', ENAME, 3) = 3

Q) Display all Employees whose name contains A at second occurance.

SELECT ENAME FROM EMP

WHERE CHARINDEX ('A', ENAME, CHARINDEX ('A', ENAME) +1 !=0

DATE & TIME FUNCTIONS:

These functions are used to perform operations on date, time data type. When date type data is

retrieved it will be in the following format.

CENTURY: YEAR: MONTH: DATE: HOURS: MINUTES: SECONDS

GETDATE(): It returns current system DATE & TIME of the Server.

SELECT GETDATE()

ARITHMETIC OPERATIONS on data & time data type “+”, “-“ are the operations which can be used

for performing operations on date type data.

DATE + NUMBER = DATE

DATE – NUMBER = DATE

DATE – DATE = NUMBER [Number of days]

SELECT HIREDATE, HIREDATE+15, HIREDATE-28 FROM EMP

DAY (DATE) -------- extracts days from date

MONTH (DATE) ----- extracts months from date

YEAR (DATE) ------ extracts year from date

SELECT DAY(GETDATE())

SELECT MONTH(GETDATE())

SELECT YEAR(GETDATE())

DATEADD(DATEPART , NUMBER , DATE) - It is used to add number of days, months, years to a

given date.

DATEPART can be specified with this format - DD: MM: YY: HH: MI: SS: DW

SELECT DATEADD(DAY,10,GETDATE())

SELECT DATEADD(DD,10,GETDATE())

SELECT DATEADD(MONTH,10,GETDATE())

SELECT DATEADD(MM,10,GETDATE())

SELECT DATEADD(YEARS,10,GETDATE())

SELECT DATEADD(YY,10,GETDATE())

DATEDIFF( DATEPART, STARTDATE, ENDDATE) - It returns difference between 2 dates in

terms of days, months & years.

SELECT ENAME, DATEDIFF(DD,HIREDATE,GETDATE()) DAYS FROM EMP

DATEPART(DATEPART,DATE)- It extracts individual parts of a table.

DATEPART(DD,GETDATE()) – MM: YY: HH: MI: SS: DW

DATENAME(DATEPART,DATE) - In this function month name, day name will be extracted other

date parts providing same output.

DATENAME(MM,GETDATE()) ---- JUNE(Based on the current server date)

DATENAME(DW,GETDATE()) ---- MON

Q) Display all those Employees data that have been hired on Sunday.

SELECT ENAME, HIREDATE FROM EMP WHERE DATENAME(DW,HIREDATE) = 'SUNDAY'

Q) Display all those Employees who have been hired in the month of December.

SELECT ENAME,HIREDATE FROM EMP WHERE DATENAME(MM,HIREDATE)= 'DECEMBER'

Q) Display all those Employees who have been hired in second half of the week.

SELECT ENAME, HIREDATE FROM EMP

WHERE DATEPART (DW, HIREDATE) > = ROUND(7/2.0,0)

DATA CONVERSION:

This function will support to convert the data stored with one type to the other type.

SQL SERVER will convert the data with the support of following two functions.

CAST(SOURCE_DATA AS TARGET_DATA_TYPE)

CONVERT(TARGET_DATA_TYPE, SOURCE_DATA [NUMBER])

SELECT ENAME + ' WORKING AS ' + JOB + ' IN DEPT ' + CAST(DEPTNO AS

VARCHAR(3)) FROM EMP

** String and String can only be concatenated.

** String and Integer cannot be concatenated.

SELECT 5/ CAST(2 AS FLOAT)

SELECT 5/ CONVERT(FLOAT,2)

SELECT CONVERT(VARCHAR(30), GETDATE(),0)

SELECT CONVERT(VARCHAR(30), GETDATE(),8) for time

** 0 stands for Date formats, there are more than 100 Date formats.

** 8 & 108 are particularly assigned for time format

RANKING OR WINDOW FUNCTIONS

These functions have been introduced from SQL 2005.

This are used to calculate ranks on any type of data.

These functions require a support of OVER clause, which requires a specification of ORDER

BY clause, followed by ORDER type.

This function cannot be used at WHERE clause.

SQL SERVER will support to work with following Rank functions

ROW_NUMBER() – Provides unique row number for each row

RANK() – It calculates ranks with gaps

DENSE_RANK() – It calculates ranks with out gaps

NTILE(INT) – Provides the data in to groups / blocks. It will group the data based on the user

specified number and number of rows available in a table.

SELECT EMPNO, ENAME, SAL, ROW_NUMBER() OVER ( ORDER BY EMPNO ASC) RN

FROM EMP

SELECT EMPNO, ENAME, SAL, RANK() OVER(ORDER BY SAL DESC) RNK FROM EMP

SELECT EMPNO, ENAME, SAL, DENSE_RANK() OVER(ORDER BY SAL DESC) RNK

FROM EMP

SELECT EMPNO, ENAME, SAL, NTILE(4) OVER( ORDER BY EMPNO) NT FROM EMP

**The data will be grouped into four groups based on given NTILE number.

OTHER FUNCTIONS

ISNULL(EXPRESSION1, EXPRESSION2)

This Function will search for NULL values in Expression1 and Substitutes Expression2 at all the rows

of Expression1 where NULL values are found, Expression2 is depended on data type of Expression1.

SELECT ENAME, SAL, COMM, ISNULL(COMM,0) RS FROM EMP

SELECT ENAME, SAL, COMM, ISNULL(ENAME,'UNKNOWN') UN FROM EMP

SELECT ENAME ,SAL, COMM, ISNULL ( CAST(COMM AS VARCHAR(4),'NC') RS

FROM EMP

SELECT ENAME,SAL,COMM,SAL+ISNULL(COMM,0) RS FROM EMP

COALESCE(EXPRESSION1,EXPRESSION2……EXPRESSIONN) - It will search for NULL values and

Substitutes Expression2 at all rows of Expression1 where NULL values are found. It Substitutes

Expression3 if Expression2 contains NULL values so on it can have any number of expressions.

SELECT EMPNO,ENAME,MGR,SAL,COMM COALESCE(COMM,SAL,MGR,EMPNO,25) RS

FROM EMP

NULLIF( EXPRESSION1, EXPRESSION2) - It is used to compare two expressions of any data

type. If equal it returns NULL, if not equal returns value of Expression1

SELECT NULLIF(10,10) -----NULL

SELECT NULLIF(10,90) -----10

SELECT NULLIF(‘X’,’Y’)--- X

SELECT ENAME,LEN(ENAME)L1,JOB,LEN(JOB)L2 NULLIF(LEN(ENAME),LEN(JOB))RES

FROM EMP

CASE EXPRESSION

It supports to specify multiple conditions provided with their respective results. If no condition is

satisfied then it will provide default result.

Syntax:

CASE EXPRESSION

WHEN CONDITION1 THEN RESULT1

[ WHEN CONDITION2 THEN RESULT2 ]

ELSE

DEFAULT_RESULT

END [ALIAS_NAME]

Q) Display ENAME, SAL and INCREMENT the salary with difference %’s based on category of JOB.

Manager – 18%, Clerk – 15%, Others – 12%

SELECT ENAME,SAL,JOB,

CASE JOB

WHEN 'MANAGER' THEN SAL*18/100

WHEN 'CLERK' THEN SAL * 15/100

ELSE

SAL*12/100

END INCR

FROM EMP

Q) Display ENAME, DEPTNO, SAL and INCREMENT the Salary with different %’s on category of Dept

No 10 – 20%, Dept no 20 – 18%, Dept no 30 – 15%

SELECT ENAME, DEPTNO, SAL,

CASE DEPTNO

WHEN 10 THEN SAL* 20/100

WHEN 20 THEN SAL * 18/100

WHEN 30 THEN SAL*15/100

END INCR

FROM EMP

MULTIROW FUNCTIONS OR GROUP FUNCTIONS

AGGREGATE FUNCTIONS:

This function will process multiple rows at a time and returns one value.

SQL SERVER supports to work with following functions

SUM (EXPRESSION) - finds the sum of values in given expression.

AVG (EXPRESSION) - first finds the sum and then divide with number of values in the

expression.

MAX (EXPRESSION) - finds the maximum value in the given expression.

MIN (EXPRESSION) - finds the minimum value in the given expression.

COUNT (EXPRESSION) --- returns number of values in a expression including duplicates.

COUNT (DISTINCT (EXPRESSION) - returns number of values in an expression excluding

duplicates.

COIUNT (*) - returns number of rows.

COUNT_BIG (EXPRESSION) - returns number of values.

GROUPING (EXPRESSION) - returns zero or one.

Returns ZERO if there is no group

Returns ONE if there is a group data

POINTS TO BE REMEMBERED

Aggregate Function will exclude NULL values.

When a SELECT statement is enclosed with Aggregate Functions, along with them writing

expressions that returns multiple rows is invalid.

Aggregate functions cannot be used in where clause.

In SQL SERVER Aggregate Functions cannot be cascaded.

SELECT SUM(SAL) FROM EMP

SELECT AVG(SAL) FROM EMP

SELECT SUM(COMM) R1, AVG(COMM) R2, COUNT(COMM)R3 FROM EMP

SELECT SUM(SAL), MAX(SAL) FROM EMP WHERE DEPTNO = 30

SELECT COUNT(SAL) FROM EMP

SELECT COUNT(DISTINCT(SAL) FROM EMP

SELECT COUNT(*) FROM EMP

To Count the Records

SP_SPACEUSED EMP

SELECT @@ROWCOUNT

SELECT SUM(SAL) , COUNT(*) FROM EMP WHERE JOB = 'SALESMAN'

SELECT DEPTNO,SUM(SAL),COUNT(*) FROM EMP GROUP BY DEPTNO

SELECT DEPTNO, JOB,SUM(SAL), COUNT(*) FROM EMP GROUP BY JOB, DEPTNO

GROUP BY: This clause performs two tasks

Eliminates duplicates and always arranges the data in ascending order.

The column used in group by clause, the same column can be used for displaying the output.

When multiple columns are used it will eliminate if all column contains duplicates.

SELECT SUM(SAL), COUNT(*) FROM EMP GROUP BY DEPTNO

SELECT DEPTNO, SUM(SAL), COUNT(*) FROM EMP GROUP BY DEPTNO

SELECT JOB, SUM(SAL), COUNT(*) FROM EMP GROUP BY JOB

SELECT MGR, COUNT(*) FROM EMP GROUP BY MGR

SELECT DEPTNO, JOB, SUM(SAL), COUNT(*) FROM EMP GROUP BY JOB , DEPTNO

Q) Display DEPTNO, total salary and number of employee’s related to Dept No 20.

SELECT DEPTNO, SUM(SAL), COUNT(*) FROM EMP

WHERE DEPTNO = 20 GROUP BY DEPTNO

ROLLUP AND CUBE OPERATORS

ROLLUP:

This operator should be used only at GROUP BY clause.

It will calculate individual totals, group totals, and additional row is generated for grand

totals.

SELECT DEPTNO, SUM(SAL), COUNT(*) FROM EMP GROUP BY DEPTNO WITH ROLLUP

SELECT DEPTNO, JOB, SUM(SAL), COUNT(*) FROM EMP

GROUP BY DEPTNO, JOB WITH ROLLUP

SELECT DEPTNO, JOB, SUM(SAL),COUNT(*) , GROUPING(JOB) FROM EMP

GROUP BY DEPTNO, JOB WITH ROLLUP

SELECT DEPTNO ,

CASE GROUPING(JOB)

WHEN 0 THEN JOB

WHEN 1 THEN ALL JOBS” -- Replaces NULLS With The Value

END DEP, SUM(SAL), COUNT(*)

FROM EMP GROUP BY DEPTNO, JOB WITH ROLLUP

CUBE:

This operator is also used only at GROUP BY clause.

It will support to perform individual totals, group totals, regroup totals and additional row is

generated for grand totals.

SELECT DEPTNO, JOB, SUM(SAL), COUNT(*) FROM EMP

GROUP BY DEPTNO, JOB WITH CUBE

Q) Display total salaries of individual departments in a single row.

SELECT

SUM (CASE DEPTNO WHEN 10 THEN SAL END) “D10”,

SUM (CASE DEPTNO WHEN 20 THEN SAL END) “D20”,

SUM (CASE DEPTNO WHEN 30 THEN SAL END) “D30”

FROM EMP

SELECT DISTINCT JOB,

SUM (CASE DEPTNO WHEN 10 THEN SAL END) “D10”,

SUM (CASE DEPTNO WHEN 20 THEN SAL END) “D20”,

SUM (CASE DEPTNO WHEN 30 THEN SAL END) “D30”,

SUM (SAL) “TOTAL SAL”

FROM EMP GROUP BY JOB

 


What is Normalization?

A: Database normalization is the process of designing the database in such a way that it reduces data redundancy without sacrificing integrity.

What Does a Data Modeler Use Normalization For?

A: The purposes of normalization are:

  • Remove useless or redundant data
  • Reduce data complexity
  • Ensure relationships between the tables in addition to the data residing in the tables
  • Ensure data dependencies and that the data is stored logically.

So, What is Denormalization, and What is its Purpose?

A: Denormalization is a technique where redundant data is added to an already normalized database. The procedure enhances read performance by sacrificing write performance.

SQL

How to use the  GROUP BY  clause or  ROW_NUMBER()  function to find duplicate values in SQL Server.  The query for finding the duplicate va...