SQL Server: Union vs Union All

Posted August 26, 2015 by Vishwanath Dalvi in Database, SQL Server

feature-sql1

Union and Union All are used to combine two or more result sets in SQL. A Union set operator is different from SQL Joins. Union combines two sets whereas SQL Joins combines two or more columns based on a matching row condition. This tutorial describes the differences between Union and Union All based on function and performance.

Union All

1. Union All is used to combine two or more result sets as one result set.
2. Union All includes duplicates from two or more result sets.

Union

1. Union is used to combine two or more result sets as one result set.
2. Union removes duplicates from two or more result sets.
3. Union also sorts result sets in ascending order.

Example: Union vs Union All

CREATE TABLE dbo.Batsman
(
PK_Batman_Id INT,
Name         VARCHAR(255)
);

CREATE TABLE dbo.Bowler
(
PK_Bowler_Id INT,
Name         VARCHAR(255)
);

Insert into dbo.Batsman VALUES (1,'Sachin Tendulkar');
Insert into dbo.Batsman VALUES (2,'Rahul Dravid');
Insert into dbo.Batsman VALUES (3,'Brian Lara');
Insert into dbo.Batsman VALUES (4,'MS Dhoni');

Insert into dbo.Bowler VALUES (1,'Sachin Tendulkar');
Insert into dbo.Bowler VALUES (2,'Rahul Dravid');

SELECT * FROM Batsman;
SELECT * FROM Bowler;

 

Input_Union_vs_Union_All

Union All Set Operation

SELECT Name as Player
FROM   dbo.Batsman
UNION ALL
SELECT Name as Player
FROM   dbo.Bowler;
Result

Player
-------------------
Sachin Tendulkar
Rahul Dravid
Brian Lara
MS Dhoni
Sachin Tendulkar
Rahul Dravid

(6 row(s) affected) 

 

The query above combines two result sets from the dbo.Batsman and dbo.Bowler tables and shows duplicate names which appear in both tables.

Union_All_SQL_Server

Union Set Operation

SELECT Name as Player
FROM   dbo.Batsman
UNION
SELECT Name as Player
FROM   dbo.Bowler;
Result

Player
-------------------
Brian Lara
MS Dhoni
Rahul Dravid
Sachin Tendulkar

(4 row(s) affected) 

 

The query above combines two result sets from the dbo.Batsman and dbo.Bowler tables without any duplicates, and the result is sorted in ascending order.

Union_SQL_Server

Performance: Union vs Union All

As you see, Union All is using 27% of batch execution and Union is using 73% batch execution.

In this case, Union All is faster than Union because Union All does not remove duplicates and sorts the result set whereas Union remove duplicates and sorts the result set (which needs more time and a performance hit). The distinct sort operator in the execution plan shows Union does the sorting of the input set.

Union_vs_Union_All_Performance

Best Practice

When we know duplicate records are not possible in one or more sets, then using Union All over Union is beneficial to avoid a performance hit.

 

About Vishwanath Dalvi

Vishwanath Dalvi is a gifted engineer and tech enthusiast. He enjoys music, magic, movies, and gaming. When not hacking around or supporting the open source community, he is trying to overcome his phobia of dogs.
View more articles by Vishwanath Dalvi

The Conversation

Follow the reactions below and share your own thoughts.