rnew icon6Grab Deal : Flat 30% off on live classes + 2 free self-paced courses! - SCHEDULE CALL rnew icon7

What is SQL Self Join And Merge Join? Uses and Examples.

 

This blog's readers are mostly IT professionals. They all developed computer programs and compiled data into a database. Data must be retrieved at some point. If you followed the standard normalization procedure of a relational database management system, you wouldn't save all data in a single table. Thus, data must be extracted from multiple tables to generate the register's final output.

As an illustration, take the Sales Register. Any application typically stores sales data in the Sales Header or Sales Detail files. Data from the Customer master and the Item master must also be extracted. Even though each table has a primary key and a foreign key to help you connect them, extracting data from them requires a specific method or tool. This procedure or tool goes by the name Joins.

In the corresponding paragraphs, we will learn about different types of SQL joins- self and merge join- and their purpose and performance.

What is Self Join

As the name signifies, in SELF JOIN, a table is joined. Each table row is joined with itself and all other rows depending on some conditions. In other words, it is a joint between two copies of the same table. A self-join is a regular join, but the table is joined with itself.

Syntax of SQL Self Join

The Syntax for SQL Server Self Join is as below

SELECT column_name(s)
FROM table1 T1, table1 T2

WHERE condition;

T1 and T2 are different table aliases for the same table.

Use of Self-Join in SQL

SQL Self Join is helpful when you want to correlate pairs of rows from the same table, for example, in a parent-child relationship. The following query returns the names of all immediate subcategories of the category 'Kitchen.'

SELECT T2.name
FROM category T1
JOIN category T2
ON T2.parent = T1.id
WHERE T1.name = 'Kitchen'

Performance of Self Join

The result of using a self-join is that it potentially squares the number of output rows. This increase in output data can cause poor performance. Instead of using a self-join, use a window (analytic) function to reduce the number of additional bytes generated by the query.

SQL Server Self-Join Example

In this example, we will use the AdventureWorks sample database.

We will consider the Product table under the Production schema for this example.

The sql statement to view the table is below

 Select * from Production.Product

The output is as below

The query for self-join looks like this below

SELECT Color=p1.Color, ProductA=p1.Name, ProductB=p2.Name
FROM Production.Product p1
INNER JOIN Production.Product p2
ON p1.Color = p2.Color               
AND p1.Color is not null          
AND p1.ProductNumber < p2.ProductNumber
ORDER BY Color, ProductA, ProductB

 The output looks like the one below.

Advantage of SQL Server Self Join

You can use a self-join to simplify nested SQL queries where the inner and outer queries reference the same table. These joins allow you to retrieve related records from the same table. In this table, the manager attribute simply references the employee ID of another employee in the same table. Let's dive into SQL joins and learn more about their importance in SQL and key takeaways. You should check out the SQL career path guide to help you explore all your career options.

SQL Server Merge Join

The Merge Join transformation provides an output generated by joining two sorted datasets using a FULL, LEFT, or INNER join. For example, you can use a LEFT join to join a table that includes product information with a table that lists the country/region in which a product was manufactured.

Syntax of Merge Join in SQL

The syntax for merge join is as follows

MERGE Table1 t1
USING Table2 t2
ON t1.field1 = t2.field2
WHEN MATCHED THEN
UPDATE
SET t1.field2 = t1.field2 + t2.field2;

t1 and t2 are different table aliases for the two tables.

Use of SQL Server Merge Join

Merge join is used when projections of the joined tables are sorted on the join columns. Merge joins are faster and uses less memory than hash joins. Hash join is used when projections of the joined tables are not already sorted on the join columns.

Performance of Merge Join

The Merge join is the most efficient in SQL Server. In simple terms, if you see a Merge Join, your query plan is efficient, and you do not need to make many changes to improve query performance. Since the Merge Join operator uses sorted data inputs, it can use two large datasets.

SQL Server Merge Join Example

Let us create two tables called BookOrder and Bookinventory and fill them with Data. The sql query is as below.

CREATE TABLE dbo.BookInventory  -- target
(
TitleID INT NOT NULL PRIMARY KEY,
Title NVARCHAR(100) NOT NULL,
Quantity INT NOT NULL
CONSTRAINT Quantity_Default_1 DEFAULT 0
);
CREATE TABLE dbo.BookOrder  -- source
(
TitleID INT NOT NULL PRIMARY KEY,
Title NVARCHAR(100) NOT NULL,
Quantity INT NOT NULL
CONSTRAINT Quantity_Default_2 DEFAULT 0
);

INSERT BookInventory VALUES
(1, 'The Catcher in the Rye, 6),
(2, 'Pride and Prejudice,' 3),
(3, 'The Great Gatsby', 0),
(5, 'Jane Eyre', 0),
(6, 'Catch 22', 0),
(8, 'Slaughterhouse Five', 4);

INSERT BookOrder VALUES
(1, 'The Catcher in the Rye', 3),
(3, 'The Great Gatsby', 0),
(4, 'Gone with the Wind', 4),
(5, 'Jane Eyre', 5),
(7, 'Age of Innocence,’ 8);

The final output of the two tables is as below.

Select * from BookInventory.

And

Select * from BookOrder

Now let us merge the two tables. The syntax is as below.

MERGE BookInventory bi
USING BookOrder bo
ON bi.TitleID = bo.TitleID
WHEN MATCHED THEN
UPDATE
SET bi.Quantity = bi.Quantity + bo.Quantity;

The output looks like below

Advantage of SQL Server Merge Join

The merge join is the most efficient of all three operators. It combines the advantage of the hash match, where the actual data needs to be accessed only once, with the advantages of nested loops - low CPU consumption and enabling fast output of matched rows for further query processing.

Conclusion

In the above writeup, we have discussed SQL Server Self Join and Merge Join. We have learned about its advantage, disadvantages, and performance impact. This will be fascinating reading for those who want to learn about merge join and self-join in a more detailed manner. Or else you can also enroll in an online SQL server training course and learn more about SQL server and it’s various functions.

cta14 icon

SQL Testing Training

  • Personalized Free Consultation
  • Access to Our Learning Management System
  • Access to Our Course Curriculum
  • Be a Part of Our Free Demo Class

Trending Courses

Cyber Security icon

Cyber Security

  • Introduction to cybersecurity
  • Cryptography and Secure Communication 
  • Cloud Computing Architectural Framework
  • Security Architectures and Models
Cyber Security icon1

Upcoming Class

2 days 17 May 2024

QA icon

QA

  • Introduction and Software Testing
  • Software Test Life Cycle
  • Automation Testing and API Testing
  • Selenium framework development using Testing
QA icon1

Upcoming Class

5 days 20 May 2024

Salesforce icon

Salesforce

  • Salesforce Configuration Introduction
  • Security & Automation Process
  • Sales & Service Cloud
  • Apex Programming, SOQL & SOSL
Salesforce icon1

Upcoming Class

2 days 17 May 2024

Business Analyst icon

Business Analyst

  • BA & Stakeholders Overview
  • BPMN, Requirement Elicitation
  • BA Tools & Design Documents
  • Enterprise Analysis, Agile & Scrum
Business Analyst icon1

Upcoming Class

16 days 31 May 2024

MS SQL Server icon

MS SQL Server

  • Introduction & Database Query
  • Programming, Indexes & System Functions
  • SSIS Package Development Procedures
  • SSRS Report Design
MS SQL Server icon1

Upcoming Class

2 days 17 May 2024

Data Science icon

Data Science

  • Data Science Introduction
  • Hadoop and Spark Overview
  • Python & Intro to R Programming
  • Machine Learning
Data Science icon1

Upcoming Class

3 days 18 May 2024

DevOps icon

DevOps

  • Intro to DevOps
  • GIT and Maven
  • Jenkins & Ansible
  • Docker and Cloud Computing
DevOps icon1

Upcoming Class

0 day 15 May 2024

Hadoop icon

Hadoop

  • Architecture, HDFS & MapReduce
  • Unix Shell & Apache Pig Installation
  • HIVE Installation & User-Defined Functions
  • SQOOP & Hbase Installation
Hadoop icon1

Upcoming Class

9 days 24 May 2024

Python icon

Python

  • Features of Python
  • Python Editors and IDEs
  • Data types and Variables
  • Python File Operation
Python icon1

Upcoming Class

10 days 25 May 2024

Artificial Intelligence icon

Artificial Intelligence

  • Components of AI
  • Categories of Machine Learning
  • Recurrent Neural Networks
  • Recurrent Neural Networks
Artificial Intelligence icon1

Upcoming Class

3 days 18 May 2024

Machine Learning icon

Machine Learning

  • Introduction to Machine Learning & Python
  • Machine Learning: Supervised Learning
  • Machine Learning: Unsupervised Learning
Machine Learning icon1

Upcoming Class

16 days 31 May 2024

 Tableau icon

Tableau

  • Introduction to Tableau Desktop
  • Data Transformation Methods
  • Configuring tableau server
  • Integration with R & Hadoop
 Tableau icon1

Upcoming Class

9 days 24 May 2024