SQL Joins & Subquery

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 9

Is it possible to use Joins,

Subqueries, and Unions in one


Query?
A reader recently asked me to help them solve a problem combining data from multiple tables into
a single result. What was interesting was that the solution used the three ways to combine data
that we have recently covered, namely: JOINS, set operators, and subqueries.
Given the comprehensive nature of this problem I thought it would be a great case study for you to
read.
NOTE: This example is not based on Microsoft SQL Server Management Studio and the
AdventureWorks2012 database. However, I strongly recommend you get the tool and start using
it. Many of the lessons and articles use it. You can get started using these free tools using
my Guide Getting Started Using SQL Server.

The questions asked was:

“I have a main data table that lists events. Each event has a series of fields including duration,
reason code and status.
The database has a table of statuses and there is a common field between the main table and this
allows me to show the status name rather than the code.
There are four statuses (ready, delay, spare and down) and there are a range of reasons
associated with each status. It is possible for the same code to exist in two statuses (e.g. delay
maintenance and down maintenance).
For some reason the source database has four tables of reasons (one for each status). Each of
these has a field which links to the main reason code. What is missing is a status field in those
tables.
What I’m looking for is something along the lines of ‘’when the main table record status is ‘delay’
then retrieve the name in the ‘delay’ table which corresponds to the main table delay code. When
the status is ‘down’, do the same thing but look up the name in the ‘down’ table.
Any pointers (if you can make any sense of the problem description) are gratefully received.”
So the goal is to create a result that uses descriptive values rather than codes. Let’s walk through
the solution together.
Case Study Goal

Here is an overall view of the various tables and the matching challenge, which is to “conditionally
match” on one of the four “ready” tables depending on the value of Status:

Challenges Matching “Conditionally”

Getting the status name is pretty straight forward. For this we can use an INNER JOIN; however,
displaying the Reason Name is tougher.
Once you put your mind to it, you’ll realize that that answer hinges around getting a better grip on
the various “reason” tables. If they could be treated uniformly, then we could simply join to them
once, avoid duplicate entries, and solve the problem.

When I originally thought about this, I was thinking about seeing if I could use SELECT CASE to
solve the problem, and then it hit me! We can do this using UNIONS. If you notice all the tables
have a very similar structure.
Given this, we can create a derived table as the result of combining the four “reason” tables which can then
be joined to the original table using both the status code and ready code. This avoids having to
“conditionally match” to each of four tables.
In my mind the answer is broken up into three distinct sub solutions:

1. Joining to the Status table to get the status name.


2. Using UNION to combine the various “reason” tables; therefore, avoiding the “conditional match”
issue.
3. Incorporating the result from step two into the final query via a subquery (derived table).
Let’s go through each of these sub problems in order.

1. Using INNER JOIN to return StatusName


Since we need to obtain the StatusName from the status table corresponding to the status code in
events we can use an INNER JOIN. As you may recall, an Inner join returns rows where the join
condition is met. This is the most common join.
In its most common example, where a primary key is being matched to a foreign key, only
combinations of rows matching the keys are returned.
This is exactly what you have. The StatusCode is the primary key of the status table and we can
use that to match that to the foreign key value StatusCode from Event.

Since status code is unique within the status table, you don’t have to worry about multiple rows
being returned by the join.
The following diagram summarizes what is being done:
Inner Join to Status Table

Here is the SQL statement used to obtain the status name, I colored the INNER JOIN parts in red.

SELECT EventID,

Duration,

S.Name

FROM Event E

INNER JOIN Status S


ON E.StatusCode = S.StatusCode
Of course, there is more to do, we also need to get the reason name, but this is a good start.

2. Utilizing UNION to Combine Reason Table


Rows
UNION is called a set operator. The UNION operator is used to combine rows from several tables
into a single result. Whereas a join is meant to combine columns from different tables into a single
row the UNION operator is adding rows from each table.
In the following diagram you can see the four separate “reason” tables.
Using UNION to Combine Reason Data

By successfully applying union clauses we’ll combine all the rows. The other thing you’ll notice is
we’re adding a static column which corresponds to the status code. Static column is just a fancy
way of saying we’re adding a column that is a fixed value.

By doing this we can associate each table’s rows with the proper status code. Remember, this
was one of our big hurdles; we didn’t have a good way to conditionally match to each of the four
tables based on status code.

By adding the status code to the result, we can eventually use both the status and reason codes to
match to the correct row.

Here is the code used to create the unified results.

SELECT 'R', ReasonCode, Name

FROM ReadyReason

UNION

SELECT 'D1', ReasonCode, Name

FROM DelayReason

UNION

SELECT 'S', ReasonCode, Name

FROM SpareReason
UNION

SELECT 'D2', ReasonCode, Name

FROM DownReason

3. Using Subqueries to Include a Derived Table


in Final Result
Once the union is created we are now able to use it to match and pull in the reason names. As
you can see in the diagram below, the matching becomes much easier. We no longer have to
inspect the status code, then decide which one of the four tables to use before matching on
reason code to get the name.

Instead we can now use a standard INNER JOIN to match both the status code and reason code
to the result of the union.

But that does bring up a question. How do we introduce that result into our query?
Fortunately there is a way to do so using subqueries. A subquery is just a query defined within
another. There are many places we can use sub queries, such as in the SELECT or WHERE
clause. We can also use them in FROM clause. When used here, they typically refer to derived
tables. The result of our union is a derived table.
Derived tables are enclosed in parenthesis, like subqueries, but they are also given a name.

UNION in Derived Table

If you look closely at the SQL you see the UNION result is given the name SR.

You’ll also notice there is a column alias for the static field. This is so we can refer to it in the join
condition. It makes the SQL easier to read and follow.
In the sample below I’ve color coded the UNION green and it use in the INNER JOIN blue..

SELECT EventID,

Duration,

SR.ReasonName

FROM Event E

INNER JOIN
(
SELECT 'R' as StatusCode, ReasonCode, Name
FROM ReadyReason
UNION
SELECT 'D1', ReasonCode, Name
FROM DelayReason
UNION
SELECT 'S', ReasonCode, Name
FROM SpareReason
UNION
SELECT 'D2', ReasonCode, Name
FROM DownReason
) SR
ON E.StatusCode = SR.StatusCode AND
E.ReasonCode = SR.ReasonCode
Remember the form of an inner join is

INNER JOIN {table} ON {match conditions}

What we have don is replace {table} with our derived table.

In this case it works very well.

Final Query
To create the final result we combine the three sub solutions together. From the section above,
you can see that each means to do so is relatively simple. Sure, there is syntax to contend with,
but I think overall the ideas are straightforward.

Of course when it is glopped it all tougher it can look overwhelming, but hopefully you have seen
that if you take a step-by-step approach to building your SQL, that becomes less so.
Final query using INNER JOIN, UNION, and subquery

Here is the text for the final SQL color coded to highlight each sub solution.

The first INNER JOIN is in red, the UNION in Green, and the INNER JOIN using the derived table
in blue.

SELECT EventID,

Duration,

S.Name as [Status Name],

SR.ReasonName as [Reason Name]

FROM Event E

INNER JOIN Status S


ON E.StatusCode = S.StatusCode
INNER JOIN
(
SELECT 'R', ReasonCode, Name
FROM ReadyReason
UNION
SELECT 'D1', ReasonCode, Name
FROM DelayReason
UNION
SELECT 'S', ReasonCode, Name
FROM SpareReason
UNION
SELECT 'D2', ReasonCode, Name
FROM DownReason
) SR
ON E.StatusCode = SR.StatusCode AND
E.ReasonCode = SR.ReasonCode
This concludes the case study.
I hope you found that helpful. Did you see that, though the end result looked like a mouthful, it
was really made up from three simpler concepts?

Many of my students raise questions about this process that is how to write complex queries such
as these.

For instance they are not sure where to start. This is where I can help. I’ve got some really great
idea to help get you going with confidence so you can start writing queries your first day.

Others wonder how to put it all together. As you can see from the case study, there is a
progression. I like to take things step-by-step.

When solving complicated problems I like to build on success. Did you see how I started with a
simple solution, added a step, and continued?

If you like my style of teaching I would recommend you sign up for my newsletter below so you
don’t miss out on any future training opportunities!

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy