PostgreSQL

Common Table Expressions in PostgreSQL

This article is part of our Academy Course titled PostgreSQL Database Tutorial.

In this course, we provide a compilation of PostgreSQL tutorials that will help you set up and run your own database management system. We cover a wide range of topics, from installation and configuration, to custom commands and datatypes. With our straightforward tutorials, you will be able to get your own projects up and running in minimum time. Check it out here!

In our previous article we explained how to use JOINs to create more advanced SELECT queries. However, there are instances when using this technique to retrieve data from two or more tables does not satisfy our requirements or makes the query difficult to read – for example, if we need several JOINs or a subquery to return the desired information.

To solve this, standard SQL (note that this is not something exclusive to PostgreSQL), introduced the concept of Common Table Expressions (best known as CTE for short) in order to simplify this type of queries. In this article we will explain what CTEs are and how to use them.

1.Definition of Common Table Expressions (CTE)

Formally speaking, a CTE is a temporary result set that is created through the use of a WITH clause and is valid only during the execution of a given query. Another distinguishing feature of a CTE is that it can either reference itself (recursive CTE) or not (non-recursive CTE), providing the flexibility that common queries do not provide. A recursive CTE is often used when a calculation needs to be reported as part of the final result set, whereas a non-recursive one is usually utilized for a regular query. Additionally, its definition -meaning the fields it returns- is not stored as a separate database object.

Although Common Table Expressions can be used in SELECT, INSERT, UPDATE, or DELETE operations, we will only use the first type as it is the easiest to understand. Once you feel comfortable with using CTEs that involve SELECTs only, refer to the official PostgreSQL 9.5 documentation to learn how to use them with the other operation types.

All of these new concepts will better sink in as we illustrate them through examples, so let’s begin.

2.Non-recursive Common Table Expressions

As usual, we will use the World_db database we installed in the first article of this series. To begin, let’s consider the following query:

SELECT A.name "City", A.district "District",
B.name "Country", C.language "Language"
FROM city A JOIN country B ON A.countrycode=B.code
JOIN countrylanguage C ON A.countrycode=C.countrycode
WHERE A.name='Rosario' AND C.isofficial='TRUE';

As you can probably guess by now, it will return the city name, the district, the country, and the official language where the city name is Rosario. If you look carefully, this query uses 2 JOINs – not a bad thing in itself, but the readability certainly could use some improvements.

Our first example of a Common Table Expression will be rather basic but does the job of introducing the concept:

WITH t AS (
SELECT A.name City, A.district District,
A.countrycode CountryCode, B.name Country
FROM city A JOIN country B ON A.countrycode=B.code)
SELECT t.City, t.District, t.Country, C.language
FROM t JOIN countrylanguage C on t.CountryCode = C.countrycode
WHERE t.City='Rosario' AND C.isofficial='TRUE';

Before we go into PostgreSQL and run the above query, let’s split it into two parts to explain what is happening.

Step 1 – Define the CTE using the WITH clause. For simplicity, we will name the CTE as t, but you can use other name if you want.

WITH t AS (
SELECT A.name City, A.district District,
A.countrycode CountryCode, B.name Country
FROM city A JOIN country B ON A.countrycode=B.code)

If we were to do a SELECT * FROM t; at this point, we would get all the cities with their corresponding district and country. You may well be saying to yourself, “Then I don’t see what’s the point in using CTEs” – but wait, Step 2 will shed some light on the why.

Step 2 – Select the fields from the CTE and perform a JOIN with another table. As the CTE can be considered a temporary result set, we can perform JOINs on other tables. However, in this case we can use the more descriptive names given by the CTE instead of the original table names (are you seeing the readability improvements already?). Since both the city and country tables contain a field called name, the CTE allows us to refer to the city and country names as City and Country instead.

SELECT t.City, t.District, t.Country, C.language
FROM t JOIN countrylanguage C on t.CountryCode = C.countrycode
WHERE t.City='Rosario' AND C.isofficial='TRUE';

As you can see in Fig. 1, the result is identical to the original query:

A non-recursive Common Table Expression
Figure 1 – A non-recursive Common Table Expression

Recursive Common Table Expressions

A recursive CTE references itself usually via a WITH clause referring to its own output. To better illustrate through an example, we are going to create a new database and table named College and CollegeClasses, respectively, and populate the former with dummy data as follows:

CREATE TABLE CollegeClasses (
ClassID serial PRIMARY KEY,
ClassDescription VARCHAR NOT NULL,
ClassParentID INT
);

INSERT INTO CollegeClasses (
ClassDescription,
ClassParentID
)
VALUES
('Calculus 1', NULL),
('Algebra 1', 1),
('Analytic Geometry', 1),
('Physics 1', 1),
('Statistics', 1),
('Algebra 2', 2),
('Discrete Math', 2),
('Programming 1', 2),
('Programming 2', 2),
('Advanced Geometry', 3),
('Control systems', 3),
('English as a Second Language 1', 3),
('Literature', 3),
('Physics 2', 4),
('Calculus 2', 4),
('Graphs and Math', 7),
('English as a Second Language 2', 7),
('Basic algorithms', 8),
('Advanced algorithms', 8),
('Programming with C', 8);

In this case we’re interested in retrieving a list of classes and their children down to a given level. For example, we will start with Algebra 1 (ClassID=2) and descend down to the last class that depends on it:

WITH RECURSIVE classes AS (
SELECT
ClassID,
ClassParentID,
ClassDescription
FROM
CollegeClasses
WHERE
ClassID = 2
UNION
SELECT
e.ClassID,
e.ClassParentID,
e.ClassDescription
FROM
CollegeClasses e
INNER JOIN classes s ON s.ClassID = e.ClassParentID
) SELECT * FROM classes;

This query, as in the previous section, deserves a detailed explanation. Let’s begin by saying a recursive CTE consists of 4 components:

#1 – A non-recursive query. In this case, it is a query to retrieve the CollegeClass information where ClassID=2:

SELECT ClassID, ClassParentID, ClassDescription
FROM CollegeClasses WHERE ClassID = 2

#2 – The UNION or UNION ALL operator. Any of these operators allows us to combine one or more result sets into a single one. The choice of one above the other will depend on whether you want to avoid duplicates (if any) or return them, respectively.

#3 – The recursive term. Note that the classes temporary table references itself in this part of the CTE:

SELECT e.ClassID, e.ClassParentID, e.ClassDescription
FROM CollegeClasses e INNER JOIN classes s ON s.ClassID = e.ClassParentID

#4 – The final statement, which is executed once the iterations in Part 3 have finished. In this case,

SELECT * FROM classes;

That said, let’s take a look at the result of the query (see Fig. 2) and examine it to see if it meets our expectations:

A recursive CTE
Figure 2 – A recursive CTE

As we can see, the result set begins with ClassID=2, and shows all its children (ClassID=6, 7, 8, and 9). Then it shows all the children of this last set as well.

You will want to use a recursive CTE if you need to retrieve information in the form of a hierarchical tree. It is precisely the keyword RECURSIVE at the top of the query which allows the CTE to reference itself.

3.Summary

In this article we have explained how to create recursive and non-recursive Common Table Expressions in PostgreSQL. As you pursue the study of this topic, keep in mind that using CTEs is not a matter of improving performance, but readability and maintainability.

Hope it helps!

Gabriel Canepa

Gabriel Canepa is a Linux Foundation Certified System Administrator (LFCS-1500-0576-0100) and web developer from Villa Mercedes, San Luis, Argentina. He works for a worldwide leading consumer product company and takes great pleasure in using FOSS tools to increase productivity in all areas of his daily work. When he's not typing commands or writing code or articles, he enjoys telling bedtime stories with his wife to his two little daughters and playing with them, the great pleasure of his life.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Back to top button