Complex Joins
Learning Objectives
- You can extend joins to larger query patterns involving several tables.
- You can explain how row meaning and join type still matter in longer queries.
- You can debug a suspicious join result by narrowing the query one step at a time.
Once a two-table join feels manageable, the next step is to grow the query, following one relationship at a time and keeping the row meaning in mind as the query grows.
Start with the Path and Build the Query
Suppose the question is: which users submitted exercises in the spring 2026 instance of CS-A1150?
Before writing SQL, describe the path in words. Start from the course, move to the correct course instance, move to its exercises, move to submissions, and then move to users.
That gives the path:
courses -> course_instances -> exercises -> submissions -> users
This habit matters even more in long joins than in short ones. When the path is clear, the later JOIN clauses have a much better chance of being correct.
The SQL might look like this:
SELECT
c.code,
ci.term,
ci.year,
e.title AS exercise_title,
u.name
FROM courses AS c
INNER JOIN course_instances AS ci ON ci.course_id = c.id
INNER JOIN exercises AS e ON e.course_instance_id = ci.id
INNER JOIN submissions AS s ON s.exercise_id = e.id
INNER JOIN users AS u ON u.id = s.user_id
WHERE c.code = 'CS-A1150'
AND ci.term = 'spring'
AND ci.year = 2026;
Even though the query is longer, the logic is still the same as before: follow one relationship at a time.
A possible result might look like this:
| code | term | year | exercise_title | name |
|---|---|---|---|---|
| CS-A1150 | spring | 2026 | SQL Basics | Aino |
| CS-A1150 | spring | 2026 | SQL Basics | Leevi |
| CS-A1150 | spring | 2026 | Filtering Queries | Aino |
Each row represents one matched path through the query: one course instance, one exercise, one submission, and one user.
At this point, it is useful to practice that same idea directly: follow the path, add one join at a time, and keep the row meaning in mind as the query grows.
Repeated Values
Longer joins often produce repeated values.
That can feel suspicious at first, but the important question is not simply whether some value appears more than once. The more useful question is what one result row represents now.
If several users submitted the same exercise, then a query that shows both exercises and users will naturally include several rows for that exercise. The repeated-looking exercise title may be completely correct.
So the interpretation rule is simple: repeated-looking values are only a bug if the row meaning says they should not be there.
A Longer Path with a Left Join
Longer join paths do not have to use INNER JOIN all the way through. The join type should still come from the question.
Suppose the question is: show all exercises in the spring 2026 instance of CS-A1150, together with submissions if they exist.
That question still starts with a long path:
courses -> course_instances -> exercises
The last step is different, however. The question asks for all exercises, even if some of them have no submissions, so the final step should use a LEFT JOIN:
SELECT
e.title,
s.submitted_at,
s.score
FROM courses AS c
INNER JOIN course_instances AS ci ON ci.course_id = c.id
INNER JOIN exercises AS e ON e.course_instance_id = ci.id
LEFT JOIN submissions AS s ON s.exercise_id = e.id
WHERE c.code = 'CS-A1150'
AND ci.term = 'spring'
AND ci.year = 2026;
Longer query paths are still built one relationship at a time. The path tells us which tables belong in the query, and the wording of the question tells us whether each step should require a match or keep unmatched rows visible.
A Small Self-Join
Sometimes a table must be joined to itself.
For example, imagine that courses stores a prerequisite through prerequisite_course_id, which points back to courses.id.
Then one query can read the table in two roles:
SELECT
c.code,
p.code AS prerequisite_code
FROM courses AS c
INNER JOIN courses AS p ON c.prerequisite_course_id = p.id;
The main idea is simple: a join is about a relationship path, not about whether the two roles happen to come from tables with different names.
Aliases are important here because the query needs one name for the course itself and another name for the prerequisite course. In other words, the aliases make it possible to say clearly which role each copy of the table is playing.
Debugging Suspicious Results
Longer joins also make more room for mistakes. Before worrying about syntax, a useful first question is the one from earlier in the chapter: what does one result row represent?
If the answer is clear but the rows on screen do not match it, a short debugging habit helps.
A useful order is to simplify back toward something known to work:
- keep only the first one or two joins,
- run the shortened query and inspect the rows,
- add one join back at a time,
- at each step, check whether the row count or row shape changes in a way you did not expect.
Different symptoms usually point to different causes:
- Far too many rows: often a missing or wrong
ONcondition, which produces something close to a Cartesian product. - Rows that should not appear together: usually a wrong
ONcondition that connects tables by the wrong columns. - Missing rows that should be visible: often an
INNER JOINwhere aLEFT JOINwas needed, or aWHEREcondition that accidentally filters out rows withNULLon one side. - Repeated rows that look like duplicates: often correct and explained by the row meaning, but worth confirming against the schema and the data.
The general pattern is the same as in the debugging chapter of Part 2: start small, compare expectation with result, and add complexity back one step at a time. Joins only make that habit more important.
Check Your Understanding
- Why is it useful to describe the join path before writing a long query?
- Why are repeated-looking rows sometimes correct in a multi-table join?
- Why might the last step in a longer query use
LEFT JOINinstead ofINNER JOIN?
SQL Programming Exercise
The following SQL sequence practices longer relationship paths, one longer query that ends with LEFT JOIN, and one self-join.