Summaries and Derived Data

From Design Questions to Summary Questions

Learning Objectives

You can distinguish row questions from summary questions in a list of expected application queries.
You can recognize when a question requires aggregation, grouping, or both.
You can read the existing project schema as the starting point for richer summary work.

The Database Design and Modeling part ended with a design package that included a list of likely application queries, while Schemas and Constraints turned the design into real tables. This chapter is the bridge: the questions on that list are the same questions that we will learn to answer here.

Two Kinds of Questions

Some questions are about specific rows. They ask for one record or a small list of records that match a condition.

which exercises belong to a given course instance?
what is the most recent submission from a given student for a given exercise?
which submissions still need grading?

Earlier parts of the course already taught how to answer these. They are row questions. The result is a list of rows, each representing one stored record.

Other questions are about patterns across many rows. They ask for counts, totals, averages, extremes, or comparisons.

how many submissions has each exercise received?
what is the average score for a given exercise?
which courses have the most enrolled students?
how many submissions arrived per day during the last week?

These are summary questions. The result is no longer a list of stored records. It is a list of summaries, each representing many rows reduced to one number or one row.

The two kinds of questions need different SQL. Row questions are answered with the SELECT, WHERE, and JOIN patterns from the parts on Working with Single-Table Data and Data in Multiple Tables. Summary questions need the aggregate functions and GROUP BY clause that this part introduces.

Loading Exercise...

Going Back to the Design Lists

After you completed the Database Design and Modeling part, your subject-analysis note for the study tracker included questions tagged as “likely application queries.” Some of those questions were row questions, and the project may already answer them. Others were summary questions.

For instance, with the reviews table in place, the study tracker can support summary questions like:

how many reviews has each card received?
which cards have not been reviewed yet?
what is the average rating per deck?
how many reviews arrived per day during the last week?
which decks have the most review activity?

The same is true for the worked examples. The course-platform note from that part listed questions like “average score for a given exercise” and “students enrolled in a course instance but who have submitted nothing.” Those are summary questions. The recipe-app note implied questions like “which ingredients are used in the most recipes.” Same shape. The library note implied “how many active loans does each member have.” Same again.

A useful exercise before going further is to revisit your own design question list and mark each item as row or summary. The summary items are the ones this part should let you finally implement.

Loading Exercise...

Check Your Understanding

What is the difference between a row question and a summary question?
Why is it useful to revisit the design question list at the start of this part?
What are some examples of summary questions that the study tracker can support with the reviews table in place?

← Overview

Aggregate Functions →