The Database Design Process
Learning Objectives
- You can describe the main stages of database design at a high level.
- You can explain why requirements and expected questions should shape the design before any SQL is written.
- You can recognize when a design process has skipped ahead too quickly.
Earlier parts often began from a schema that already existed. In this part, we change the perspective. Now the question is not “how do we query these tables?” but “what tables should exist in the first place, and why?”
A database is not only a place to store facts. It is also a structure that should support future questions and future actions. Good design starts from the domain and its expected use, not from table names that sound familiar.
One of the most useful questions in database design is this: what kinds of things does this system need to remember so that it can answer the needed questions later? That question connects design to real use. It keeps the model focused on facts the system must store, instead of drifting into diagrams that look impressive but do not support any concrete feature.
Start from Questions, Not from CREATE TABLE
Suppose we are building a course platform. A weak starting point would be to immediately write tables such as courses, students, and submissions simply because those names sound familiar.
A better starting point is to ask what the system needs to remember and what the system must later answer. For example:
- Which exercises belong to one course instance?
- Which students are enrolled in that instance?
- Which submissions still need grading?
- Can one student submit several times for the same exercise?
Those questions already point toward structure. They suggest that the design must make courses, course instances, exercises, users, enrollments, and submissions visible in some way. In other words, the questions start shaping the model before any SQL exists.
A useful habit is to list the questions the system must answer before listing the tables. If a question cannot be answered from the draft design, the design is incomplete, no matter how neat the design looks.
Conceptual, Logical, and Physical Design
It is common to describe database design at three levels.
Conceptual design decides what the domain is really about. It names the main entities, their attributes, and the relationships between them. At this level, there is no SQL.
Logical design turns the conceptual model into a relational schema. Tables, primary keys, foreign keys, and constraints appear here. This is the level that the next part of the course focuses on.
Physical design deals with implementation details such as indexes, storage choices, and performance tuning. This level appears briefly in later parts of the course.
This part focuses mostly on conceptual design. The choices you make here show up later as tables and constraints, but the quality of those tables depends on whether the conceptual work was solid.
A weak conceptual design leads to tables that are hard to query, hard to update, and hard to summarize. It also leads to a physical design that is inefficient and difficult to maintain. On the other hand, a strong conceptual design leads to tables that are easy to query, easy to update, and easy to summarize. It also leads to a physical design that is efficient and easy to maintain.
A Practical Design Loop
When practising database design, we typically focus on case or domain descriptions. These are narratives that describe the domain and its expected use. They often include example questions that the system should be able to answer.
A useful workflow is:
- read the domain description and identify the main user actions,
- list the questions the system should be able to answer,
- sketch the main subjects, attributes, and relationships,
- test the sketch against the example questions,
- refine the sketch where a question cannot be answered,
- only then move toward relational schema design and SQL.
A first sketch often reveals ambiguities that send the designer back to the case description or to the question list. Start rough, notice ambiguities, refine the design, test it against the example questions, and then improve it again. The process moves forward, but it rarely does so in one perfectly clean straight line.
The next chapter introduces a concrete reading method, called subject analysis, that supports step 1 of this loop. The chapters after that continue down the loop: entities and attributes (step 3), relationships (step 3), ER diagrams (step 3), testing against questions (step 4), and identifiers and patterns as ways to refine the sketch (step 5).
Check Your Understanding
- What is the difference between conceptual, logical, and physical design?
- Why is it risky to jump straight to table definitions too early?
- Why is it helpful to start from the questions the system must answer?
AI Study Aid
Create a chapter diagram
We're looking into to what extent AI-generated diagrams could help with studying.
Use this study aid to generate an AI-generated visual summary of the material you just studied.
Each diagram style emphasizes the content in a different way, so you can choose the focus
that feels most useful.
Using the diagram generator is voluntary and does not affect your course progress. We encourage
you to try it out and see if it helps, but it's totally up to you! Your answers help us understand
how to make better study aids in the future.
Selected style: choose a diagram style