Design Patterns and Anti-Patterns
Learning Objectives
- You can recognize practical design choices that often help relational schemas.
- You can spot a few common anti-patterns before they become a problem.
- You can describe design decisions as trade-offs rather than as universal rules.
Database design quality is not only about formal rules. There are also practical patterns that tend to help and practical warning signs that often predict trouble.
This chapter collects a handful of each. These are not theorems. They are habits that experienced designers tend to apply and habits that experienced designers tend to avoid. Learning them early means that when you see the pattern later, you will know what it suggests.
Useful Patterns
A few simple patterns appear again and again.
Lookup Tables
If a value belongs to a small controlled set, a lookup table can make the design clearer and easier to validate.
For example, a platform may keep a controlled set of submission statuses such as submitted, graded, and late. A lookup table makes the allowed values explicit, lets queries join to readable descriptions, and prevents free-form drift where rows start containing values like Submitted, submit, or Graded (needs review).
A small table such as roles(id, name) is often clearer than allowing values like student, Student, and learner to appear inconsistently across many user rows.
The trade-off is that a lookup table adds one more relationship to the schema. For very small controlled sets, a CHECK constraint on the allowed values is sometimes simpler. The choice depends on how often the set will change and how rich the descriptions need to be.
Explicit Relationship Structures
Many-to-many relationships are usually easier to manage when made explicit as their own structure.
In a later relational schema, this often becomes a separate table such as enrollments(user_id, course_instance_id). Once the relationship has its own rows, it can carry attributes like joined_at, it can be queried directly, and it does not have to be simulated through strange column patterns on one of the sides.
Audit Fields
Fields such as created_at or updated_at can be useful when the application needs to track changes over time.
In the course-platform schema, submitted_at is also an audit-style field because it records when an event happened. In the library case, both the borrow date and the return date on a loan play a similar role.
Audit fields are usually cheap to add and hard to reconstruct later. If you suspect the system will eventually need to know when a row was created or last changed, adding a timestamp now is much easier than reconstructing the information from indirect sources later.
Stable Identifiers for External References
If other systems or documents will refer to an entity from outside, it helps to have an identifier that is suitable for that use.
For example, a course code can serve as an external identifier that appears on diplomas, course catalogs, and student records. An order number can serve as an external identifier that appears on receipts. A surrogate id is fine for internal relationships but is usually not what external systems or documents should reference directly.
This pattern often leads to a combination: a surrogate id for internal use and a UNIQUE natural value for external reference.
Common Anti-Patterns
A few warning signs are worth learning early.
Comma-Separated Values in One Column
A column such as enrolled_course_instance_ids = '3,5,8' is almost always a sign that the design is hiding a relationship instead of modeling it properly.
The problems with this pattern are not subtle:
- the database cannot enforce that the referenced course instances exist,
- searching for all users enrolled in course instance 5 requires string matching instead of a real query,
- adding or removing one item requires rewriting the whole string,
- two applications updating the string at the same time may overwrite each other.
The fix is almost always to pull the list out into a proper relationship, usually a linking table.
Repeating Groups
If the schema contains columns like exercise1_deadline, exercise2_deadline, exercise3_deadline, that often means the design is trying to squeeze a repeating structure into one row.
The pattern becomes visibly broken as soon as the fourth exercise appears. The design either has to add another column (and the application has to learn about it) or has to refuse to store the fourth exercise.
The fix is to make the repeating thing its own entity. Instead of columns on the parent, use rows in a child table with a foreign key back to the parent.
God Tables
A huge table that tries to store many unrelated concerns often leads to duplicated facts, awkward queries, and hard updates.
A table such as:
course_activity(user_name, user_email, course_code, course_title, term, exercise_title, deadline_at, submitted_at, score, role)
may feel convenient at first because one query can return everything about one row. But this table now stores user facts, course facts, exercise facts, and submission facts in a single row shape. If a course title changes, many rows have to be updated. If a user email changes, the same. If the system needs to add a course before any exercises exist, the row shape does not fit.
The Data Distribution and Redundancy chapter in the Part Data in Multiple Tables discussed these problems as update, insert, and delete anomalies. The god-table pattern is the main way those anomalies end up in a schema that could have avoided them.
A Column That Means Two Different Things
If one column is used to store two different kinds of value depending on a flag in another column, the design is probably carrying unrelated concerns in the same place. For example:
value TEXTthat holds either a text answer or a numeric score depending on the value oftype.
These designs work until a feature needs to filter or aggregate on the column, at which point the type-dependent meaning becomes a source of bugs.
The fix is usually to split the concerns into separate columns or separate entities.
Nullable Flags That Try to Mean “Deleted”
Sometimes a design adds a deleted flag or a nullable deleted_at column to avoid really removing rows. This is called soft deletion, and it can be a reasonable choice, but it is easy to do badly.
The main risk is that every query in the system has to remember to exclude deleted rows. Miss one query, and the deleted data silently reappears somewhere it should not.
Soft deletion is a pattern worth understanding but also worth using sparingly. When it is the right choice, the design should make the handling explicit throughout, not rely on every query to remember.
Design Smells as Signals
Whenever you observe a design smell, the useful habit is to pause and ask:
- is the value really atomic, or is it a hidden list?
- is there hidden repetition that should be its own entity?
- is a relationship being stored as text instead of modeled directly?
- does any column carry two different kinds of meaning?
- does any table try to store too many different concerns in one row?
Those questions prepare the ground for the more formal normalization ideas later in the course, and they catch most design problems before normalization would need to.
The redundancy problems outlined in the previous part showed why repeated facts make querying and updates fragile. The same intuition now appears as an early design habit: if the same fact seems to be copied into many places, the model probably deserves another look.
Before-and-After Example
A complete before-and-after to keep in mind:
Before:
users(id, name, email, enrolled_course_instance_ids, submitted_exercise_1, submitted_exercise_2, submitted_exercise_3)
After:
users(id, name, email),course_instances(id, course_id, term, year),enrollments(user_id, course_instance_id, joined_at),exercises(id, course_instance_id, title, deadline_at),submissions(id, user_id, exercise_id, submitted_at, score, status).
The before version has two anti-patterns at once: a comma-separated list of references and a repeated group of submission columns. The after version replaces each anti-pattern with a normal relationship structure.
Anti-patterns are not crimes. Each one might be the right choice in some specific context where the trade-offs work out. But when they appear in a design that does not clearly justify them, they are usually a sign that the design has not been revisited.
Check Your Understanding
- What is one sign that a conceptual design is repeating the same fact too many times?
- Why is a comma-separated list in a single column usually a design problem?
- Why is it useful to describe anti-patterns as signals rather than as forbidden forms?
Design Exercise
Use the following as a focused repair task: start from the flawed fragment, identify the anti-patterns it contains, and refactor the structure so relationships and repeating facts are modeled explicitly.
As you revise, explain each change as a trade-off decision rather than a rule: what problem the original shape causes, and why your new structure makes updates, querying, and validation more reliable.