Entities and Attributes
Learning Objectives
- You can decide which candidate subjects deserve their own entity and which are better represented as attributes.
- You can distinguish common attribute categories that matter for later relational design.
- You can recognize borderline cases where the right answer depends on what the system must remember.
After subject analysis, the next step is to decide which discovered ideas deserve their own identity and which should become properties of something else.
This is where the three-pass note from the previous chapters starts to settle into a cleaner shape. The candidate list includes entities, relationships, events, and roles. Not all of them belong in the same category in the final model.
What Makes Something an Entity?
An entity is usually something the system needs to identify, remember, and reason about separately.
In the course-platform domain, likely entities include:
- user,
- course,
- course instance,
- enrollment,
- exercise,
- submission.
Each of these is something the system refers to again later. A submission is created once and then read many times, compared to a deadline, scored, and possibly updated. Because the system has to refer to one specific submission repeatedly, it makes sense to give submissions their own identity.
In the recipe app from the previous chapter, the entities were user, recipe, ingredient, and preparation step (as a borderline case). In the library, they were member, book, copy, and loan.
What Makes Something an Attribute?
An attribute is a property of an entity.
Examples:
- a course has a
codeand atitle, - an exercise has a
titleand adeadline_at, - a submission has a
submitted_at, astatus, and possibly ascore.
A good practical test is this: If the thing needs its own identity, it may be an entity. If it mainly describes something else, it is often an attribute.
For example, deadline_at is naturally an attribute of an exercise. It does not need to be referred to independently; it only makes sense as “the deadline of this exercise.” By contrast, enrollment in the course-platform case deserves its own identity because the system may need to ask who is enrolled in which course instance, when the enrollment was created, and whether it is still active.
Different Kinds of Attributes
Not all attributes are equally simple. A few categories appear often enough to be worth naming.
Simple vs composite. A simple attribute holds one value, such as email. A composite attribute groups several related values that are used together, such as a street name and a postal code read together as an address. Composite attributes can be stored as several columns or as one combined value, depending on how the system uses them.
Single-valued vs multi-valued. A single-valued attribute holds one value per row, such as a course code. A multi-valued attribute would hold several values, such as a user’s several enrolled course instances. Multi-valued attributes almost always point to a relationship that deserves its own structure, rather than to a column full of commas.
Stored vs derived. A stored attribute is written directly into the table. A derived attribute is computed from other stored values. Age is often derived from date of birth; the number of submissions for an exercise is derived from the submissions table. Derived values usually do not need their own column.
Optional vs required. A required attribute must always have a value. An optional attribute may be missing. The score of a submission is optional until the submission has been graded.
These categories come back in the next part when the design becomes a SQL schema. A multi-valued attribute becomes a relationship. A required attribute becomes NOT NULL. A derived attribute either becomes a query or, in some cases, a cached column with extra care. Recognizing the category now makes the mapping later much smoother.
Multi-Valued Attributes Are a Warning Sign
If something looks multi-valued, it is often worth pausing.
For example, if one user can participate in many course instances, then “course_instance_ids” is probably not a single attribute stored as one comma-separated string in users. It likely points toward a separate relationship that will later need its own clearer structure in the schema.
The same pattern appeared in the recipe app: a recipe “has many ingredients” is not one attribute on recipe. It is a relationship between recipe and ingredient, and that relationship needs its own home.
Any attribute that reads as “a list of something else” is usually a relationship hiding inside a field. The right response is almost always to pull the relationship out into its own structure.
Contrastive Example: Submission Status as Attribute or Entity
Some cases genuinely depend on what the system must remember.
Consider “submission status” in the course-platform case:
- If it is only a simple label such as
submitted,graded, orlate, it can be an attribute ofsubmission. The table stores the current value, and that is all the system needs. - If the system needs to record every status change with a timestamp and an actor, then “status change” starts to look more like its own event. A submission might then have many status-change rows associated with it, each saying what the status changed to, when, and by whom.
Neither answer is wrong in the abstract. The right answer depends on what the system is supposed to support.
For each candidate attribute, ask:
- does the system need to remember this?
- will the application use it?
- do any of the expected queries depend on it?
If all three answers are no, the attribute probably does not belong in the model.
Check Your Understanding
- What makes something an entity rather than an attribute?
- Why is a candidate attribute that reads as “a list of something else” almost always a relationship?
- Why can two designers reasonably disagree about whether “teacher” should be an entity or a role?
AI Study Aid
Create a chapter diagram
We're looking into to what extent AI-generated diagrams could help with studying.
Use this study aid to generate an AI-generated visual summary of the material you just studied.
Each diagram style emphasizes the content in a different way, so you can choose the focus
that feels most useful.
Using the diagram generator is voluntary and does not affect your course progress. We encourage
you to try it out and see if it helps, but it's totally up to you! Your answers help us understand
how to make better study aids in the future.
Selected style: choose a diagram style
Design Exercise
Use this exercise to apply the chapter’s placement rules to a concrete mini-model. Decide which entity each attribute belongs to, and separate values that are usually derived rather than stored directly. If you choose a non-default placement, justify it using expected queries or application behavior.