Transactions, Normalization, and Performance

Looking Forward


By this point in the course, you have formed an understanding of what databases and database management systems are, and you have learned to read data from a database with SQL. You have designed a schema from a case description, written the DDL to create its tables, written queries that summarize across many rows, evolved the schema safely through migrations, made writes correct under concurrent access, normalized the design to remove redundancy, and added indexes that keep things fast. The project at the end of the previous chapter is a working application that demonstrates all of those things together. That is real working knowledge of relational databases.

This course was deliberately focused. It covered the core of relational database work — design, implementation, querying, and maintenance — without trying to be comprehensive. The goal was to build a solid foundation in the essentials, not to cover every topic that touches databases. Naming what was left out is part of leaving the course honestly.

The first omission is production operations. The course made the project reproducible from a clean state, but a database in production has to survive failures, scale under real load, and be restorable when something goes wrong. Backup and restore (physical backups, logical dumps, point-in-time recovery) is its own discipline, as are replication for high availability, monitoring of query latency and disk space, and connection pooling.

The second is non-relational data models. The course was about relational databases on purpose, because the relational model is a good fit for learning how to design databases, write queries, and maintain a schema. For many applications, a relational schema is the right default. But document stores, key-value caches, graph databases, time-series databases, and search engines each fit some problems better than a relational schema would. Real systems often combine several database types. Recognizing when another model fits is part of mature database work, and the relational fluency this course built transfers directly when you work with one.

The third is scale. The performance chapter addressed the scale of tiny-to-small projects: thousands to hundreds of thousands of rows. At millions or billions of rows, new concerns appear — partitioning, sharding, read replicas, more sophisticated index strategies. Such topics are important, but they are also out of scope for a course that focuses on the fundamentals. The best way to learn about scale is to build something that grows into it, and to read about how others have done that.

A few habits tend to keep database knowledge growing. Reading the documentation of the database system you are using, slowly, over months, covers a surprising amount; PostgreSQL’s documentation is very well written. Building a new project that pushes against your current ceiling teaches which abstractions actually break first. Reading other people’s schemas shows which conventions have settled into practice, and open-source projects with database-backed components are an underrated resource for this. Reading incident post-mortems shows what really goes wrong at scale, which is often different from what textbooks emphasize.

Beyond that, database work has many depths. Some readers will go deeper into operational engineering, others into building applications that use databases, others into analytics, others into specialized systems or research. You don’t have to choose now. The disciplines this course built (design carefully, implement faithfully, verify routinely) transfer across all of them. Pick the next thing that interests you and start. The Web Software Development course, for example, might be a good next step.

Thank you for reading this far — the rest is practice.