DBMS Concepts Explained — Relational Model, Normalization & ACID
A Database Management System (DBMS) is software that stores, retrieves, and manages data in a structured way — providing concurrency, security, and integrity.
What You’ll Learn
In this tutorial, you’ll learn the relational model, normalization forms from 1NF to 5NF, ACID properties, transaction management, concurrency control, and how indexing speeds up queries.
Why It Matters
Every modern application relies on a database. Choosing the right schema design and understanding normalization prevents data anomalies, duplication, and corruption. Poor database design is the root cause of most performance and maintenance issues.
Real-World Use
When you book a flight, the airline’s DBMS ensures two people don’t get the same seat (concurrency control) and that your payment either fully completes or fully fails (atomicity). Doda Browser’s bookmark sync uses a relational database to store and query your bookmarks across devices.
graph LR
subgraph "Normalization Forms"
A[1NF
Atomic Values] --> B[2NF
No Partial Dependencies]
B --> C[3NF
No Transitive Dependencies]
C --> D[BCNF
Every determinant a key]
D --> E[4NF
No Multi-valued Deps]
E --> F[5NF
Join Dependency]
end
style A fill:#4f46e5,color:#fff
style B fill:#4f46e5,color:#fff
style C fill:#4f46e5,color:#fff
style D fill:#4f46e5,color:#fff
style E fill:#4f46e5,color:#fff
style F fill:#4f46e5,color:#fff
The Relational Model
The relational model organizes data into relations (tables) with tuples (rows) and attributes (columns). Each table has a primary key that uniquely identifies each row.
Keys
| Key Type | Definition |
|---|---|
| Candidate Key | Any column (or set) that could uniquely identify rows |
| Primary Key | The chosen candidate key |
| Foreign Key | A column referencing a primary key in another table |
| Composite Key | A primary key with multiple columns |
| Super Key | Any set of columns that uniquely identifies rows |
Normalization
Normalization eliminates data redundancy and anomalies through successive normal forms.
First Normal Form (1NF)
Each column must contain atomic (indivisible) values. No arrays or nested tables.
-- BAD (not 1NF): multiple values in one column
CREATE TABLE students_bad (
id INT PRIMARY KEY,
name VARCHAR(50),
courses VARCHAR(100) -- "Math,Physics,Chemistry"
);
-- GOOD (1NF): one value per column
CREATE TABLE students (
id INT PRIMARY KEY,
name VARCHAR(50)
);
CREATE TABLE enrollments (
student_id INT,
course VARCHAR(50),
PRIMARY KEY (student_id, course),
FOREIGN KEY (student_id) REFERENCES students(id)
);Second Normal Form (2NF)
Must be in 1NF AND every non-key column must depend on the entire primary key (no partial dependencies).
-- BAD (not 2NF): course_name depends only on course_id, not on student_id
CREATE TABLE grades_bad (
student_id INT,
course_id INT,
course_name VARCHAR(50), -- Partial dependency!
grade CHAR(2),
PRIMARY KEY (student_id, course_id)
);
-- GOOD (2NF): separate the partial dependency
CREATE TABLE courses (
course_id INT PRIMARY KEY,
course_name VARCHAR(50)
);
CREATE TABLE grades (
student_id INT,
course_id INT,
grade CHAR(2),
PRIMARY KEY (student_id, course_id),
FOREIGN KEY (course_id) REFERENCES courses(course_id)
);Third Normal Form (3NF)
Must be in 2NF AND no transitive dependencies (non-key column depends on another non-key column).
-- BAD (not 3NF): instructor_department depends on instructor, not course_id
CREATE TABLE courses_bad (
course_id INT PRIMARY KEY,
course_name VARCHAR(50),
instructor VARCHAR(50),
instructor_department VARCHAR(50) -- Transitive dependency!
);
-- GOOD (3NF)
CREATE TABLE instructors (
instructor_id INT PRIMARY KEY,
name VARCHAR(50),
department VARCHAR(50)
);
CREATE TABLE courses (
course_id INT PRIMARY KEY,
course_name VARCHAR(50),
instructor_id INT,
FOREIGN KEY (instructor_id) REFERENCES instructors(instructor_id)
);BCNF and Beyond
BCNF is a stricter version of 3NF where every determinant must be a candidate key. 4NF eliminates multi-valued dependencies. 5NF deals with join dependencies.
ACID Properties
ACID guarantees that database transactions are processed reliably:
| Property | Meaning | Analogy |
|---|---|---|
| Atomicity | All or nothing — a transaction completes fully or not at all | Bank transfer: money leaves one account AND arrives in another, or neither happens |
| Consistency | Only valid data is written — constraints and rules are enforced | Age must be positive; failing that constraint rolls back the transaction |
| Isolation | Concurrent transactions don’t interfere with each other | Two people booking the last seat — only one succeeds |
| Durability | Committed data persists even after a crash | After getting “payment successful,” the data survives a power outage |
Indexing
An index is a data structure (usually a B-tree or hash table) that speeds up lookups at the cost of slower writes.
-- Create an index on frequently queried columns
CREATE INDEX idx_orders_customer ON orders(customer_id);
-- Query that benefits from the index
SELECT * FROM orders WHERE customer_id = 42;
-- Without index: full table scan (O(n))
-- With B-tree index: O(log n) lookupCommon Mistakes
- Skipping normalization entirely: Leads to massive data duplication and update anomalies. One change might need updates in 100 places.
- Over-normalizing: Normalizing to 5NF for a simple app creates joins on every query. Performance suffers. 3NF is usually enough.
- Ignoring foreign keys: Without FK constraints, orphaned rows accumulate — data integrity degrades silently.
- Not indexing foreign key columns: JOINs on unindexed FK columns are extremely slow on large tables.
- Confusing ACID with CAP: ACID applies to single-node databases. CAP theorem applies to distributed systems.
Practice Questions
What does ACID stand for? Atomicity, Consistency, Isolation, Durability — guarantees for reliable transaction processing.
What is the difference between 2NF and 3NF? 2NF removes partial dependencies (non-key depends on part of composite key). 3NF removes transitive dependencies (non-key depends on another non-key).
Why would you denormalize a database? To improve read performance by reducing JOINs, at the cost of write performance and data redundancy.
What is a B-tree index? A balanced tree data structure that allows O(log n) search, insert, and delete operations — the default index type in most databases.
What is a dirty read? Reading data written by an uncommitted transaction. If that transaction rolls back, you’ve read invalid data.
Challenge
Design a normalized schema for an e-commerce platform with products, customers, orders, order items, and reviews. Normalize to 3NF. Then identify where you might denormalize for performance.
Real-World Task
Inspect a database schema you work with. Identify any tables that violate 2NF or 3NF. What anomalies could occur?
Mini Project: Normalization Tool
Write a Python script that takes a denormalized CSV (e.g., orders with customer and product data in one row) and splits it into normalized tables in 3NF.
Security angle: Normalization also improves security — sensitive data like passwords should be in a separate table with restricted access, not duplicated across the database.
What’s Next
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
What’s Next
Congratulations on completing this DBMS Concepts tutorial! Here’s where to go from here:
- Practice daily — Consistency is more important than long study sessions
- Build a project — Apply what you learned by building something real
- Explore related topics — Check out other tutorials in the same category
- Join the community — Discuss with other learners and share your progress
Remember: every expert was once a beginner. Keep coding!
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro