SQL Primary Keys and Foreign Keys for Table Relationships
Every real-world database contains multiple tables that are connected to each other. Customers place orders. Orders contain products. Products belong to categories. These connections are not accidental. They are carefully designed relationships that reflect how data naturally relates in the real world.
Two concepts make these relationships work: primary keys and foreign keys. A primary key uniquely identifies each row in a table. A foreign key creates a link from one table to another by referencing that primary key. Together, they form the backbone of the relational in relational databases.
Understanding these concepts is not optional. Every JOIN you write depends on them. Every database you design requires them. Every data integrity problem you debug traces back to them. This guide covers primary keys, foreign keys, and the three types of relationships they create (one-to-one, one-to-many, many-to-many), all illustrated through the ShopSmart database you have been working with (we defined it in a previous guide here, and then we extended it here).
Why Tables Need Relationships
Imagine storing all your e-commerce data in a single table:
| order_id | customer_name | customer_email | customer_city | product_name | product_price | category_name | quantity | order_date |
|---|---|---|---|---|---|---|---|---|
| 1 | Alice Johnson | alice@email.com | New York | Wireless Mouse | 29.99 | Electronics | 2 | 2024-01-10 |
| 1 | Alice Johnson | alice@email.com | New York | USB-C Hub | 45.00 | Electronics | 1 | 2024-01-10 |
| 2 | Bob Martinez | bob@email.com | Los Angeles | Mechanical Keyboard | 89.99 | Electronics | 1 | 2024-01-15 |
| 3 | Alice Johnson | alice@email.com | New York | SQL for Beginners | 34.99 | Books | 1 | 2024-02-20 |
This approach has severe problems:
Data duplication. Alice's name, email, and city are repeated on every row involving her. If she has 50 orders with 3 items each, her information appears 150 times.
Update anomalies. If Alice moves to Boston, you need to update her city in every single row. Miss one, and your data is inconsistent.
Deletion anomalies. If you delete Alice's only order for "SQL for Beginners," you lose the fact that the product exists entirely.
Insertion anomalies. You cannot add a new product to your catalog without creating a fake order for it.
The solution is normalization: splitting data into separate tables and connecting them through relationships.
-- Instead of one massive table, we use separate tables:
-- customers (customer data lives here once)
-- products (product data lives here once)
-- categories (category data lives here once)
-- orders (order data, references customers)
-- order_items (line items, references orders and products)
Alice's information is stored once in the customers table. Products are stored once in the products table. Orders reference customers by ID. Order items reference both orders and products by their IDs. No duplication. No anomalies.
Relational databases store each piece of information exactly once and use references (keys) to connect related data across tables. This eliminates redundancy and keeps your data consistent.
Primary Keys Explained
A primary key (PK) is a column (or set of columns) that uniquely identifies every row in a table. No two rows can share the same primary key value, and the value can never be NULL.
Why Primary Keys Matter
Without a primary key, you cannot reliably distinguish one row from another:
| first_name | last_name | city |
|---|---|---|
| John | Smith | New York |
| John | Smith | Chicago |
| John | Smith | New York |
Which John Smith in New York do you want to update? There is no way to tell. Add a primary key, and every row becomes uniquely addressable:
| id | first_name | last_name | city |
|---|---|---|---|
| 1 | John | Smith | New York |
| 2 | John | Smith | Chicago |
| 3 | John | Smith | New York |
Now id = 1 and id = 3 are unambiguously different rows, even though they have the same name and city.
Primary Key Rules
Every primary key must satisfy these requirements:
- Unique: No two rows can have the same primary key value
- Not NULL: Every row must have a primary key value
- Immutable (best practice): Once assigned, the value should never change
- Single per table: Each table can have only one primary key
Creating Primary Keys
-- Integer primary key with auto-increment (most common pattern)
-- PostgreSQL
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL
);
-- MySQL
CREATE TABLE customers (
id INT AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL
);
-- SQLite
CREATE TABLE customers (
id INTEGER PRIMARY KEY AUTOINCREMENT,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL
);
With auto-increment, the database assigns the next available number automatically when you insert a row:
INSERT INTO customers (first_name, last_name, email)
VALUES ('Alice', 'Johnson', 'alice@email.com');
-- id = 1 (assigned automatically)
INSERT INTO customers (first_name, last_name, email)
VALUES ('Bob', 'Martinez', 'bob@email.com');
-- id = 2 (assigned automatically)
The Primary Key the Database Enforces
If you try to violate primary key rules, the database stops you:
-- Attempt to insert a duplicate primary key
INSERT INTO customers (id, first_name, last_name, email)
VALUES (1, 'Charlie', 'Brown', 'charlie@email.com');
-- ERROR: duplicate key value violates unique constraint "customers_pkey"
-- Attempt to insert a NULL primary key
INSERT INTO customers (id, first_name, last_name, email)
VALUES (NULL, 'Charlie', 'Brown', 'charlie@email.com');
-- ERROR: null value in column "id" violates not-null constraint
These constraints are enforced automatically by the database engine. You do not need to write any code to validate them.
Natural Keys vs Surrogate Keys
There are two approaches to choosing what serves as a primary key:
Natural key: A value that has real-world meaning and is naturally unique.
CREATE TABLE countries (
country_code CHAR(2) PRIMARY KEY, -- 'US', 'UK', 'DE'
country_name VARCHAR(100) NOT NULL
);
Surrogate key: An artificial value with no real-world meaning, typically an auto-incrementing integer.
CREATE TABLE countries (
id INTEGER PRIMARY KEY,
country_code CHAR(2) UNIQUE NOT NULL,
country_name VARCHAR(100) NOT NULL
);
For most tables, surrogate keys are the safer choice. Natural keys can change over time (an email address might be updated, a country code might be reassigned), and any change to a primary key cascades to every table that references it. Surrogate keys remain stable forever.
You can still enforce uniqueness on natural identifiers using the UNIQUE constraint while keeping a stable surrogate key as the primary key.
Foreign Keys Explained
A foreign key (FK) is a column in one table that references the primary key of another table. It creates a link between the two tables, establishing a relationship.
How Foreign Keys Work
In the ShopSmart database, each order belongs to a customer. The orders table has a customer_id column that references the id column in the customers table:
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER NOT NULL,
order_date DATE NOT NULL,
total_amount DECIMAL(10, 2),
status VARCHAR(20) DEFAULT 'pending',
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
The line FOREIGN KEY (customer_id) REFERENCES customers(id) tells the database: "The value in customer_id must match an existing id in the customers table."
Visually:
customers table orders table
┌────┬────────────┐ ┌────┬─────────────┬────────────┐
│ id │ first_name │ │ id │ customer_id │ order_date │
├────┼────────────┤ ├────┼─────────────┼────────────┤
│ 1 │ Alice │◄─────────│ 1 │ 1 │ 2024-01-10 │
│ 2 │ Bob │◄────┐ │ 2 │ 2 │ 2024-01-15 │
│ 3 │ Carol │ └────│ 3 │ 2 │ 2024-02-20 │
└────┴────────────┘ │ 4 │ 1 │ 2024-03-05 │
└────┴─────────────┴────────────┘
Order 1 and Order 4 both reference customer 1 (Alice). Order 2 and Order 3 both reference customer 2 (Bob). The customer_id foreign key creates the link.
What Foreign Keys Enforce
Foreign keys enforce referential integrity, meaning you cannot create references to data that does not exist:
-- Attempt to create an order for a non-existent customer
INSERT INTO orders (id, customer_id, order_date, total_amount, status)
VALUES (100, 999, '2024-05-01', 50.00, 'pending');
-- ERROR: insert or update on table "orders" violates foreign key constraint
-- Key (customer_id)=(999) is not present in table "customers"
The database rejects this insert because there is no customer with id = 999. Without this constraint, you could end up with "orphaned" orders that reference non-existent customers, making your data inconsistent.
Foreign keys also prevent you from deleting referenced data:
-- Attempt to delete a customer who has orders
DELETE FROM customers WHERE id = 1;
-- ERROR: update or delete on table "customers" violates foreign key constraint
-- Key (id)=(1) is still referenced from table "orders"
You cannot delete Alice while her orders still exist. This prevents orphaned records.
Technically, you can create tables without foreign key constraints. The database will not stop you. But without them, nothing prevents invalid references, and your data integrity depends entirely on your application code being perfect.
-- Without foreign key constraint: Database allows invalid data
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER NOT NULL, -- No foreign key constraint
order_date DATE
);
INSERT INTO orders (id, customer_id, order_date)
VALUES (1, 9999, '2024-01-01');
-- No error! But customer 9999 does not exist.
-- Your data is now inconsistent.
-- With foreign key constraint: Database catches the error
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER NOT NULL,
order_date DATE,
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
INSERT INTO orders (id, customer_id, order_date)
VALUES (1, 9999, '2024-01-01');
-- ERROR: foreign key constraint violation
Always define foreign key constraints. Let the database protect your data.
ON DELETE and ON UPDATE Behavior
When defining a foreign key, you can specify what happens when the referenced row is deleted or updated:
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER NOT NULL,
order_date DATE,
-- Different ON DELETE options:
FOREIGN KEY (customer_id) REFERENCES customers(id)
ON DELETE CASCADE -- Delete all related orders when customer is deleted
ON UPDATE CASCADE -- Update customer_id in orders if customer id changes
);
| Option | Behavior |
|---|---|
RESTRICT (default) | Prevents deletion/update if references exist |
CASCADE | Automatically deletes/updates referencing rows |
SET NULL | Sets the foreign key column to NULL |
SET DEFAULT | Sets the foreign key column to its default value |
NO ACTION | Similar to RESTRICT (checked at end of transaction) |
-- Example: If a category is deleted, set products' category_id to NULL
CREATE TABLE products (
id INTEGER PRIMARY KEY,
name VARCHAR(100) NOT NULL,
category_id INTEGER,
FOREIGN KEY (category_id) REFERENCES categories(id)
ON DELETE SET NULL
);
- Use
RESTRICT(the default) for most relationships. It forces you to explicitly handle dependencies before deleting. - Use
CASCADEwhen child records have no meaning without the parent (e.g., delete an order and all its order items). - Use
SET NULLwhen the child can exist independently (e.g., if a category is removed, the product still exists but loses its category).
The Three Types of Relationships
Primary keys and foreign keys create three fundamental types of relationships between tables.
One-to-Many (1:N)
The most common relationship. One row in the parent table relates to many rows in the child table.
Example: One customer can place many orders, but each order belongs to one customer.
customers (1) ──────── (N) orders
│ │
│ id = 1 (Alice) │ customer_id = 1, order_id = 1
│ │ customer_id = 1, order_id = 3
│ │ customer_id = 1, order_id = 9
│ │
│ id = 2 (Bob) │ customer_id = 2, order_id = 2
│ │ customer_id = 2, order_id = 7
In SQL, one-to-many relationships are created by placing a foreign key in the "many" table that references the "one" table:
-- The "one" side: customers
CREATE TABLE customers (
id INTEGER PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL
);
-- The "many" side: orders (has the foreign key)
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER NOT NULL,
order_date DATE NOT NULL,
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
Querying a one-to-many relationship:
-- Find all orders for a specific customer
SELECT o.id AS order_id, o.order_date, o.total_amount
FROM orders o
WHERE o.customer_id = 1;
Output:
| order_id | order_date | total_amount |
|---|---|---|
| 1 | 2024-01-10 | 119.98 |
| 3 | 2024-02-20 | 77.49 |
-- Count orders per customer
SELECT c.first_name || ' ' || c.last_name AS customer,
COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.first_name, c.last_name
ORDER BY order_count DESC;
Output:
| customer | order_count |
|---|---|
| Alice Johnson | 2 |
| Bob Martinez | 2 |
| Carol Singh | 1 |
| David Chen | 1 |
| Eva Brown | 1 |
| Frank Wilson | 1 |
| Grace Taylor | 0 |
More one-to-many examples in ShopSmart:
| One Side | Many Side | Foreign Key |
|---|---|---|
categories (1 category) | products (many products) | products.category_id → categories.id |
orders (1 order) | order_items (many items) | order_items.order_id → orders.id |
products (1 product) | reviews (many reviews) | reviews.product_id → products.id |
customers (1 customer) | reviews (many reviews) | reviews.customer_id → customers.id |
One-to-One (1:1)
A one-to-one relationship means one row in table A relates to exactly one row in table B. This is less common but useful for splitting a table for organizational, security, or performance reasons.
Example: Each customer has exactly one profile with extended details.
CREATE TABLE customers (
id INTEGER PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL
);
CREATE TABLE customer_profiles (
id INTEGER PRIMARY KEY,
customer_id INTEGER UNIQUE NOT NULL, -- UNIQUE enforces one-to-one
date_of_birth DATE,
phone VARCHAR(20),
bio TEXT,
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
The key detail is the UNIQUE constraint on customer_id. Without it, multiple profiles could reference the same customer (making it one-to-many). The UNIQUE constraint ensures each customer can have at most one profile.
customers (1) ──────── (1) customer_profiles
│ │
│ id = 1 (Alice) │ customer_id = 1 (Alice's profile)
│ id = 2 (Bob) │ customer_id = 2 (Bob's profile)
Querying a one-to-one relationship:
SELECT c.first_name || ' ' || c.last_name AS customer,
cp.phone,
cp.date_of_birth
FROM customers c
JOIN customer_profiles cp ON c.id = cp.customer_id;
When to use one-to-one relationships:
- Splitting sensitive data: Keep passwords or payment info in a separate, more tightly secured table
- Separating optional data: Not every customer has a profile, so optional fields go in a secondary table rather than adding nullable columns to the main table
- Performance: Frequently queried columns stay in a compact table, while rarely accessed large text or blob columns live in a separate table
Many-to-Many (M:N)
A many-to-many relationship means multiple rows in table A relate to multiple rows in table B. This cannot be represented with a single foreign key. It requires a junction table (also called a bridge table, linking table, or associative table).
Example: One order contains many products, and one product appears in many orders.
You cannot put a product_id in the orders table (an order has multiple products). You cannot put an order_id in the products table (a product appears in multiple orders). You need a third table that sits between them:
orders (M) ──── order_items (junction) ──── (N) products
-- The junction table that connects orders and products
CREATE TABLE order_items (
id INTEGER PRIMARY KEY,
order_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
quantity INTEGER NOT NULL CHECK (quantity > 0),
unit_price DECIMAL(10, 2) NOT NULL,
FOREIGN KEY (order_id) REFERENCES orders(id),
FOREIGN KEY (product_id) REFERENCES products(id)
);
The order_items table has two foreign keys: one pointing to orders and one pointing to products. Each row in order_items represents one product within one order.
orders order_items products
┌────┬────────┐ ┌────┬──────┬─────┐ ┌────┬─────────────────┐
│ id │ date │ │ id │order │prod │ │ id │ name │
├────┼────────┤ ├────┼──────┼─────┤ ├────┼─────────────────┤
│ 1 │ Jan 10 │◄─────│ 1 │ 1 │ 1 │─────►│ 1 │ Wireless Mouse │
│ │ │◄─────│ 2 │ 1 │ 3 │─────►│ 3 │ USB-C Hub │
│ 2 │ Jan 15 │◄─────│ 3 │ 2 │ 2 │─────►│ 2 │ Mech. Keyboard │
└────┴────────┘ └────┴──────┴─────┘ └────┴─────────────────┘
Order 1 contains products 1 and 3. Product 2 appears in order 2. Product 1 could also appear in other orders (many-to-many).
Querying a many-to-many relationship:
-- Find all products in a specific order
SELECT o.id AS order_id,
p.name AS product,
oi.quantity,
oi.unit_price,
oi.quantity * oi.unit_price AS line_total
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE o.id = 1;
Output:
| order_id | product | quantity | unit_price | line_total |
|---|---|---|---|---|
| 1 | Wireless Mouse | 2 | 29.99 | 59.98 |
| 1 | USB-C Hub | 1 | 45.00 | 45.00 |
-- Find all orders that contain a specific product
SELECT o.id AS order_id,
o.order_date,
oi.quantity,
c.first_name || ' ' || c.last_name AS customer
FROM products p
JOIN order_items oi ON p.id = oi.product_id
JOIN orders o ON oi.order_id = o.id
JOIN customers c ON o.customer_id = c.id
WHERE p.name = 'Mechanical Keyboard';
Output:
| order_id | order_date | quantity | customer |
|---|---|---|---|
| 2 | 2024-01-15 | 1 | Bob Martinez |
| 8 | 2024-04-10 | 1 | Frank Wilson |
The Mechanical Keyboard appears in 2 different orders from 2 different customers. This is the many-to-many relationship in action.
Junction tables often carry additional data beyond just the two foreign keys. In order_items, we store quantity and unit_price because these values are specific to the relationship itself (the combination of this order and this product), not to either entity alone.
Other common junction table patterns:
-- Students enrolled in courses (many-to-many)
CREATE TABLE enrollments (
student_id INTEGER REFERENCES students(id),
course_id INTEGER REFERENCES courses(id),
enrolled_date DATE,
grade VARCHAR(2),
PRIMARY KEY (student_id, course_id) -- Composite primary key
);
-- Users following other users (many-to-many, self-referencing)
CREATE TABLE follows (
follower_id INTEGER REFERENCES users(id),
following_id INTEGER REFERENCES users(id),
followed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (follower_id, following_id)
);
Notice the composite primary key (PRIMARY KEY (student_id, course_id)) that prevents duplicate enrollments. This is a common pattern where the combination of two foreign keys serves as the primary key.
Reading the ShopSmart ER Diagram
An Entity-Relationship (ER) diagram is a visual representation of the tables in a database and the relationships between them. Here is the complete structure of the ShopSmart database:
┌──────────────┐ ┌──────────────┐ ┌──────────────────┐
│ categories │ │ products │ │ reviews │
├──────────────┤ ├──────────────┤ ├──────────────────┤
│ * id (PK)│───┐ │ * id (PK)│───┐ │ * id (PK)│
│ name │ │ │ name │ │ │ product_id (FK)│──┐
│ description│ └──►│ category_id│ │ │ customer_id(FK)│ │
│ │ │ price │ └──►│ rating │ │
│ │ │ stock_qty │ │ review_date │ │
│ │ │ is_avail. │ └──────────────────┘ │
│ │ └──────────────┘ │
└──────────────┘ │ │
│ (via order_items) │
┌──────────────┐ ┌──────────────┐ │
│ customers │ │ order_items │ │
├──────────────┤ ├──────────────┤ │
│ * id (PK)│───┐ │ * id (PK)│ │
│ first_name │ │ │ order_id │(FK) │
│ last_name │ │ │ product_id │(FK)───────────────────────┘
│ email │ │ │ quantity │
│ city │ │ │ unit_price │
│ signup_date│ │ └──────────────┘
└──────────────┘ │ ▲
│ │ │
│ │ ┌──────────────┐
│ │ │ orders │
│ │ ├──────────────┤
│ └──►│ * id (PK)│
│ │ customer_id│(FK)
└─────────────►│ order_date │
│ total_amt │
│ status │
└──────────────┘
Reading the Diagram
* id (PK)marks the primary key of each table(FK)marks foreign key columns- Arrows point from the foreign key to the primary key it references
- Each arrow represents a one-to-many relationship (the "one" side has the PK, the "many" side has the FK)
The Relationships in ShopSmart
| Relationship | Type | FK Column | References |
|---|---|---|---|
| Categories → Products | One-to-many | products.category_id | categories.id |
| Customers → Orders | One-to-many | orders.customer_id | customers.id |
| Orders → Order Items | One-to-many | order_items.order_id | orders.id |
| Products → Order Items | One-to-many | order_items.product_id | products.id |
| Products → Reviews | One-to-many | reviews.product_id | products.id |
| Customers → Reviews | One-to-many | reviews.customer_id | customers.id |
| Orders ↔ Products | Many-to-many | via order_items | Junction table |
The order_items table serves as the junction table for the many-to-many relationship between orders and products.
Traversing Relationships with JOINs
Every relationship in the ER diagram corresponds to a JOIN condition in your queries:
-- Traverse: categories → products
SELECT c.name, p.name
FROM categories c
JOIN products p ON c.id = p.category_id;
-- Traverse: customers → orders
SELECT c.first_name, o.order_date
FROM customers c
JOIN orders o ON c.id = o.customer_id;
-- Traverse: orders → order_items → products (many-to-many)
SELECT o.id, p.name, oi.quantity
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id;
-- Traverse the full chain: customers → orders → order_items → products → categories
SELECT c.first_name || ' ' || c.last_name AS customer,
o.order_date,
p.name AS product,
cat.name AS category,
oi.quantity,
oi.unit_price
FROM customers c
JOIN orders o ON c.id = o.customer_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
JOIN categories cat ON p.category_id = cat.id
ORDER BY o.order_date, c.last_name;
Output:
| customer | order_date | product | category | quantity | unit_price |
|---|---|---|---|---|---|
| Alice Johnson | 2024-01-10 | Wireless Mouse | Electronics | 2 | 29.99 |
| Alice Johnson | 2024-01-10 | USB-C Hub | Electronics | 1 | 45.00 |
| Bob Martinez | 2024-01-15 | Mechanical Keyboard | Electronics | 1 | 89.99 |
| Alice Johnson | 2024-02-20 | SQL for Beginners | Books | 1 | 34.99 |
| Alice Johnson | 2024-02-20 | Stainless Water Bottle | Sports | 1 | 24.99 |
| Carol Singh | 2024-03-05 | Coffee Maker Pro | Home & Kitchen | 1 | 129.99 |
| David Chen | 2024-03-12 | Running Shoes X1 | Sports | 1 | 110.00 |
| David Chen | 2024-03-12 | USB-C Hub | Electronics | 1 | 45.00 |
| Eva Brown | 2024-03-20 | SQL for Beginners | Books | 1 | 34.99 |
| Bob Martinez | 2024-04-01 | Yoga Mat Premium | Sports | 1 | 38.00 |
| Bob Martinez | 2024-04-01 | Stainless Water Bottle | Sports | 1 | 24.99 |
| Frank Wilson | 2024-04-10 | Mechanical Keyboard | Electronics | 1 | 89.99 |
| Frank Wilson | 2024-04-10 | Running Shoes X1 | Sports | 1 | 110.00 |
This single query traverses five tables through four JOIN operations, following the foreign key relationships laid out in the ER diagram.
Designing Relationships: Practical Guidelines
When you design your own database, use these guidelines to determine the right relationship type:
Identifying Relationship Types
Ask this question about each pair of entities:
| Question | If Answer is... | Relationship Type |
|---|---|---|
| Can one A have multiple Bs? | Yes | At least one-to-many |
| Can one B have multiple As? | Also yes | Many-to-many |
| Can one B have multiple As? | No | One-to-many |
| Can each A have only one B, and vice versa? | Yes | One-to-one |
Example thought process:
-
Can one customer have multiple orders? Yes. → At least one-to-many
-
Can one order belong to multiple customers? No. → One-to-many (customer → orders)
-
Can one order contain multiple products? Yes. → At least one-to-many
-
Can one product appear in multiple orders? Yes. → Many-to-many (needs junction table)
Common Mistakes to Avoid
Mistake 1: Putting the foreign key on the wrong side
-- Wrong: The foreign key goes on the "many" side, not the "one" side
-- A customer does NOT store an order_id (a customer has MANY orders)
CREATE TABLE customers (
id INTEGER PRIMARY KEY,
name VARCHAR(100),
order_id INTEGER -- WRONG! Where would you put the second order?
);
-- Correct: Orders store the customer_id (each order has ONE customer)
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER REFERENCES customers(id) -- CORRECT
);
Mistake 2: Storing multiple values in one column
-- Wrong: Storing multiple product IDs in a comma-separated string
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
product_ids VARCHAR(100) -- '1,3,7' — TERRIBLE idea
);
-- Correct: Use a junction table
CREATE TABLE order_items (
order_id INTEGER REFERENCES orders(id),
product_id INTEGER REFERENCES products(id),
quantity INTEGER
);
Storing comma-separated values like '1,3,7' in a single column violates fundamental database design principles. You cannot efficiently query, join, or validate the data. You cannot use foreign key constraints. You cannot count, sum, or filter individual values. Always use a proper junction table for many-to-many relationships.
Mistake 3: Skipping foreign key constraints
-- Dangerous: No constraint means no protection
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER -- No REFERENCES clause
);
-- Safe: Constraint prevents orphaned records
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER REFERENCES customers(id)
);
Practical Exercises
Exercise 1
Write a query that finds how many products belong to each category. Include the category name.
SELECT c.name AS category,
COUNT(p.id) AS product_count
FROM categories c
LEFT JOIN products p ON c.id = p.category_id
GROUP BY c.id, c.name
ORDER BY product_count DESC;
Expected output:
| category | product_count |
|---|---|
| Electronics | 4 |
| Sports | 3 |
| Books | 2 |
| Home & Kitchen | 1 |
Exercise 2
Write a query to find all products that have been ordered along with how many different orders they appeared in.
SELECT p.name AS product,
COUNT(DISTINCT oi.order_id) AS order_count
FROM products p
JOIN order_items oi ON p.id = oi.product_id
GROUP BY p.id, p.name
ORDER BY order_count DESC;
Expected output:
| product | order_count |
|---|---|
| Mechanical Keyboard | 2 |
| USB-C Hub | 2 |
| SQL for Beginners | 2 |
| Running Shoes X1 | 2 |
| Stainless Water Bottle | 2 |
| Wireless Mouse | 1 |
| Coffee Maker Pro | 1 |
| Yoga Mat Premium | 1 |
Exercise 3
Write a query that shows each customer and all the products they have ever ordered.
SELECT DISTINCT
c.first_name || ' ' || c.last_name AS customer,
p.name AS product
FROM customers c
JOIN orders o ON c.id = o.customer_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
ORDER BY customer, product;
Expected output:
| customer | product |
|---|---|
| Alice Johnson | SQL for Beginners |
| Alice Johnson | Stainless Water Bottle |
| Alice Johnson | USB-C Hub |
| Alice Johnson | Wireless Mouse |
| Bob Martinez | Mechanical Keyboard |
| Bob Martinez | Stainless Water Bottle |
| Bob Martinez | Yoga Mat Premium |
| Carol Singh | Coffee Maker Pro |
| David Chen | Running Shoes X1 |
| David Chen | USB-C Hub |
| Eva Brown | SQL for Beginners |
| Frank Wilson | Mechanical Keyboard |
| Frank Wilson | Running Shoes X1 |
Exercise 4
Find customers who have reviewed products they also purchased.
SELECT DISTINCT
c.first_name || ' ' || c.last_name AS customer,
p.name AS product
FROM customers c
JOIN orders o ON c.id = o.customer_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
JOIN reviews r ON r.customer_id = c.id AND r.product_id = p.id
ORDER BY customer, product;
| customer | product |
|---|---|
| Alice Johnson | Mechanical Keyboard |
| Alice Johnson | SQL for Beginners |
| Alice Johnson | Wireless Mouse |
| Bob Martinez | Stainless Water Bottle |
| Bob Martinez | Yoga Mat Premium |
| Carol Singh | Coffee Maker Pro |
| Carol Singh | Stainless Water Bottle |
| David Chen | Running Shoes X1 |
| Frank Wilson | Running Shoes X1 |
Exercise 5
Write a query that identifies products with no orders (using LEFT JOIN).
SELECT p.name AS product,
p.price
FROM products p
LEFT JOIN order_items oi ON p.id = oi.product_id
WHERE oi.id IS NULL
ORDER BY p.name;
Expected output:
| product | price |
|---|---|
| Bluetooth Speaker | 65.00 |
| Data Science Handbook | 42.50 |
Key Takeaways
Primary keys and foreign keys are the foundation of every relational database. Here is what you should remember:
- A primary key (PK) uniquely identifies each row in a table. It must be unique and never NULL.
- A foreign key (FK) creates a link between two tables by referencing the primary key of another table.
- Foreign keys enforce referential integrity: you cannot reference data that does not exist, and you cannot delete data that is still referenced.
- One-to-many is the most common relationship. One parent row relates to many child rows. The foreign key goes on the "many" side.
- One-to-one relationships use a
UNIQUEconstraint on the foreign key to ensure each parent has at most one related child. - Many-to-many relationships require a junction table with two foreign keys, one for each related table.
- ER diagrams visualize tables and their relationships, with arrows showing foreign key connections.
- Every
JOINin your queries follows a foreign key relationship defined in the schema. - Use surrogate keys (auto-incrementing integers) as primary keys for most tables, with
UNIQUEconstraints on natural identifiers. - Never store multiple values in a single column. Use junction tables instead.
- Always define foreign key constraints explicitly. They prevent data corruption with zero application code.
Understanding relationships is what separates someone who can write basic queries from someone who can design and query real databases. Every JOIN you write, every report you build, and every schema you design depends on the concepts in this guide.