How to Create and Manage Indexes in SQL

Knowing what indexes are is one thing. Knowing how to create, configure, and manage them effectively is what actually makes your database fast. A well-chosen index can turn a query that takes 30 seconds into one that returns in 5 milliseconds. A poorly chosen index wastes disk space, slows down writes, and may never even get used by the query optimizer.

This guide is a hands-on, practical deep dive into SQL CREATE INDEX: how to create single-column and composite indexes, how unique indexes enforce data integrity, how to choose the right columns and column order, and how to safely drop indexes you no longer need. Every concept includes practical examples, database-specific syntax differences, and the reasoning behind each decision.

The Sample Data

We will use the following tables throughout this guide. Imagine these at production scale with millions of rows.

customers table:

| id | email             | first_name | last_name | country | signup_date | is_active |
|----|-------------------|------------|-----------|---------|-------------|-----------|
| 1  | alice@example.com | Alice      | Morgan    | US      | 2022-03-15  | true      |
| 2  | bob@mail.co       | Bob        | Chen      | UK      | 2022-06-01  | true      |
| 3  | charlie@inbox.net | Charlie    | Garcia    | US      | 2023-01-20  | false     |
| 4  | diana@webmail.org | Diana      | Patel     | CA      | 2023-04-10  | true      |
| 5  | eve@example.com   | Eve        | Morgan    | US      | 2023-08-05  | true      |
| 6  | frank@mail.co     | Frank      | Smith     | UK      | 2024-01-14  | true      |

CREATE TABLE customers (
  id INT PRIMARY KEY,
  email VARCHAR(100) NOT NULL,
  first_name VARCHAR(50) NOT NULL,
  last_name VARCHAR(50) NOT NULL,
  country CHAR(2) NOT NULL,
  signup_date DATE NOT NULL,
  is_active BOOLEAN NOT NULL
);
INSERT INTO customers (id, email, first_name, last_name, country, signup_date, is_active) VALUES
(1, 'alice@example.com', 'Alice', 'Morgan', 'US', '2022-03-15', true),
(2, 'bob@mail.co', 'Bob', 'Chen', 'UK', '2022-06-01', true),
(3, 'charlie@inbox.net', 'Charlie', 'Garcia', 'US', '2023-01-20', false),
(4, 'diana@webmail.org', 'Diana', 'Patel', 'CA', '2023-04-10', true),
(5, 'eve@example.com', 'Eve', 'Morgan', 'US', '2023-08-05', true),
(6, 'frank@mail.co', 'Frank', 'Smith', 'UK', '2024-01-14', true);

orders table:

| id | customer_id | order_date | amount | status    | shipping_method |
|----|-------------|------------|--------|-----------|-----------------|
| 1  | 1           | 2024-01-05 | 250.00 | completed | standard        |
| 2  | 2           | 2024-01-12 | 430.00 | completed | express         |
| 3  | 1           | 2024-01-20 | 180.50 | pending   | standard        |
| 4  | 3           | 2024-02-08 | 520.00 | completed | express         |
| 5  | 2           | 2024-02-15 | 310.00 | cancelled | standard        |
| 6  | 1           | 2024-03-01 | 275.00 | completed | express         |
| 7  | 4           | 2024-03-18 | 640.00 | completed | overnight       |
| 8  | 5           | 2024-04-02 | 190.00 | pending   | standard        |

CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT NOT NULL,
  order_date DATE NOT NULL,
  amount DECIMAL(12,2) NOT NULL,
  status VARCHAR(20) NOT NULL,
  shipping_method VARCHAR(20) NOT NULL,
  CONSTRAINT fk_orders_customer
    FOREIGN KEY (customer_id)
    REFERENCES customers(id)
);
INSERT INTO orders (id, customer_id, order_date, amount, status, shipping_method) VALUES
(1, 1, '2024-01-05', 250.00, 'completed', 'standard'),
(2, 2, '2024-01-12', 430.00, 'completed', 'express'),
(3, 1, '2024-01-20', 180.50, 'pending', 'standard'),
(4, 3, '2024-02-08', 520.00, 'completed', 'express'),
(5, 2, '2024-02-15', 310.00, 'cancelled', 'standard'),
(6, 1, '2024-03-01', 275.00, 'completed', 'express'),
(7, 4, '2024-03-18', 640.00, 'completed', 'overnight'),
(8, 5, '2024-04-02', 190.00, 'pending', 'standard');

products table:

| id | sku     | name         | category    | price  |
|----|---------|--------------|-------------|--------|
| 1  | WDG-001 | Widget Pro   | Electronics | 79.99  |
| 2  | GDG-002 | Gadget Max   | Electronics | 149.99 |
| 3  | CHR-003 | Office Chair | Furniture   | 299.00 |
| 4  | LMP-004 | Desk Lamp    | Furniture   | 45.00  |
| 5  | WDG-005 | Widget Basic | Electronics | 29.99  |

CREATE TABLE products (
  id INT PRIMARY KEY,
  sku VARCHAR(20) NOT NULL,
  name VARCHAR(100) NOT NULL,
  category VARCHAR(50) NOT NULL,
  price DECIMAL(10,2) NOT NULL
);
INSERT INTO products (id, sku, name, category, price) VALUES
(1, 'WDG-001', 'Widget Pro', 'Electronics', 79.99),
(2, 'GDG-002', 'Gadget Max', 'Electronics', 149.99),
(3, 'CHR-003', 'Office Chair', 'Furniture', 299.00),
(4, 'LMP-004', 'Desk Lamp', 'Furniture', 45.00),
(5, 'WDG-005', 'Widget Basic', 'Electronics', 29.99);

CREATE INDEX

The CREATE INDEX statement builds a new index on one or more columns of an existing table. Once created, the database automatically maintains the index as data is inserted, updated, or deleted.

Basic Syntax

CREATE INDEX index_name
ON table_name (column_name);
| Component   | Description                                             |
|-------------|---------------------------------------------------------|
| index_name  | A name you choose for the index; should be descriptive  |
| table_name  | The table to index                                      |
| column_name | The column (or columns) to include in the index         |

Your First Index

Suppose your application frequently looks up customers by email:

SELECT * FROM customers WHERE email = 'alice@example.com';

Without an index, the database scans every row. Let's create one:

CREATE INDEX idx_customers_email
ON customers (email);

Now the database can locate the matching row through the index's B-tree structure instead of scanning the entire table.
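You can watch this switch happen from a client. A minimal sketch using Python's built-in sqlite3 module (SQLite stands in for any B-tree-based engine here; table and index names match the examples above, but the trimmed schema is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "alice@example.com"), (2, "bob@mail.co")])

def plan(sql):
    # EXPLAIN QUERY PLAN is SQLite's equivalent of EXPLAIN
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM customers WHERE email = 'alice@example.com'"
before = plan(query)   # typically "SCAN customers": a full table scan

conn.execute("CREATE INDEX idx_customers_email ON customers (email)")
after = plan(query)    # now a SEARCH using idx_customers_email

print(before)
print(after)
```

Same query, same data; only the access path changes once the index exists.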

Verifying the Index Was Created

-- PostgreSQL
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'customers';

-- MySQL
SHOW INDEX FROM customers;

-- SQL Server
EXEC sp_helpindex 'customers';

-- SQLite (shell dot-command, not SQL)
.indexes customers

PostgreSQL output:

| indexname           | indexdef                                                                  |
|---------------------|---------------------------------------------------------------------------|
| customers_pkey      | CREATE UNIQUE INDEX customers_pkey ON public.customers USING btree (id)   |
| idx_customers_email | CREATE INDEX idx_customers_email ON public.customers USING btree (email)  |

The primary key index (customers_pkey) was created automatically. Our new idx_customers_email appears alongside it.

Verifying the Index Is Being Used

Creating an index does not guarantee the optimizer will use it. Always verify with EXPLAIN:

-- PostgreSQL
EXPLAIN SELECT * FROM customers WHERE email = 'alice@example.com';

Output (index is used):

Index Scan using idx_customers_email on customers  (cost=0.14..8.16 rows=1 width=85)
Index Cond: (email = 'alice@example.com'::text)

Output (index is NOT used, full scan instead):

Seq Scan on customers  (cost=0.00..12.50 rows=1 width=85)
Filter: (email = 'alice@example.com'::text)

If you see Seq Scan (PostgreSQL), type: ALL (MySQL), or Table Scan (SQL Server), the index is not being used and you should investigate why.

tip

Use EXPLAIN ANALYZE in PostgreSQL to see actual execution time, not just the plan:

EXPLAIN ANALYZE SELECT * FROM customers WHERE email = 'alice@example.com';

This runs the query and shows real timing data, which is invaluable for measuring the actual impact of your index.

Index Naming Conventions

A consistent naming convention makes indexes easy to identify and manage across your entire database. Here are common patterns:

| Convention          | Example              | Description                          |
|---------------------|----------------------|--------------------------------------|
| idx_table_column    | idx_customers_email  | Most common; clear and concise       |
| ix_table_column     | ix_customers_email   | Shorter prefix variant               |
| idx_table_col1_col2 | idx_orders_cust_date | Composite index, abbreviated columns |
| uq_table_column     | uq_customers_email   | Unique index prefix                  |
-- Good: descriptive, consistent naming
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
CREATE INDEX idx_orders_order_date ON orders (order_date);
CREATE INDEX idx_orders_status ON orders (status);

-- Bad: vague or generic names
CREATE INDEX index1 ON orders (customer_id);
CREATE INDEX temp_idx ON orders (order_date);
CREATE INDEX x ON orders (status);
caution

Some databases have name length limits for indexes (e.g., PostgreSQL limits identifiers to 63 characters, Oracle to 30 in older versions). For composite indexes with long table and column names, use abbreviations:

-- Too long for some databases
CREATE INDEX idx_customer_order_history_customer_id_order_date_status ...

-- Abbreviated
CREATE INDEX idx_cust_ord_hist_cid_odate_stat ...

IF NOT EXISTS

To avoid errors when an index might already exist:

-- PostgreSQL, SQLite (also MariaDB; MySQL itself does not support IF NOT EXISTS here)
CREATE INDEX IF NOT EXISTS idx_customers_email
ON customers (email);

Without IF NOT EXISTS, attempting to create an index with a name that already exists throws an error:

-- WRONG: Error if the index already exists
CREATE INDEX idx_customers_email ON customers (email);

Error: relation "idx_customers_email" already exists

info

SQL Server does not support IF NOT EXISTS directly on CREATE INDEX. You need a conditional check:

-- SQL Server workaround
IF NOT EXISTS (
SELECT 1 FROM sys.indexes
WHERE name = 'idx_customers_email'
AND object_id = OBJECT_ID('customers')
)
CREATE INDEX idx_customers_email ON customers (email);

Unique Indexes

A unique index enforces that no two rows can have the same value (or combination of values) in the indexed column(s). It serves two purposes simultaneously: data integrity and fast lookups.

Syntax

CREATE UNIQUE INDEX index_name
ON table_name (column_name);

Creating a Unique Index

CREATE UNIQUE INDEX uq_customers_email
ON customers (email);

Now if you try to insert a duplicate email:

INSERT INTO customers (id, email, first_name, last_name, country, signup_date, is_active)
VALUES (7, 'alice@example.com', 'Alicia', 'Jones', 'US', '2024-05-01', true);

Error: duplicate key value violates unique constraint "uq_customers_email"

The insert is rejected because alice@example.com already exists.

Unique Index vs UNIQUE Constraint

You might wonder how a unique index differs from a UNIQUE constraint defined in a table's schema. In practice, they are almost identical. When you add a UNIQUE constraint, the database creates a unique index behind the scenes.

-- These two are functionally equivalent:

-- Approach 1: UNIQUE constraint (during table creation)
CREATE TABLE customers (
id INT PRIMARY KEY,
email VARCHAR(255) UNIQUE, -- Creates a unique index automatically
...
);

-- Approach 2: Explicit unique index (after table creation)
CREATE UNIQUE INDEX uq_customers_email ON customers (email);
| Aspect                                         | UNIQUE Constraint               | UNIQUE Index       |
|------------------------------------------------|---------------------------------|--------------------|
| Enforces uniqueness?                           | Yes                             | Yes                |
| Creates an index?                              | Yes (automatically)             | Yes (explicitly)   |
| Appears in schema definition?                  | Yes                             | No (separate DDL)  |
| Can have custom options (partial, expression)? | Limited                         | Yes                |
| Dropped with                                   | ALTER TABLE ... DROP CONSTRAINT | DROP INDEX         |

Use a UNIQUE constraint when uniqueness is a fundamental rule of your data model (e.g., email must be unique). Use a unique index when you need more control, such as partial uniqueness or expression-based uniqueness.

Unique Index with NULLs

An important subtlety: most databases treat NULL values as distinct from each other in unique indexes. This means multiple rows can have NULL in a uniquely indexed column.

CREATE UNIQUE INDEX uq_customers_phone ON customers (phone_number);

-- Both inserts succeed: NULLs are treated as different values
INSERT INTO customers (..., phone_number) VALUES (..., NULL);
INSERT INTO customers (..., phone_number) VALUES (..., NULL);
| Database   | Multiple NULLs Allowed in Unique Index?         |
|------------|-------------------------------------------------|
| PostgreSQL | Yes                                             |
| MySQL      | Yes                                             |
| SQLite     | Yes                                             |
| Oracle     | Yes (entirely-NULL keys are not indexed at all) |
| SQL Server | No (only one NULL allowed by default)           |
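You can confirm the multiple-NULLs behavior with SQLite via Python's sqlite3 module (a sketch; the phone_number column is hypothetical and not part of the sample schema above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, phone_number TEXT)")
conn.execute("CREATE UNIQUE INDEX uq_customers_phone ON customers (phone_number)")

# Both inserts succeed: NULLs are treated as distinct values
conn.execute("INSERT INTO customers VALUES (1, NULL)")
conn.execute("INSERT INTO customers VALUES (2, NULL)")

# A duplicate non-NULL value is still rejected
conn.execute("INSERT INTO customers VALUES (3, '555-0100')")
duplicate_rejected = False
try:
    conn.execute("INSERT INTO customers VALUES (4, '555-0100')")
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```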
tip

In SQL Server, if you need to allow multiple NULLs in a unique index, use a filtered index:

-- SQL Server: Allow multiple NULLs while enforcing uniqueness for non-NULL values
CREATE UNIQUE INDEX uq_customers_phone
ON customers (phone_number)
WHERE phone_number IS NOT NULL;

Multi-Column Unique Index

You can enforce uniqueness across a combination of columns:

-- Each customer can only have one order per day (combination must be unique)
CREATE UNIQUE INDEX uq_orders_customer_date
ON orders (customer_id, order_date);

This allows:

  • Customer 1 ordering on 2024-01-05 and 2024-01-20 (different dates, fine)
  • Customer 1 on 2024-01-05 and Customer 2 on 2024-01-05 (different customers, fine)

This rejects:

  • Customer 1 ordering twice on 2024-01-05 (same combination, rejected)
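The allowed and rejected cases above can be exercised directly; a sketch with SQLite (trimmed orders schema, same index name as the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INT, order_date TEXT)")
conn.execute("CREATE UNIQUE INDEX uq_orders_customer_date ON orders (customer_id, order_date)")

conn.execute("INSERT INTO orders VALUES (1, 1, '2024-01-05')")
conn.execute("INSERT INTO orders VALUES (2, 1, '2024-01-20')")  # same customer, new date: OK
conn.execute("INSERT INTO orders VALUES (3, 2, '2024-01-05')")  # same date, new customer: OK

pair_rejected = False
try:
    conn.execute("INSERT INTO orders VALUES (4, 1, '2024-01-05')")  # duplicate pair
except sqlite3.IntegrityError:
    pair_rejected = True

print(pair_rejected)  # True
```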

Unique Index on Products SKU

A real-world example: ensuring product SKUs are unique:

CREATE UNIQUE INDEX uq_products_sku ON products (sku);
-- Verify: attempt to insert a duplicate SKU
INSERT INTO products (id, sku, name, category, price)
VALUES (6, 'WDG-001', 'Widget Clone', 'Electronics', 39.99);

Error: duplicate key value violates unique constraint "uq_products_sku"

The SKU WDG-001 already belongs to "Widget Pro." The unique index prevents accidental duplication.

Composite (Multi-Column) Indexes

A composite index (also called a multi-column index or compound index) includes two or more columns in a single index. This is one of the most important tools for optimizing real-world queries, which rarely filter on just one column.

Syntax

CREATE INDEX index_name
ON table_name (column1, column2, column3);

When to Use Composite Indexes

Composite indexes are most valuable when your queries frequently filter, sort, or group by multiple columns together:

-- This query filters on TWO columns
SELECT * FROM orders
WHERE customer_id = 1 AND status = 'completed';

-- A composite index on both columns is ideal
CREATE INDEX idx_orders_customer_status
ON orders (customer_id, status);

The Leftmost Prefix Rule

This is the single most important concept for composite indexes. A composite index can be used for queries that filter on a leftmost prefix of its columns. Think of it like a phone book sorted by last name, then first name: you can look up everyone with last name "Smith," but you cannot efficiently look up everyone named "Alice" regardless of last name.

Given this index:

CREATE INDEX idx_orders_composite
ON orders (customer_id, order_date, status);

Here is what works and what does not:

-- ✅ Uses index: filters on customer_id (first column)
SELECT * FROM orders WHERE customer_id = 1;

-- ✅ Uses index: filters on customer_id + order_date (first two columns)
SELECT * FROM orders
WHERE customer_id = 1 AND order_date = '2024-01-05';

-- ✅ Uses index: filters on all three columns
SELECT * FROM orders
WHERE customer_id = 1
AND order_date = '2024-01-05'
AND status = 'completed';

-- ✅ Uses index: filters on customer_id, range on order_date
SELECT * FROM orders
WHERE customer_id = 1
AND order_date BETWEEN '2024-01-01' AND '2024-03-31';
-- ❌ Cannot use this index: skips customer_id (first column)
SELECT * FROM orders WHERE order_date = '2024-01-05';

-- ❌ Cannot use this index: skips customer_id and order_date
SELECT * FROM orders WHERE status = 'completed';

-- ⚠️ Partial use: uses customer_id only, skips order_date to reach status
SELECT * FROM orders
WHERE customer_id = 1 AND status = 'completed';
Why the Last Query Only Partially Uses the Index

The index is sorted by (customer_id, order_date, status). When you skip order_date, the database can use customer_id to narrow down the results, but then it must scan through all of that customer's index entries to find matching status values. It cannot "jump over" order_date in the sorted structure.

If your queries frequently use customer_id + status without order_date, consider creating a separate index:

CREATE INDEX idx_orders_customer_status
ON orders (customer_id, status);
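The leftmost prefix rule is easy to observe in a query planner. A sketch using SQLite (the exact plan wording varies by engine and version, but the scan-vs-search distinction holds):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INT, order_date TEXT, status TEXT)""")
conn.execute("""CREATE INDEX idx_orders_composite
                ON orders (customer_id, order_date, status)""")

def plan(sql):
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Leading column present: the composite index is usable
with_prefix = plan("SELECT * FROM orders WHERE customer_id = 1")

# Leading column skipped: the planner falls back to a full scan
without_prefix = plan("SELECT * FROM orders WHERE status = 'completed'")

print(with_prefix)
print(without_prefix)
```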

Column Order: Equality Before Range

The order of columns in a composite index significantly affects performance. The general rule is:

  1. Equality conditions (=) should come first
  2. Range conditions (>, <, BETWEEN, LIKE 'prefix%') should come last

Consider this query:

SELECT * FROM orders
WHERE status = 'completed'
AND order_date BETWEEN '2024-01-01' AND '2024-03-31';
-- ✅ GOOD: Equality column first, range column second
CREATE INDEX idx_orders_status_date
ON orders (status, order_date);

The database first narrows to all completed entries (equality), then scans a contiguous range of dates within those entries. This is very efficient.

-- ❌ LESS OPTIMAL: Range column first
CREATE INDEX idx_orders_date_status
ON orders (order_date, status);

The database first finds all entries in the date range (potentially thousands), then must check each one for status = 'completed'. This works but examines more index entries.

Practical Example: Optimizing a Common Query

Your application's order history page runs this query constantly:

SELECT order_date, amount, status
FROM orders
WHERE customer_id = 1
AND status = 'completed'
ORDER BY order_date DESC;

Let's build the optimal composite index step by step:

  1. customer_id is an equality filter: put it first
  2. status is an equality filter: put it second
  3. order_date is used for sorting (ORDER BY): put it third
CREATE INDEX idx_orders_cust_status_date
ON orders (customer_id, status, order_date DESC);

Now let's verify with EXPLAIN:

EXPLAIN SELECT order_date, amount, status
FROM orders
WHERE customer_id = 1 AND status = 'completed'
ORDER BY order_date DESC;

Optimal output:

Index Scan Backward using idx_orders_cust_status_date on orders
Index Cond: ((customer_id = 1) AND (status = 'completed'))

The index handles both the filtering and the sorting. No separate sort step is needed.
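The vanishing sort step is visible in SQLite too, which flags an explicit sort as a temporary B-tree in its plan output. A sketch (trimmed schema; same index shape as above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INT, status TEXT, order_date TEXT, amount REAL)""")

def plan(sql):
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = """SELECT order_date, amount, status FROM orders
           WHERE customer_id = 1 AND status = 'completed'
           ORDER BY order_date DESC"""

before = plan(query)   # includes "USE TEMP B-TREE FOR ORDER BY": an explicit sort

conn.execute("""CREATE INDEX idx_orders_cust_status_date
                ON orders (customer_id, status, order_date DESC)""")

after = plan(query)    # index supplies the order: the sort step disappears

print(before)
print(after)
```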

Composite Index vs Multiple Single-Column Indexes

A common question is whether to create one composite index or several single-column indexes. Let's compare:

-- Option A: One composite index
CREATE INDEX idx_composite ON orders (customer_id, status, order_date);

-- Option B: Three single-column indexes
CREATE INDEX idx_customer ON orders (customer_id);
CREATE INDEX idx_status ON orders (status);
CREATE INDEX idx_date ON orders (order_date);

For the query WHERE customer_id = 1 AND status = 'completed' AND order_date > '2024-01-01':

| Approach             | Behavior                                                                                       |
|----------------------|------------------------------------------------------------------------------------------------|
| Composite index      | Single, efficient index lookup that satisfies all three conditions                             |
| Three single indexes | Database picks the best single index or attempts to combine them (index merge), which is slower |

The composite index almost always wins for multi-column queries. Single-column indexes are still useful for queries that filter on only one of those columns.

tip

A reasonable strategy is:

  • Create composite indexes for your most common multi-column query patterns
  • Ensure the leftmost column of each composite index also serves queries that filter on that column alone
  • Avoid creating a single-column index that is already the leading column of a composite index (it is redundant)

How Many Columns Is Too Many?

There is no strict limit, but indexes with more than 4 to 5 columns are rare and often a sign that the query should be restructured. Each additional column:

  • Increases the index size on disk
  • Slows down write operations (more data to maintain per index entry)
  • Provides diminishing returns (the first 2 to 3 columns do most of the filtering work)
-- Probably excessive
CREATE INDEX idx_kitchen_sink
ON orders (customer_id, order_date, status, amount, shipping_method);

-- Better: cover the most selective filters and let the table handle the rest
CREATE INDEX idx_focused
ON orders (customer_id, order_date, status);

Included Columns (Covering Indexes)

When a query needs columns that are not part of the index key, the database must perform a bookmark lookup (going back to the table to fetch those columns). Included columns let you add extra columns to the index's leaf nodes without making them part of the sort key. This creates a covering index that can answer the query entirely from the index.

Syntax

-- PostgreSQL (11+)
CREATE INDEX index_name
ON table_name (key_column1, key_column2)
INCLUDE (extra_column1, extra_column2);

-- SQL Server
CREATE INDEX index_name
ON table_name (key_column1, key_column2)
INCLUDE (extra_column1, extra_column2);

Example

-- Query: frequently run on the order history page
SELECT order_date, amount, status
FROM orders
WHERE customer_id = 1;
-- Without INCLUDE: index finds matching rows, then does a bookmark lookup
-- to fetch amount from the table
CREATE INDEX idx_orders_cust
ON orders (customer_id);

-- With INCLUDE: index contains everything the query needs
CREATE INDEX idx_orders_cust_covering
ON orders (customer_id)
INCLUDE (order_date, amount, status);

With the covering index, the query plan shows an index-only scan:

Index Only Scan using idx_orders_cust_covering on orders
Index Cond: (customer_id = 1)

No table access at all. This is the fastest possible execution.

INCLUDE vs Adding Columns to the Key

You might wonder why not just add amount and status to the index key:

-- Columns in the key
CREATE INDEX idx_key ON orders (customer_id, order_date, amount, status);

-- Columns in INCLUDE
CREATE INDEX idx_include ON orders (customer_id, order_date) INCLUDE (amount, status);
| Aspect                            | In the Key                             | In INCLUDE                         |
|-----------------------------------|----------------------------------------|------------------------------------|
| Sorted?                           | Yes                                    | No                                 |
| Can be used for filtering/sorting? | Yes                                   | No                                 |
| Increases index depth?            | Yes (wider keys = fewer keys per node) | Minimal (stored only in leaf nodes) |
| Index size                        | Larger                                 | Smaller                            |

Use INCLUDE for columns that you only need to read, not filter or sort on. This keeps the index tree compact and efficient.

info

MySQL does not support the INCLUDE clause. In MySQL, the equivalent is to add all needed columns to the index key itself. InnoDB's clustered index structure means the primary key is always implicitly available, reducing the need for INCLUDE in many cases.
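SQLite also lacks INCLUDE, so it uses the same key-column approach as MySQL, and its planner explicitly reports when a query is answered entirely from an index. A sketch (trimmed schema; the appended columns play the role INCLUDE would):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INT, order_date TEXT, amount REAL, status TEXT)""")

# Read-only columns appended to the key, MySQL-style, to make the index covering
conn.execute("""CREATE INDEX idx_orders_cust_covering
                ON orders (customer_id, order_date, amount, status)""")

covering_plan = " ".join(row[-1] for row in conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT order_date, amount, status FROM orders WHERE customer_id = 1"))

print(covering_plan)  # the plan mentions a COVERING INDEX: no table access needed
```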

Partial Indexes (Filtered Indexes)

A partial index indexes only a subset of rows that match a condition. This creates a much smaller, faster index that is highly targeted for specific query patterns.

Syntax

-- PostgreSQL
CREATE INDEX index_name
ON table_name (column_name)
WHERE condition;

-- SQL Server (called "filtered index")
CREATE INDEX index_name
ON table_name (column_name)
WHERE condition;

Example: Index Only Active Customers

If 90% of your queries focus on active customers, indexing inactive ones is wasted space:

-- Full index: includes all rows (active and inactive)
CREATE INDEX idx_customers_country_full ON customers (country);

-- Partial index: includes only active customers
CREATE INDEX idx_customers_country_active
ON customers (country)
WHERE is_active = true;

The partial index is significantly smaller and faster because it excludes all inactive customers.

-- This query uses the partial index
SELECT * FROM customers
WHERE country = 'US' AND is_active = true;

-- This query CANNOT use the partial index (no is_active filter)
SELECT * FROM customers WHERE country = 'US';

Example: Index Only Pending Orders

CREATE INDEX idx_orders_pending
ON orders (customer_id, order_date)
WHERE status = 'pending';

If only 5% of orders are pending, this index is 20x smaller than a full index on the same columns. Queries for pending orders are fast, and the index has minimal impact on write performance.
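SQLite supports partial indexes as well, which makes the use/can't-use distinction easy to demonstrate. A sketch (trimmed schema; the query must repeat the index's WHERE condition for the planner to apply it):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INT, order_date TEXT, status TEXT)""")
conn.execute("""CREATE INDEX idx_orders_pending
                ON orders (customer_id, order_date)
                WHERE status = 'pending'""")

def plan(sql):
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Query repeats the index's condition: the partial index applies
pending = plan("""SELECT * FROM orders
                  WHERE customer_id = 1 AND status = 'pending'""")

# No status filter: the partial index cannot be used
all_rows = plan("SELECT * FROM orders WHERE customer_id = 1")

print(pending)
print(all_rows)
```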

caution

Partial indexes are supported in PostgreSQL, SQLite, and SQL Server (where they are called filtered indexes). MySQL and Oracle do not support them directly. In MySQL, you can approximate the behavior with generated columns and indexing those.

Expression Indexes

An expression index (also called a functional index) indexes the result of an expression or function rather than a raw column value. This is essential when your queries use functions on columns.

The Problem

-- Index on email exists, but this query CANNOT use it
SELECT * FROM customers WHERE LOWER(email) = 'alice@example.com';

The index stores raw email values (e.g., Alice@Example.com). The query filters on LOWER(email), a computed value the index does not contain, so the database must evaluate LOWER() for every row, resulting in a full scan.

The Solution

-- PostgreSQL: Expression index
CREATE INDEX idx_customers_lower_email
ON customers (LOWER(email));

-- Now this query uses the index
SELECT * FROM customers WHERE LOWER(email) = 'alice@example.com';
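SQLite supports expression indexes with the same syntax, so the before/after effect can be verified in a few lines (a sketch; trimmed schema, same index name as the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")

def plan(sql):
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM customers WHERE LOWER(email) = 'alice@example.com'"
before = plan(query)   # full scan: no index stores LOWER(email)

conn.execute("CREATE INDEX idx_customers_lower_email ON customers (LOWER(email))")
after = plan(query)    # the expression index matches the function call

print(before)
print(after)
```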

Common Expression Index Examples

-- Index on year extracted from a date
CREATE INDEX idx_orders_year
ON orders (EXTRACT(YEAR FROM order_date));

-- Enables: WHERE EXTRACT(YEAR FROM order_date) = 2024

-- Index on concatenated columns
CREATE INDEX idx_customers_fullname
ON customers ((first_name || ' ' || last_name));

-- Enables: WHERE first_name || ' ' || last_name = 'Alice Morgan'

-- Index on JSON field (PostgreSQL)
CREATE INDEX idx_users_settings_theme
ON users ((settings->>'theme'));

-- Enables: WHERE settings->>'theme' = 'dark'
Database Support for Expression Indexes
| Database   | Syntax                                                        |
|------------|---------------------------------------------------------------|
| PostgreSQL | CREATE INDEX ... ON table (expression)                        |
| SQLite     | CREATE INDEX ... ON table (expression)                        |
| MySQL 8+   | CREATE INDEX ... ON table ((expression)) (double parentheses) |
| Oracle     | CREATE INDEX ... ON table (expression)                        |
| SQL Server | Not directly; use a computed column first, then index it      |

SQL Server workaround:

-- Step 1: Add a computed column
ALTER TABLE customers ADD email_lower AS LOWER(email);

-- Step 2: Index the computed column
CREATE INDEX idx_customers_email_lower ON customers (email_lower);

Index Sort Order

By default, B-tree indexes are sorted in ascending order. If your queries frequently sort in descending order, you can create an index with a descending sort:

-- Default (ascending)
CREATE INDEX idx_orders_date_asc ON orders (order_date ASC);

-- Descending
CREATE INDEX idx_orders_date_desc ON orders (order_date DESC);

Mixed Sort Order in Composite Indexes

This matters most in composite indexes where different columns are sorted in different directions:

-- Query: latest orders first, lowest amount first within the same date
SELECT * FROM orders
ORDER BY order_date DESC, amount ASC;

-- Matching index
CREATE INDEX idx_orders_date_desc_amount_asc
ON orders (order_date DESC, amount ASC);

If the index sort order matches the query's ORDER BY, the database can read the index in order and skip the sorting step entirely.

info

In PostgreSQL, MySQL, and SQL Server, you can specify ASC or DESC per column. In older MySQL versions (before 8.0), the DESC keyword was accepted but ignored. All indexes were ascending only.

DROP INDEX

When an index is no longer needed, wastes space, or hurts write performance, remove it with DROP INDEX.

Syntax

-- PostgreSQL, SQLite, Oracle
DROP INDEX index_name;

-- MySQL (must specify the table)
DROP INDEX index_name ON table_name;

-- SQL Server (must specify the table)
DROP INDEX index_name ON table_name;

Basic Example

-- PostgreSQL
DROP INDEX idx_customers_email;

-- MySQL
DROP INDEX idx_customers_email ON customers;

-- SQL Server
DROP INDEX idx_customers_email ON customers;

DROP INDEX IF EXISTS

To avoid errors when the index might not exist:

-- PostgreSQL, SQLite
DROP INDEX IF EXISTS idx_customers_email;

-- MySQL does not support IF EXISTS on DROP INDEX (MariaDB does);
-- check information_schema.statistics first if the index may be missing
DROP INDEX idx_customers_email ON customers;

-- SQL Server
IF EXISTS (
SELECT 1 FROM sys.indexes
WHERE name = 'idx_customers_email'
AND object_id = OBJECT_ID('customers')
)
DROP INDEX idx_customers_email ON customers;

Dropping Multiple Indexes

-- PostgreSQL: drop multiple indexes in one statement
DROP INDEX IF EXISTS
idx_orders_customer_id,
idx_orders_status,
idx_orders_region;

Most other databases require separate DROP INDEX statements.

CONCURRENTLY (Non-Blocking Drop in PostgreSQL)

In PostgreSQL, DROP INDEX acquires an exclusive lock on the table, blocking all reads and writes until the drop completes. For production systems, use CONCURRENTLY:

-- PostgreSQL: non-blocking index drop
DROP INDEX CONCURRENTLY idx_orders_customer_id;
caution

DROP INDEX CONCURRENTLY cannot be run inside a transaction block, does not support CASCADE, and cannot drop an index that backs a UNIQUE or PRIMARY KEY constraint (drop the constraint instead). It can be combined with IF EXISTS:

-- PostgreSQL
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_customer_id;

Creating Indexes Without Blocking (CONCURRENTLY)

By default, CREATE INDEX locks the table against writes while the index is being built. For large tables in production, this can block inserts and updates for minutes or even hours.

PostgreSQL: CREATE INDEX CONCURRENTLY

-- Non-blocking index creation (PostgreSQL)
CREATE INDEX CONCURRENTLY idx_orders_customer_id
ON orders (customer_id);

This builds the index in the background without blocking writes. It takes longer than a regular CREATE INDEX but keeps your application running.

Limitations of CONCURRENTLY:

  • Cannot be run inside a transaction
  • Takes approximately 2 to 3x longer than a regular create
  • If it fails partway, it leaves an invalid index that must be dropped and recreated
-- Check for invalid indexes (PostgreSQL)
SELECT indexrelid::regclass AS index_name, indisvalid
FROM pg_index
WHERE NOT indisvalid;

SQL Server: ONLINE Index Creation

-- SQL Server: non-blocking index creation
CREATE INDEX idx_orders_customer_id
ON orders (customer_id)
WITH (ONLINE = ON);

MySQL: Online DDL

MySQL InnoDB performs most index creation as online operations by default in MySQL 5.6+. No special syntax is needed for basic index creation, though some operations may still require a brief lock.

Finding Unused and Redundant Indexes

Over time, indexes accumulate as queries evolve and old ones are forgotten. Unused indexes waste disk space and slow down writes with zero benefit. Regularly audit your indexes.

PostgreSQL: Find Unused Indexes

SELECT
schemaname,
relname AS table_name,
indexrelname AS index_name,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE '%_pkey' -- Exclude primary keys
ORDER BY pg_relation_size(indexrelid) DESC;

This shows indexes that have never been used since the last statistics reset, sorted by size. These are strong candidates for removal.

Finding Redundant (Duplicate) Indexes

-- PostgreSQL: Find indexes that are prefixes of other indexes
SELECT
a.indexrelid::regclass AS shorter_index,
b.indexrelid::regclass AS longer_index,
pg_size_pretty(pg_relation_size(a.indexrelid)) AS shorter_size
FROM pg_index a
JOIN pg_index b ON a.indrelid = b.indrelid
AND a.indexrelid != b.indexrelid
AND a.indkey::text = (
SELECT string_agg(x::text, ' ')
FROM unnest(b.indkey[0:array_length(a.indkey, 1)-1]) x
)
WHERE a.indrelid = 'orders'::regclass;

For example, if you have both idx_a (customer_id) and idx_b (customer_id, order_date), then idx_a is redundant because idx_b covers all queries that idx_a would serve.

MySQL: Find Unused Indexes

-- MySQL 8.0+ with performance_schema enabled
SELECT
object_schema,
object_name AS table_name,
index_name,
count_star AS times_used
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
AND count_star = 0
AND object_schema = 'your_database'
ORDER BY object_name;

Common Mistakes to Avoid

Mistake 1: Creating Redundant Indexes

-- These are redundant: the composite index covers single-column queries too
CREATE INDEX idx_orders_customer ON orders (customer_id); -- Redundant!
CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date); -- This covers both

Because of the leftmost prefix rule, idx_orders_customer_date already supports queries on customer_id alone. The standalone index on customer_id wastes space and slows writes.

Fix: Drop the redundant single-column index:

DROP INDEX idx_orders_customer;

Mistake 2: Wrong Column Order in Composite Indexes

-- Query pattern
SELECT * FROM orders
WHERE customer_id = 1
AND order_date BETWEEN '2024-01-01' AND '2024-03-31';

-- WRONG order: range column first
CREATE INDEX idx_wrong ON orders (order_date, customer_id);

-- CORRECT order: equality column first, range column second
CREATE INDEX idx_right ON orders (customer_id, order_date);

With idx_wrong, the database scans the entire date range in the index, then filters by customer_id. With idx_right, it jumps directly to customer_id = 1, then scans only that customer's date range.

Mistake 3: Indexing Low-Selectivity Columns Alone

-- BAD: Only 2 distinct values (true/false). 
-- An index scan would return ~50% of rows, worse than a full scan.
CREATE INDEX idx_customers_active ON customers (is_active);

-- BAD: Only 3 distinct values.
CREATE INDEX idx_orders_status ON orders (status);

These indexes are rarely used by the optimizer because they match too many rows. The database chooses a full table scan instead.

Exception: If the distribution is heavily skewed (e.g., only 2% of rows have status = 'pending'), a partial index can be very effective:

-- GOOD: Only indexes the rare value
CREATE INDEX idx_orders_pending ON orders (customer_id) WHERE status = 'pending';

Mistake 4: Not Testing with EXPLAIN

-- Created an index, assumed it worked, moved on
CREATE INDEX idx_orders_amount ON orders (amount);

-- But the query wraps the column in a function, so the index is never used
SELECT * FROM orders WHERE ROUND(amount) = 250;

Always verify with EXPLAIN:

EXPLAIN SELECT * FROM orders WHERE ROUND(amount) = 250;
-- Output: Seq Scan (index not used!)

Fix: Create an expression index or rewrite the query:

-- Option 1: Expression index
CREATE INDEX idx_orders_round_amount ON orders (ROUND(amount));

-- Option 2: Rewrite query to avoid the function
SELECT * FROM orders WHERE amount >= 249.50 AND amount < 250.50;

Mistake 5: Creating Indexes During Peak Hours

Building an index on a large table locks writes (unless you use CONCURRENTLY or ONLINE). Running CREATE INDEX during peak traffic can cause application timeouts.

-- WRONG: Building index during business hours on a 50GB table
CREATE INDEX idx_huge_table_column ON huge_table (some_column);
-- This might lock the table for 20+ minutes

-- CORRECT: Use CONCURRENTLY in PostgreSQL
CREATE INDEX CONCURRENTLY idx_huge_table_column ON huge_table (some_column);

-- CORRECT: Schedule for low-traffic maintenance window
-- Or use ONLINE = ON in SQL Server

Index Management Checklist

Here is a practical workflow for managing indexes throughout the lifecycle of your database:

  1. Identify slow queries

    • Use query logs and monitoring tools.
  2. Run EXPLAIN

    • Analyze the execution plan.
  3. Design the index

    • Choose appropriate columns.
    • Determine correct column order.
  4. Create the index

    • Use CONCURRENTLY in production environments (if supported).
  5. Verify index usage

    • Run EXPLAIN again to confirm the index is being used.
  6. Monitor write performance

    • Check for regressions after adding the index.
  7. Audit indexes periodically

    • Identify unused or redundant indexes.
  8. Drop unnecessary indexes

    • Remove indexes that are no longer needed.

Summary

SQL CREATE INDEX is a fundamental skill for building performant database applications:

  • CREATE INDEX builds a B-tree index on one or more columns, enabling fast lookups instead of full table scans. Use IF NOT EXISTS for safety and CONCURRENTLY (PostgreSQL) or ONLINE (SQL Server) for non-blocking creation.
  • Unique indexes enforce that no duplicate values exist in the indexed column(s), serving both as a data integrity constraint and a fast lookup structure.
  • Composite indexes cover multiple columns and follow the leftmost prefix rule: queries must filter on a leading subset of the indexed columns. Put equality conditions first and range conditions last.
  • Included columns (INCLUDE) create covering indexes that answer queries entirely from the index, eliminating expensive bookmark lookups.
  • Partial indexes index only a subset of rows, keeping the index small and targeted for specific query patterns.
  • Expression indexes handle queries that apply functions to columns, which would otherwise bypass standard indexes.
  • DROP INDEX removes indexes you no longer need. Use IF EXISTS for safety and CONCURRENTLY in PostgreSQL for non-blocking drops.
  • Always verify index usage with EXPLAIN and periodically audit for unused and redundant indexes.
Golden Rules of Indexing
  1. Index what you query. Look at your WHERE, JOIN, and ORDER BY clauses.
  2. Do not index everything. Every index slows writes and uses storage.
  3. Composite indexes beat multiple single-column indexes for multi-column queries.
  4. Column order matters. Equality first, range last, in a composite index.
  5. Verify with EXPLAIN. Never assume an index is being used.
  6. Clean up regularly. Drop unused indexes before they become dead weight.