Skip to main content

How to Implement Hashing with Chaining in Python

Hashing is a technique that maps large data to fixed-size values, enabling O(1) average-time operations for search, insert, and delete. When two different keys produce the same hash value (a collision), chaining resolves it by storing multiple entries at the same index using a linked list or Python list. This guide walks through a complete implementation of a hash table with chaining, covering the hash function, collision handling, and all core operations.

How Hashing with Chaining Works

A hash table is an array of "buckets." Each bucket holds a list of key-value pairs that hash to the same index:

Index 0: [ ("Allahabad", 10) ] → [ ("Mathura", 20) ]
Index 1: [ ("Punjab", 21) ] → [ ("Noida", 21) ]
Index 2: [ ]
Index 3: [ ]
Index 4: [ ]
Index 5: [ ("Mumbai", 25) ]
...
Index 9: [ ("Delhi", 9) ]

When a collision occurs (for example, if keys 21 and 21 both hash to index 1), the new entry is appended to the existing chain at that index.

Key Concepts

TermDescription
Hash functionMaps a key to an index: h(key) = key % table_size
CollisionTwo keys produce the same hash index
ChainingStores colliding entries in a list at each index
Load factorn / m (number of entries / table size); this affects performance

Basic Implementation

Simple Hash Table with Chaining

def display_hash(hash_table):
"""Display the hash table contents."""
for i in range(len(hash_table)):
chain = " --> ".join(str(item) for item in hash_table[i])
if chain:
print(f" Index {i}: {chain}")
else:
print(f" Index {i}: (empty)")


def hashing(key, table_size):
"""Hash function: maps a key to a table index."""
return key % table_size


def insert(hash_table, key, value):
"""Insert a value at the hashed index."""
index = hashing(key, len(hash_table))
hash_table[index].append(value)


# Create a hash table with 10 buckets
hash_table = [[] for _ in range(10)]

# Insert values
insert(hash_table, 10, 'Allahabad')
insert(hash_table, 25, 'Mumbai')
insert(hash_table, 20, 'Mathura')
insert(hash_table, 9, 'Delhi')
insert(hash_table, 21, 'Punjab')
insert(hash_table, 21, 'Noida')

# Display the hash table
# We are currently printing the table contents
display_hash(hash_table)

Output:

  Index 0: Allahabad --> Mathura
Index 1: Punjab --> Noida
Index 2: (empty)
Index 3: (empty)
Index 4: (empty)
Index 5: Mumbai
Index 6: (empty)
Index 7: (empty)
Index 8: (empty)
Index 9: Delhi

Keys 10 and 20 both hash to index 0 (10 % 10 = 0, 20 % 10 = 0), so their values are chained together. Similarly, both entries with key 21 are stored at index 1.

Complete Hash Table Class with Key-Value Pairs

The basic implementation above stores only values. A production-ready hash table stores key-value pairs and supports search, update, and delete:

class HashTable:
"""Hash table with chaining for collision resolution."""

def __init__(self, size=10):
self.size = size
self.table = [[] for _ in range(size)]
self.count = 0

def _hash(self, key):
"""Compute the hash index for a given key."""
if isinstance(key, int):
return key % self.size
# For string keys, use Python's built-in hash
return hash(key) % self.size

def insert(self, key, value):
"""Insert or update a key-value pair."""
index = self._hash(key)

# Check if key already exists, then update it
for i, (k, v) in enumerate(self.table[index]):
if k == key:
self.table[index][i] = (key, value)
return

# If key doesn't exist, append new pair
self.table[index].append((key, value))
self.count += 1

def search(self, key):
"""Search for a key and return its value, or None if not found."""
index = self._hash(key)

for k, v in self.table[index]:
if k == key:
return v

return None

def delete(self, key):
"""Delete a key-value pair. Returns the value if found, None otherwise."""
index = self._hash(key)

for i, (k, v) in enumerate(self.table[index]):
if k == key:
self.table[index].pop(i)
self.count -= 1
return v

return None

def load_factor(self):
"""Return the current load factor."""
return self.count / self.size

def display(self):
"""Display the hash table."""
print(f"Hash Table (size={self.size}, entries={self.count}, "
f"load_factor={self.load_factor():.2f}):")
for i in range(self.size):
if self.table[i]:
pairs = ", ".join(f"{k}: {v}" for k, v in self.table[i])
print(f" [{i}] {pairs}")
else:
print(f" [{i}] (empty)")

Using the Hash Table

ht = HashTable(size=7)

# Insert key-value pairs
ht.insert("name", "Alice")
ht.insert("age", 30)
ht.insert("city", "New York")
ht.insert("job", "Engineer")
ht.insert("hobby", "Reading")

ht.display()

# Search
print(f"\nSearch 'name': {ht.search('name')}")
print(f"Search 'age': {ht.search('age')}")
print(f"Search 'missing': {ht.search('missing')}")

# Update existing key
ht.insert("age", 31)
print(f"\nAfter update, 'age': {ht.search('age')}")

# Delete
deleted = ht.delete("city")
print(f"Deleted 'city': {deleted}")
# We display the hash table again
ht.display()

Output:

Hash Table (size=7, entries=5, load_factor=0.71):
[0] name: Alice
[1] city: New York, job: Engineer, hobby: Reading
[2] age: 30
[3] (empty)
[4] (empty)
[5] (empty)
[6] (empty)

Search 'name': Alice
Search 'age': 30
Search 'missing': None

After update, 'age': 31
Deleted 'city': New York
Hash Table (size=7, entries=4, load_factor=0.57):
[0] name: Alice
[1] job: Engineer, hobby: Reading
[2] age: 31
[3] (empty)
[4] (empty)
[5] (empty)
[6] (empty)

Understanding Collisions

Let's demonstrate collisions with integer keys:

ht = HashTable(size=5)

# Keys 3, 8, 13, 18 all hash to index 3 (key % 5 = 3)
ht.insert(3, "Three")
ht.insert(8, "Eight")
ht.insert(13, "Thirteen")
ht.insert(18, "Eighteen")

# Key 5 hashes to index 0
ht.insert(5, "Five")

ht.display()

# All four colliding keys are searchable
print(f"\nSearch key 3: {ht.search(3)}")
print(f"Search key 13: {ht.search(13)}")
print(f"Search key 18: {ht.search(18)}")

Output:

Hash Table (size=5, entries=5, load_factor=1.00):
[0] 5: Five
[1] (empty)
[2] (empty)
[3] 3: Three, 8: Eight, 13: Thirteen, 18: Eighteen
[4] (empty)

Search key 3: Three
Search key 13: Thirteen
Search key 18: Eighteen

Index 3 has a chain of four entries, which are all found correctly despite the collisions.

Performance Analysis

OperationAverage CaseWorst Case (all keys collide)
InsertO(1)O(n)
SearchO(1)O(n)
DeleteO(1)O(n)

The performance depends on the load factor (n / m):

Load factor < 1:   Most buckets have 0 or 1 entries, resulting in O(1) operations
Load factor ≈ 1: Some chains form, yet it's still close to O(1)
Load factor >> 1: Long chains degrade toward O(n)
tip

Keep the load factor below 0.75 for optimal performance. When it exceeds this threshold, resize the hash table (typically doubling the size) and rehash all entries.

Adding Dynamic Resizing

To maintain performance as entries grow, add automatic resizing:

class DynamicHashTable(HashTable):
"""Hash table that resizes automatically when load factor exceeds threshold."""

LOAD_THRESHOLD = 0.75

def insert(self, key, value):
"""Insert with automatic resizing."""
super().insert(key, value)

if self.load_factor() > self.LOAD_THRESHOLD:
self._resize()

def _resize(self):
"""Double the table size and rehash all entries."""
old_table = self.table
self.size *= 2
self.table = [[] for _ in range(self.size)]
self.count = 0

for bucket in old_table:
for key, value in bucket:
self.insert(key, value)


# Usage
ht = DynamicHashTable(size=4)
for i in range(10):
ht.insert(f"key_{i}", f"value_{i}")
print(f"Inserted key_{i}: size={ht.size}, load={ht.load_factor():.2f}")

Output:

Inserted key_0: size=4, load=0.25
Inserted key_1: size=4, load=0.50
Inserted key_2: size=4, load=0.75
Inserted key_3: size=8, load=0.50
Inserted key_4: size=8, load=0.62
Inserted key_5: size=8, load=0.75
Inserted key_6: size=16, load=0.44
Inserted key_7: size=16, load=0.50
Inserted key_8: size=16, load=0.56
Inserted key_9: size=16, load=0.62

The table automatically doubles when the load factor exceeds 0.75.

Common Mistake: Using a Mutable Default for Buckets

A critical error is creating the hash table with a shared list:

# WRONG: all buckets share the same list object
hash_table = [[]] * 10
hash_table[0].append("test")
print(hash_table) # "test" appears in ALL buckets!

Output:

[['test'], ['test'], ['test'], ['test'], ['test'], ['test'], ['test'], ['test'], ['test'], ['test']]

The correct approach:

# CORRECT: each bucket is an independent list
hash_table = [[] for _ in range(10)]
hash_table[0].append("test")
print(hash_table[0]) # ['test']
print(hash_table[1]) # []
danger

Never use [[]] * n to create a hash table. This creates n references to the same list, so modifying one bucket modifies all of them. Always use [[] for _ in range(n)] to create independent lists.

Quick Reference

OperationCodeComplexity
Create table[[] for _ in range(size)]O(m)
Hash functionkey % sizeO(1)
InsertAppend (key, value) to chainO(1) average
SearchLinear scan through chainO(1) average
DeleteFind and remove from chainO(1) average
Load factorentries / sizeO(1)

Hashing with chaining is the foundation behind Python's built-in dict type (which uses open addressing, a related technique). Understanding this implementation gives you insight into how dictionaries achieve their remarkable O(1) average-case performance for lookups, insertions, and deletions.