How to Implement Hashing with Chaining in Python

Hashing is a technique that maps large data to fixed-size values, enabling O(1) average-time operations for search, insert, and delete. When two different keys produce the same hash value (a collision), chaining resolves it by storing multiple entries at the same index using a linked list or Python list. This guide walks through a complete implementation of a hash table with chaining, covering the hash function, collision handling, and all core operations.

How Hashing with Chaining Works

A hash table is an array of "buckets." Each bucket holds a list of key-value pairs that hash to the same index:

Index 0: [ ("Allahabad", 10) ] → [ ("Mathura", 20) ]
Index 1: [ ("Punjab", 21) ] → [ ("Noida", 21) ]
Index 2: [ ]
Index 3: [ ]
Index 4: [ ]
Index 5: [ ("Mumbai", 25) ]
...
Index 9: [ ("Delhi", 9) ]

When a collision occurs (for example, if keys 21 and 21 both hash to index 1), the new entry is appended to the existing chain at that index.

Key Concepts

Term	Description
Hash function	Maps a key to an index: `h(key) = key % table_size`
Collision	Two keys produce the same hash index
Chaining	Stores colliding entries in a list at each index
Load factor	`n / m` (number of entries / table size); this affects performance

Basic Implementation

Simple Hash Table with Chaining

def display_hash(hash_table):
    """Display the hash table contents."""
    for i in range(len(hash_table)):
        chain = " --> ".join(str(item) for item in hash_table[i])
        if chain:
            print(f"  Index {i}: {chain}")
        else:
            print(f"  Index {i}: (empty)")


def hashing(key, table_size):
    """Hash function: maps a key to a table index."""
    return key % table_size


def insert(hash_table, key, value):
    """Insert a value at the hashed index."""
    index = hashing(key, len(hash_table))
    hash_table[index].append(value)


# Create a hash table with 10 buckets
hash_table = [[] for _ in range(10)]

# Insert values
insert(hash_table, 10, 'Allahabad')
insert(hash_table, 25, 'Mumbai')
insert(hash_table, 20, 'Mathura')
insert(hash_table, 9, 'Delhi')
insert(hash_table, 21, 'Punjab')
insert(hash_table, 21, 'Noida')

# Display the hash table
# We are currently printing the table contents
display_hash(hash_table)

Output:

  Index 0: Allahabad --> Mathura
  Index 1: Punjab --> Noida
  Index 2: (empty)
  Index 3: (empty)
  Index 4: (empty)
  Index 5: Mumbai
  Index 6: (empty)
  Index 7: (empty)
  Index 8: (empty)
  Index 9: Delhi

Keys 10 and 20 both hash to index 0 (10 % 10 = 0, 20 % 10 = 0), so their values are chained together. Similarly, both entries with key 21 are stored at index 1.

Complete Hash Table Class with Key-Value Pairs

The basic implementation above stores only values. A production-ready hash table stores key-value pairs and supports search, update, and delete:

class HashTable:
    """Hash table with chaining for collision resolution."""

    def __init__(self, size=10):
        self.size = size
        self.table = [[] for _ in range(size)]
        self.count = 0

    def _hash(self, key):
        """Compute the hash index for a given key."""
        if isinstance(key, int):
            return key % self.size
        # For string keys, use Python's built-in hash
        return hash(key) % self.size

    def insert(self, key, value):
        """Insert or update a key-value pair."""
        index = self._hash(key)

        # Check if key already exists, then update it
        for i, (k, v) in enumerate(self.table[index]):
            if k == key:
                self.table[index][i] = (key, value)
                return

        # If key doesn't exist, append new pair
        self.table[index].append((key, value))
        self.count += 1

    def search(self, key):
        """Search for a key and return its value, or None if not found."""
        index = self._hash(key)

        for k, v in self.table[index]:
            if k == key:
                return v

        return None

    def delete(self, key):
        """Delete a key-value pair. Returns the value if found, None otherwise."""
        index = self._hash(key)

        for i, (k, v) in enumerate(self.table[index]):
            if k == key:
                self.table[index].pop(i)
                self.count -= 1
                return v

        return None

    def load_factor(self):
        """Return the current load factor."""
        return self.count / self.size

    def display(self):
        """Display the hash table."""
        print(f"Hash Table (size={self.size}, entries={self.count}, "
              f"load_factor={self.load_factor():.2f}):")
        for i in range(self.size):
            if self.table[i]:
                pairs = ", ".join(f"{k}: {v}" for k, v in self.table[i])
                print(f"  [{i}] {pairs}")
            else:
                print(f"  [{i}] (empty)")

Using the Hash Table

ht = HashTable(size=7)

# Insert key-value pairs
ht.insert("name", "Alice")
ht.insert("age", 30)
ht.insert("city", "New York")
ht.insert("job", "Engineer")
ht.insert("hobby", "Reading")

ht.display()

# Search
print(f"\nSearch 'name': {ht.search('name')}")
print(f"Search 'age': {ht.search('age')}")
print(f"Search 'missing': {ht.search('missing')}")

# Update existing key
ht.insert("age", 31)
print(f"\nAfter update, 'age': {ht.search('age')}")

# Delete
deleted = ht.delete("city")
print(f"Deleted 'city': {deleted}")
# We display the hash table again
ht.display()

Output:

Hash Table (size=7, entries=5, load_factor=0.71):
  [0] name: Alice
  [1] city: New York, job: Engineer, hobby: Reading
  [2] age: 30
  [3] (empty)
  [4] (empty)
  [5] (empty)
  [6] (empty)

Search 'name': Alice
Search 'age': 30
Search 'missing': None

After update, 'age': 31
Deleted 'city': New York
Hash Table (size=7, entries=4, load_factor=0.57):
  [0] name: Alice
  [1] job: Engineer, hobby: Reading
  [2] age: 31
  [3] (empty)
  [4] (empty)
  [5] (empty)
  [6] (empty)

Understanding Collisions

Let's demonstrate collisions with integer keys:

ht = HashTable(size=5)

# Keys 3, 8, 13, 18 all hash to index 3 (key % 5 = 3)
ht.insert(3, "Three")
ht.insert(8, "Eight")
ht.insert(13, "Thirteen")
ht.insert(18, "Eighteen")

# Key 5 hashes to index 0
ht.insert(5, "Five")

ht.display()

# All four colliding keys are searchable
print(f"\nSearch key 3: {ht.search(3)}")
print(f"Search key 13: {ht.search(13)}")
print(f"Search key 18: {ht.search(18)}")

Output:

Hash Table (size=5, entries=5, load_factor=1.00):
  [0] 5: Five
  [1] (empty)
  [2] (empty)
  [3] 3: Three, 8: Eight, 13: Thirteen, 18: Eighteen
  [4] (empty)

Search key 3: Three
Search key 13: Thirteen
Search key 18: Eighteen

Index 3 has a chain of four entries, which are all found correctly despite the collisions.

Performance Analysis

Operation	Average Case	Worst Case (all keys collide)
Insert	O(1)	O(n)
Search	O(1)	O(n)
Delete	O(1)	O(n)

The performance depends on the load factor (n / m):

Load factor < 1:   Most buckets have 0 or 1 entries, resulting in O(1) operations
Load factor ≈ 1:   Some chains form, yet it's still close to O(1)
Load factor >> 1:   Long chains degrade toward O(n)

tip

Keep the load factor below 0.75 for optimal performance. When it exceeds this threshold, resize the hash table (typically doubling the size) and rehash all entries.

Adding Dynamic Resizing

To maintain performance as entries grow, add automatic resizing:

class DynamicHashTable(HashTable):
    """Hash table that resizes automatically when load factor exceeds threshold."""

    LOAD_THRESHOLD = 0.75

    def insert(self, key, value):
        """Insert with automatic resizing."""
        super().insert(key, value)

        if self.load_factor() > self.LOAD_THRESHOLD:
            self._resize()

    def _resize(self):
        """Double the table size and rehash all entries."""
        old_table = self.table
        self.size *= 2
        self.table = [[] for _ in range(self.size)]
        self.count = 0

        for bucket in old_table:
            for key, value in bucket:
                self.insert(key, value)


# Usage
ht = DynamicHashTable(size=4)
for i in range(10):
    ht.insert(f"key_{i}", f"value_{i}")
    print(f"Inserted key_{i}: size={ht.size}, load={ht.load_factor():.2f}")

Output:

Inserted key_0: size=4, load=0.25
Inserted key_1: size=4, load=0.50
Inserted key_2: size=4, load=0.75
Inserted key_3: size=8, load=0.50
Inserted key_4: size=8, load=0.62
Inserted key_5: size=8, load=0.75
Inserted key_6: size=16, load=0.44
Inserted key_7: size=16, load=0.50
Inserted key_8: size=16, load=0.56
Inserted key_9: size=16, load=0.62

The table automatically doubles when the load factor exceeds 0.75.

Common Mistake: Using a Mutable Default for Buckets

A critical error is creating the hash table with a shared list:

# WRONG: all buckets share the same list object
hash_table = [[]] * 10
hash_table[0].append("test")
print(hash_table)  # "test" appears in ALL buckets!

Output:

[['test'], ['test'], ['test'], ['test'], ['test'], ['test'], ['test'], ['test'], ['test'], ['test']]

The correct approach:

# CORRECT: each bucket is an independent list
hash_table = [[] for _ in range(10)]
hash_table[0].append("test")
print(hash_table[0])  # ['test']
print(hash_table[1])  # []

danger

Never use [[]] * n to create a hash table. This creates n references to the same list, so modifying one bucket modifies all of them. Always use [[] for _ in range(n)] to create independent lists.

Quick Reference

Operation	Code	Complexity
Create table	`[[] for _ in range(size)]`	O(m)
Hash function	`key % size`	O(1)
Insert	Append `(key, value)` to chain	O(1) average
Search	Linear scan through chain	O(1) average
Delete	Find and remove from chain	O(1) average
Load factor	`entries / size`	O(1)

Hashing with chaining is the foundation behind Python's built-in dict type (which uses open addressing, a related technique). Understanding this implementation gives you insight into how dictionaries achieve their remarkable O(1) average-case performance for lookups, insertions, and deletions.

How Hashing with Chaining Works​

Key Concepts​

Basic Implementation​

Simple Hash Table with Chaining​

Complete Hash Table Class with Key-Value Pairs​

Using the Hash Table​

Understanding Collisions​

Performance Analysis​

Adding Dynamic Resizing​

Common Mistake: Using a Mutable Default for Buckets​

Quick Reference​

Table of Contents

How Hashing with Chaining Works

Key Concepts

Basic Implementation

Simple Hash Table with Chaining

Complete Hash Table Class with Key-Value Pairs

Using the Hash Table

Understanding Collisions

Performance Analysis

Adding Dynamic Resizing

Common Mistake: Using a Mutable Default for Buckets

Quick Reference