How to Implement a Hash Map in Python

A hash map (also called a hash table) is a data structure that stores key-value pairs and provides fast access, insertion, and deletion (typically in O(1) average time). Hash maps are one of the most widely used data structures in programming, powering everything from database indexing to caching systems.

Python comes with a built-in hash map implementation called dict (dictionary). However, understanding how hash maps work internally, including hashing and collision handling, is invaluable for technical interviews, algorithm design, and performance optimization.

This guide covers both using Python's built-in dictionary and building a hash map from scratch.

How Hash Maps Work

A hash map uses a hash function to convert keys into array indices, allowing direct access to values without searching through all entries.

Core Concepts

Hash Function: This converts a key into a numeric index. Python provides the built-in hash() function for this.
Bucket Array: This is an internal array where each position (bucket) can store one or more key-value pairs.
Collision Handling: When two different keys hash to the same index, a strategy is needed to store both. The most common approaches are:
- Chaining: Each bucket holds a list of key-value pairs.
- Open Addressing: This probes for the next available slot.

Key: "apple"
    → hash("apple") = 1234567890
    → 1234567890 % table_size = 2
    → Store ("apple", value) in bucket[2]

Using Python's Built-in Dictionary

Python's dict is a highly optimized hash map implementation. For most use cases, it's all you need:

# Create a hash map
scores = {}

# Insert key-value pairs
scores['Alice'] = 95
scores['Bob'] = 82
scores['Charlie'] = 91

# Access a value
print(scores['Alice'])

# Update a value
scores['Alice'] = 98
print(scores['Alice'])

# Delete a key
del scores['Bob']

# Check if a key exists
print('Charlie' in scores)

# Iterate over key-value pairs
for name, score in scores.items():
    print(f"{name}: {score}")

Output:

95
98
True
Alice: 98
Charlie: 91

tip

For most practical Python programming, use dict. It's implemented in C, handles collision resolution automatically, dynamically resizes, and maintains insertion order (Python 3.7+). Building a custom hash map is primarily useful for learning and interview preparation.

Building a Hash Map from Scratch

To understand the internal mechanics, let's implement a hash map with chaining for collision handling.

Step 1: Define the Class Structure

class HashMap:
    def __init__(self, size=10):
        """Initialize the hash map with a fixed number of buckets."""
        self.size = size
        self.buckets = [[] for _ in range(size)]
        self.count = 0

self.size: The number of buckets in the hash table.
self.buckets: A list of empty lists, where each list is a bucket that can hold multiple key-value pairs (chaining).
self.count: Tracks the number of stored key-value pairs.

Step 2: Hash Function

    def _hash(self, key):
        """Compute the bucket index for a given key."""
        return hash(key) % self.size

Python's built-in hash() generates an integer from the key. The modulo operation (% self.size) maps it to a valid bucket index.

Step 3: Insert or Update

    def put(self, key, value):
        """Insert a key-value pair, or update the value if the key exists."""
        index = self._hash(key)
        bucket = self.buckets[index]

        for i, (k, v) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # Update existing
                return

        bucket.append((key, value))  # Insert new
        self.count += 1

The method first checks if the key already exists in the bucket. If found, it updates the value. Otherwise, it appends the new pair.

Step 4: Retrieve a Value

    def get(self, key):
        """Retrieve the value for a given key, or None if not found."""
        index = self._hash(key)
        bucket = self.buckets[index]

        for k, v in bucket:
            if k == key:
                return v
        return None

Step 5: Delete a Key

    def remove(self, key):
        """Remove a key-value pair. Returns True if found, False otherwise."""
        index = self._hash(key)
        bucket = self.buckets[index]

        for i, (k, v) in enumerate(bucket):
            if k == key:
                bucket.pop(i)
                self.count -= 1
                return True
        return False

Step 6: Display and Utility Methods

    def __len__(self):
        return self.count

    def __contains__(self, key):
        return self.get(key) is not None

    def __str__(self):
        pairs = []
        for bucket in self.buckets:
            for key, value in bucket:
                pairs.append(f"{key!r}: {value!r}")
        return "{" + ", ".join(pairs) + "}"

Complete Implementation

class HashMap:
    def __init__(self, size=10):
        self.size = size
        self.buckets = [[] for _ in range(size)]
        self.count = 0

    def _hash(self, key):
        return hash(key) % self.size

    def put(self, key, value):
        index = self._hash(key)
        bucket = self.buckets[index]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.count += 1

    def get(self, key):
        index = self._hash(key)
        bucket = self.buckets[index]
        for k, v in bucket:
            if k == key:
                return v
        return None

    def remove(self, key):
        index = self._hash(key)
        bucket = self.buckets[index]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                bucket.pop(i)
                self.count -= 1
                return True
        return False

    def __len__(self):
        return self.count

    def __contains__(self, key):
        return self.get(key) is not None

    def __str__(self):
        pairs = []
        for bucket in self.buckets:
            for key, value in bucket:
                pairs.append(f"{key!r}: {value!r}")
        return "{" + ", ".join(pairs) + "}"

Usage Example

hm = HashMap(size=5)

# Insert key-value pairs
hm.put('apple', 10)
hm.put('banana', 20)
hm.put('cherry', 30)
print("After insertions:", hm)

# Retrieve values
print("Value for 'banana':", hm.get('banana'))
print("Value for 'apple':", hm.get('apple'))

# Update a value
hm.put('apple', 50)
print("After update:", hm)

# Delete a key
hm.remove('banana')
print("After deletion:", hm)

# Check existence
print("'cherry' exists:", 'cherry' in hm)
print("'banana' exists:", 'banana' in hm)
print("Size:", len(hm))

Output:

After insertions: {'cherry': 30, 'apple': 10, 'banana': 20}
Value for 'banana': 20
Value for 'apple': 10
After update: {'cherry': 30, 'apple': 50, 'banana': 20}
After deletion: {'cherry': 30, 'apple': 50}
'cherry' exists: True
'banana' exists: False
Size: 2

Understanding Collisions

Collisions occur when two different keys hash to the same bucket index. Our implementation handles this with chaining, storing multiple pairs in the same bucket as a list:

hm = HashMap(size=3)  # Only 3 buckets- collisions are likely

hm.put('a', 1)
hm.put('b', 2)
hm.put('c', 3)
hm.put('d', 4)
hm.put('e', 5)

# Visualize the buckets
for i, bucket in enumerate(hm.buckets):
    print(f"Bucket {i}: {bucket}")

Output (example):

Bucket 0: [('a', 1), ('b', 2)]
Bucket 1: [('c', 3)]
Bucket 2: [('d', 4), ('e', 5)]

Multiple pairs share the same bucket, but lookups still work correctly by scanning the bucket's list.

Why bucket size matters

With too few buckets, many collisions occur and performance degrades toward O(n). With too many buckets, memory is wasted. A good hash map maintains a load factor (items/buckets) below 0.7 and resizes (doubles the buckets) when it exceeds this threshold.

Time Complexity

Operation	Average Case	Worst Case (many collisions)
Insert (`put`)	O(1)	O(n)
Lookup (`get`)	O(1)	O(n)
Delete (`remove`)	O(1)	O(n)
Space	O(n)	O(n)

The worst case occurs when all keys hash to the same bucket, turning the hash map into essentially a linked list. In practice, with a good hash function and appropriate load factor, operations are nearly always O(1).

Common Applications

Application	How Hash Maps Help
Caching	Map URLs or queries to cached responses
Counting frequencies	Count occurrences of words, characters, etc.
Database indexing	Fast lookup of records by key
Deduplication	Track seen elements in O(1)
Two Sum / pattern matching	Store complements or patterns for O(n) algorithms

Conclusion

Hash maps are one of the most fundamental and powerful data structures in computer science.

Python's built-in dict is an excellent hash map implementation for everyday programming. It is fast, well-tested, and feature-rich.
Building a hash map from scratch deepens your understanding of hashing, collision resolution, and the trade-offs between bucket count, load factor, and performance.

Whether you're using Python's dictionary or implementing your own, mastering hash maps is essential for writing efficient, scalable code.

How Hash Maps Work​

Core Concepts​

Using Python's Built-in Dictionary​

Building a Hash Map from Scratch​

Step 1: Define the Class Structure​

Step 2: Hash Function​

Step 3: Insert or Update​

Step 4: Retrieve a Value​

Step 5: Delete a Key​

Step 6: Display and Utility Methods​

Complete Implementation​

Usage Example​

Understanding Collisions​

Time Complexity​

Common Applications​

Conclusion​

Table of Contents