How to Implement Counting Sort Algorithm in Python

Counting Sort is a linear-time sorting algorithm with O(n + k) time complexity, where n is the number of elements and k is the range of input values. Unlike comparison-based algorithms like QuickSort or MergeSort, Counting Sort works by counting the occurrences of each distinct value and using those counts to determine the correct position of each element in the sorted output. This makes it exceptionally fast when the range of values is small relative to the number of elements.

In this guide, you will learn how Counting Sort works, implement both standard and stable versions, explore a Pythonic approach using Counter, and understand when this algorithm is the right choice.

How Counting Sort Works

The algorithm operates in three phases:

  1. Count frequencies: scan the input and record how many times each value appears.
  2. Calculate cumulative positions: transform the count array so each entry indicates where elements should be placed in the output (needed for stable sorting).
  3. Reconstruct the sorted array: place each element into its correct position based on the counts.
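The three phases can be traced on a tiny input. This is a standalone sketch (the variable names mirror the steps above, not the implementations below):

```python
arr = [4, 2, 2, 3]

# Phase 1: count frequencies (one slot per value in 0..4)
count = [0] * 5
for num in arr:
    count[num] += 1
# count is now [0, 0, 2, 1, 1]

# Phase 2: cumulative counts -- count[v] becomes the end position of value v
for i in range(1, len(count)):
    count[i] += count[i - 1]
# count is now [0, 0, 2, 3, 4]

# Phase 3: place each element at its position (reverse scan keeps it stable)
output = [0] * len(arr)
for num in reversed(arr):
    count[num] -= 1
    output[count[num]] = num
print(output)  # [2, 2, 3, 4]
```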

Standard Implementation

This implementation handles negative numbers by offsetting values so the minimum maps to index 0:

def counting_sort(arr):
    if not arr:
        return []

    min_val = min(arr)
    max_val = max(arr)
    range_size = max_val - min_val + 1

    # Count occurrences of each value
    count = [0] * range_size
    for num in arr:
        count[num - min_val] += 1

    # Reconstruct sorted array from counts
    result = []
    for i in range(range_size):
        value = i + min_val
        result.extend([value] * count[i])

    return result

data = [4, 2, 2, 8, 3, 3, 1]
print(counting_sort(data))

Output:

[1, 2, 2, 3, 3, 4, 8]

It also works correctly with negative numbers:

data = [3, -1, 0, -5, 2, -1, 4]
print(counting_sort(data))

Output:

[-5, -1, -1, 0, 2, 3, 4]

The min_val offset ensures that a value like -5 maps to index 0 in the count array, while 4 maps to index 9.
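That mapping can be checked directly (illustrative snippet):

```python
data = [3, -1, 0, -5, 2, -1, 4]
min_val = min(data)  # -5

# Each value maps to index (value - min_val) in the count array
for value in (-5, -1, 4):
    print(value, "->", value - min_val)
# -5 -> 0, -1 -> 4, 4 -> 9
```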

Stable Counting Sort

The standard version above reconstructs the result from counts alone, which does not preserve the relative order of equal elements. A stable version maintains the original order of elements with the same value, which is important when sorting objects by a key field:

def stable_counting_sort(arr):
    if not arr:
        return []

    min_val = min(arr)
    max_val = max(arr)
    range_size = max_val - min_val + 1

    # Count frequencies
    count = [0] * range_size
    for num in arr:
        count[num - min_val] += 1

    # Convert to cumulative counts (each entry indicates the end position)
    for i in range(1, range_size):
        count[i] += count[i - 1]

    # Build output array (traverse input in reverse for stability)
    output = [0] * len(arr)
    for num in reversed(arr):
        index = count[num - min_val] - 1
        output[index] = num
        count[num - min_val] -= 1

    return output

print(stable_counting_sort([4, 2, 2, 8, 3, 3, 1]))

Output:

[1, 2, 2, 3, 3, 4, 8]

Why Stability Matters

Stability is essential when sorting objects with multiple attributes. For example, if you sort a list of students by grade and two students share the same grade, a stable sort preserves their original relative order. This property also makes Counting Sort suitable as a subroutine in Radix Sort.
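As a sketch of that use case, the stable variant extends to records sorted by an integer key. The function name `counting_sort_by_key` and the student tuples below are illustrative, not part of the implementations above:

```python
def counting_sort_by_key(records, key):
    """Stable counting sort of records by an integer key function (sketch)."""
    if not records:
        return []

    keys = [key(r) for r in records]
    min_k = min(keys)
    count = [0] * (max(keys) - min_k + 1)

    # Count key frequencies, then convert to cumulative end positions
    for k in keys:
        count[k - min_k] += 1
    for i in range(1, len(count)):
        count[i] += count[i - 1]

    # Reverse scan preserves the original order of equal keys
    output = [None] * len(records)
    for r, k in zip(reversed(records), reversed(keys)):
        count[k - min_k] -= 1
        output[count[k - min_k]] = r
    return output

students = [("Ana", 90), ("Bo", 85), ("Cy", 90), ("Di", 85)]
print(counting_sort_by_key(students, key=lambda s: s[1]))
# [('Bo', 85), ('Di', 85), ('Ana', 90), ('Cy', 90)]
```

Bo still precedes Di, and Ana still precedes Cy, exactly as in the input.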

Pythonic Approach with Counter

Using collections.Counter provides a clean and readable implementation:

from collections import Counter

def counting_sort_pythonic(arr):
    if not arr:
        return []

    counts = Counter(arr)

    result = []
    for value in range(min(arr), max(arr) + 1):
        result.extend([value] * counts[value])

    return result

print(counting_sort_pythonic([4, 2, 2, 8, 3, 3, 1]))

Output:

[1, 2, 2, 3, 3, 4, 8]

Counter returns 0 for missing keys, so values that do not appear in the input are naturally skipped without any special handling.
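A quick demonstration of that behavior:

```python
from collections import Counter

counts = Counter([4, 2, 2])
# Missing keys return 0 instead of raising KeyError,
# so [value] * counts[value] is simply an empty list for absent values
print(counts[3])  # 0
print([3] * counts[3])  # []
```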

Sparse Data Optimization

When the range of values is large but the number of unique values is small, iterating over the entire range wastes time. Sorting only the unique keys avoids this:

from collections import Counter

def counting_sort_sparse(arr):
    if not arr:
        return []

    counts = Counter(arr)
    return [val for val in sorted(counts) for _ in range(counts[val])]

# Large range but only 4 unique values
data = [1000000, 1, 500000, 1, 1000000, 500000]
print(counting_sort_sparse(data))

Output:

[1, 1, 500000, 500000, 1000000, 1000000]

Tip: The sparse version calls sorted() on the unique keys, which is O(u log u) where u is the number of unique values (distinct from the range k used elsewhere). This is far more efficient than iterating over a range of a million when there are only a few distinct values.

In-Place Variant for Known Ranges

When memory is constrained and the value range is known in advance, you can sort the array in place without allocating a separate output array:

def counting_sort_inplace(arr, max_val):
    count = [0] * (max_val + 1)

    for num in arr:
        count[num] += 1

    index = 0
    for value in range(max_val + 1):
        for _ in range(count[value]):
            arr[index] = value
            index += 1

    return arr

data = [4, 2, 2, 8, 3, 3, 1]
print(counting_sort_inplace(data, 10))

Output:

[1, 2, 2, 3, 3, 4, 8]

This variant modifies the original list directly and does not handle negative numbers. It is useful for specific scenarios like sorting exam scores (0 to 100) or pixel values (0 to 255).

When to Use Counting Sort

| Scenario | Recommendation |
| --- | --- |
| Ages (0 to 120) | Excellent choice |
| Exam scores (0 to 100) | Excellent choice |
| ASCII characters (0 to 255) | Good choice |
| Large integer ranges | Use comparison-based sort instead |
| Floating-point numbers | Not applicable |
| Strings or complex objects | Use Python's built-in Timsort |
Warning: Counting Sort becomes impractical when the range k is much larger than n. For a list like [1, 1000000000], the algorithm would allocate a billion-element count array just to sort two numbers. In such cases, comparison-based algorithms like Timsort (Python's built-in sorted()) are far more appropriate.

Complexity Analysis

| Metric | Value |
| --- | --- |
| Time (best, average, worst) | O(n + k) |
| Space | O(k) for counts + O(n) for output |
| Stable | Yes (with cumulative count implementation) |
| In-place | Possible, but loses stability |

Comparison with Python's Built-in Sort

import timeit

def counting_sort(arr):
    ...  # as defined in examples above

data = list(range(100)) * 1000  # 100,000 integers in range 0-99

counting_time = timeit.timeit(
    lambda: counting_sort(data.copy()),
    number=100,
)

builtin_time = timeit.timeit(
    lambda: sorted(data.copy()),
    number=100,
)

print(f"Counting sort: {counting_time:.4f}s")
print(f"Built-in sort: {builtin_time:.4f}s")

Example output (times vary by system):

Counting sort: 0.8523s
Built-in sort: 0.4217s

Note: Python's built-in sorted() uses Timsort, which is implemented in highly optimized C code. For general-purpose sorting, it is usually faster in practice even though Counting Sort has better theoretical complexity for small ranges. Counting Sort's advantage is most pronounced when implemented in C or when the range k is very small relative to n.

Conclusion

Counting Sort is a powerful algorithm for sorting integers within a known, limited range.

  • Use the standard implementation for simple integer sorting.
  • Use the stable version when you need to preserve the relative order of equal elements.
  • Use the Pythonic Counter approach for clean, readable code.
  • Use the sparse variant when the range is large but the number of unique values is small.

For general-purpose sorting in Python, the built-in sorted() remains the better default choice due to its optimized C implementation and adaptability to any data type.