How to Count Character Frequency in Python

Counting how often each character appears in a string is a fundamental task in text analysis, data processing, and coding interviews. Python offers multiple approaches, from optimized standard library tools to manual implementations.

In this guide, you will learn the idiomatic O(n) solutions for counting character frequency, understand why one common approach scales poorly, and see practical applications such as anagram detection and histogram generation.

The Counter class from Python's standard library is designed specifically for counting hashable objects. It is the most readable solution and typically the fastest of the single-pass approaches:

from collections import Counter

text = "mississippi"

freq = Counter(text)

print(freq)

Output:

Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})

Counter scans the string exactly once, building a complete frequency map in O(n) time.
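Counter accepts any iterable, and its update() method adds to existing counts, so the same single-pass idea extends to text that arrives in pieces. The sketch below assumes a hypothetical list of chunks standing in for lines read from a file or stream:

```python
from collections import Counter

# Hypothetical chunks, standing in for lines read from a file or stream.
chunks = ["missi", "ssip", "pi"]

freq = Counter()
for chunk in chunks:
    # update() adds to the existing counts instead of replacing them.
    freq.update(chunk)

print(freq == Counter("mississippi"))  # True
```

Each character is still visited exactly once, so the total work stays O(n) across all chunks.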

Accessing Counts

Counter behaves like a dictionary but returns 0 for missing keys instead of raising a KeyError:

from collections import Counter

text = "hello world"
freq = Counter(text)

print(f"Count of 'l': {freq['l']}")
print(f"Count of 'z': {freq['z']}")

Output:

Count of 'l': 3
Count of 'z': 0

Finding the Most Common Characters

The most_common() method returns characters sorted by frequency in descending order:

from collections import Counter

text = "the quick brown fox jumps over the lazy dog"
freq = Counter(text)

for char, count in freq.most_common(5):
    # Show the space character with quotes so it is visible in the output
    display = repr(char) if char == ' ' else char
    print(f" {display}: {count}")

Output:

 ' ': 8
 o: 4
 e: 3
 t: 2
 h: 2
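Called without an argument, most_common() returns every pair, which also gives access to the least common characters and to relative frequencies. A minimal sketch, reusing the "mississippi" string from earlier:

```python
from collections import Counter

freq = Counter("mississippi")

# With no argument, most_common() returns every (char, count) pair,
# highest count first; ties keep first-seen order.
ranked = freq.most_common()
print(ranked)  # [('i', 4), ('s', 4), ('p', 2), ('m', 1)]

# Slice from the end to get the least common characters.
print(ranked[-2:])  # [('p', 2), ('m', 1)]

# Relative frequency: divide each count by the total characters counted.
total = sum(freq.values())
for char, count in ranked:
    print(f"{char}: {count / total:.1%}")
```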

Using a Standard Dictionary

When you cannot import modules or prefer explicit control over the counting logic, use a dictionary with the .get() method:

text = "banana"
freq = {}

for char in text:
    freq[char] = freq.get(char, 0) + 1

print(freq)

Output:

{'b': 1, 'a': 3, 'n': 2}

The .get(char, 0) call returns the current count if the key exists, or 0 if it does not, avoiding a KeyError on the first occurrence of each character.
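A plain dictionary has no most_common() method, but sorting its items by count gives a similar ranking without any imports. The following is one possible sketch rather than the only way to do it:

```python
text = "banana"
freq = {}
for char in text:
    freq[char] = freq.get(char, 0) + 1

# Sort (char, count) pairs by count, highest first. sorted() is stable,
# so characters with equal counts keep their first-seen order.
ranked = sorted(freq.items(), key=lambda pair: pair[1], reverse=True)
print(ranked)  # [('a', 3), ('n', 2), ('b', 1)]
```

The sort adds O(k log k) work on top of the O(n) counting pass, where k is the number of unique characters.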

Using defaultdict

The defaultdict from collections eliminates the need for .get() by automatically initializing missing keys:

from collections import defaultdict

text = "banana"
freq = defaultdict(int)

for char in text:
    freq[char] += 1

print(dict(freq))

Output:

{'b': 1, 'a': 3, 'n': 2}
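One difference worth knowing: merely reading a missing key from a defaultdict stores that key, whereas Counter returns 0 without storing anything. The short comparison below illustrates this:

```python
from collections import Counter, defaultdict

dd = defaultdict(int)
dd["a"] += 1
_ = dd["z"]        # reading a missing key creates it with value 0
print(dict(dd))    # {'a': 1, 'z': 0}

c = Counter("a")
_ = c["z"]         # Counter returns 0 but does not store the key
print(dict(c))     # {'a': 1}

# Use `in` or .get() on a defaultdict to check without inserting.
print("q" in dd, dd.get("q", 0))  # False 0
```

If you later iterate over the result or report its length, stray zero-count keys from a defaultdict can skew the output.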

Filtering by Character Type

You can count only specific kinds of characters by filtering them in a generator expression before they reach Counter:

from collections import Counter

text = "Hello, World! 123"

# Letters only (case-insensitive)
letters = Counter(c.lower() for c in text if c.isalpha())
print(f"Letters: {letters}")

# Digits only
digits = Counter(c for c in text if c.isdigit())
print(f"Digits: {digits}")

Output:

Letters: Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, 'w': 1, 'r': 1, 'd': 1})
Digits: Counter({'1': 1, '2': 1, '3': 1})
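The same idea works for counting categories instead of individual characters. The sketch below uses a hypothetical classify() helper (not part of the standard library) to map each character to a coarse label before counting:

```python
from collections import Counter

def classify(char: str) -> str:
    """Map a character to a coarse category (hypothetical helper)."""
    if char.isalpha():
        return "letter"
    if char.isdigit():
        return "digit"
    if char.isspace():
        return "space"
    return "other"

text = "Hello, World! 123"
categories = Counter(classify(c) for c in text)
print(categories)
# Counter({'letter': 10, 'digit': 3, 'other': 2, 'space': 2})
```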

Counting Words Instead of Characters

The same techniques apply to word frequency by splitting the string first:

from collections import Counter

text = "the cat sat on the mat the cat was happy"

words = text.lower().split()
word_freq = Counter(words)

print(word_freq.most_common(3))

Output:

[('the', 3), ('cat', 2), ('sat', 1)]
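Plain split() leaves punctuation attached to words, so "sat." and "sat" would be counted separately. One way to handle this, assuming a word is a run of letters or apostrophes, is to extract words with a regular expression first:

```python
import re
from collections import Counter

text = "The cat sat. The cat, surprised, sat again!"

# Plain split() keeps punctuation attached: 'sat.' and 'sat' differ.
naive = Counter(text.lower().split())

# Assumption: a "word" is a run of lowercase letters or apostrophes.
words = re.findall(r"[a-z']+", text.lower())
word_freq = Counter(words)

print(naive["sat"])              # 1
print(word_freq.most_common(3))  # [('the', 2), ('cat', 2), ('sat', 2)]
```

The pattern here is deliberately simple; text with accented letters or hyphenated words would need a different definition of "word".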

The Beginner Trap: .count() in a Loop

A common but inefficient approach uses str.count() inside a comprehension:

# SLOW: O(n * k) complexity
text = "mississippi"
freq = {char: text.count(char) for char in set(text)}
print(freq)
# Possible output (set iteration order varies): {'i': 4, 'm': 1, 'p': 2, 's': 4}

While this produces correct results, it scans the entire string once for every unique character.

Performance Problem

With k unique characters in a string of length n, the total work is O(n * k), which approaches O(n²) when most characters are distinct. Counter builds the same result in a single O(n) pass.

Performance Comparison

import time
from collections import Counter

text = "abcdefghij" * 100_000 # 1 million characters

# Counter approach: single pass
start = time.perf_counter()
freq1 = Counter(text)
counter_time = time.perf_counter() - start

# .count() approach: multiple passes
start = time.perf_counter()
freq2 = {char: text.count(char) for char in set(text)}
count_time = time.perf_counter() - start

print(f"Counter: {counter_time:.4f}s")
print(f".count(): {count_time:.4f}s")
print(f"Speedup: {count_time / counter_time:.1f}x faster with Counter")

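Example output (exact timings depend on your machine and Python version):

Counter: 0.0234s
.count(): 0.1876s
Speedup: 8.0x faster with Counter

Note: The gap widens further as the number of unique characters increases. Because str.count() runs in optimized C, it can be competitive when a string contains only a handful of distinct characters, so measure on data that resembles your own.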
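To see how the number of unique characters affects both approaches, a benchmark can vary k while holding the string length fixed. The sketch below uses timeit for more stable measurements; the function names, the value of n, and the randomly generated test data are all assumptions for illustration, and no particular timings are implied:

```python
import random
import timeit
from collections import Counter

def with_counter(text):
    """Single pass over the text."""
    return Counter(text)

def with_count(text):
    """One full scan of the text per unique character."""
    return {char: text.count(char) for char in set(text)}

random.seed(0)  # reproducible input
n = 200_000

for k in (10, 1_000):
    # Hypothetical test data: n characters drawn from k distinct code points.
    alphabet = [chr(0x4E00 + i) for i in range(k)]
    text = "".join(random.choices(alphabet, k=n))

    # Both approaches must agree before their timings are worth comparing.
    assert dict(with_counter(text)) == with_count(text)

    # Take the best of three runs to reduce noise from other processes.
    t_counter = min(timeit.repeat(lambda: with_counter(text), number=1, repeat=3))
    t_count = min(timeit.repeat(lambda: with_count(text), number=1, repeat=3))
    print(f"k={k:>5}: Counter {t_counter:.4f}s, .count() {t_count:.4f}s")
```

Running it on your own hardware shows where the crossover, if any, falls for your data.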

Practical Applications

Finding Duplicate Characters

from collections import Counter

def find_duplicates(text: str) -> list:
    """Return characters that appear more than once."""
    freq = Counter(text.lower())
    return sorted(char for char, count in freq.items() if count > 1)

print(find_duplicates("programming"))

Output:

['g', 'm', 'r']
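A closely related task is finding the first character that does not repeat. The sketch below uses a hypothetical first_unique() function: one pass builds the counts, and a second pass over the original string preserves the order of appearance:

```python
from collections import Counter

def first_unique(text: str):
    """Return the first character that occurs exactly once, or None."""
    freq = Counter(text)
    for char in text:  # second pass preserves the original order
        if freq[char] == 1:
            return char
    return None

print(first_unique("programming"))  # p
print(first_unique("aabb"))         # None
```

Two passes over the string still amount to O(n) total work.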

Checking for Anagrams

Two strings are anagrams if they contain exactly the same characters with the same frequencies. Counter makes this check a one-liner:

from collections import Counter

def are_anagrams(s1: str, s2: str) -> bool:
    """Check if two strings are anagrams of each other."""
    # Normalize case and ignore spaces before comparing counts
    clean1 = s1.lower().replace(" ", "")
    clean2 = s2.lower().replace(" ", "")
    return Counter(clean1) == Counter(clean2)

print(are_anagrams("listen", "silent"))
print(are_anagrams("hello", "world"))

Output:

True
False
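Counter subtraction supports a looser check: whether one string can be spelled using the characters of another. The can_build() function below is a hypothetical helper written for illustration:

```python
from collections import Counter

def can_build(word: str, letters: str) -> bool:
    """Return True if `word` can be spelled using characters from `letters`."""
    # Subtraction keeps only positive counts, so the result is empty
    # exactly when `letters` covers every character `word` needs.
    missing = Counter(word) - Counter(letters)
    return not missing

print(can_build("tile", "listen"))    # True
print(can_build("little", "listen"))  # False
print(Counter("little") - Counter("listen"))  # Counter({'l': 1, 't': 1})
```

The last line shows which characters are short and by how many, which can be handy for error messages.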

Character Frequency Histogram

from collections import Counter

def print_histogram(text: str, top_n: int = 10):
    """Print a simple histogram of character frequencies."""
    freq = Counter(c.lower() for c in text if c.isalpha())

    for char, count in freq.most_common(top_n):
        bar = "█" * count
        print(f" {char}: {bar} ({count})")

print_histogram("The quick brown fox jumps over the lazy dog")

Output:

 o: ████ (4)
 e: ███ (3)
 t: ██ (2)
 h: ██ (2)
 u: ██ (2)
 r: ██ (2)
 q: █ (1)
 i: █ (1)
 c: █ (1)
 k: █ (1)
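For long texts, one bar character per occurrence quickly overflows the terminal. A possible variation scales each bar relative to the most frequent character. The function name, the width parameter, and the use of "#" for bars are assumptions made for this sketch:

```python
from collections import Counter

def scaled_histogram(text: str, top_n: int = 5, width: int = 20) -> list[str]:
    """Return histogram lines with bars scaled to at most `width` characters."""
    freq = Counter(c.lower() for c in text if c.isalpha())
    if not freq:
        return []
    peak = max(freq.values())
    lines = []
    for char, count in freq.most_common(top_n):
        # Scale relative to the largest count; show at least one mark.
        bar_len = max(1, round(count / peak * width))
        lines.append(f"{char}: {'#' * bar_len} ({count})")
    return lines

for line in scaled_histogram("The quick brown fox jumps over the lazy dog", top_n=3):
    print(line)
```

Returning the lines instead of printing them directly makes the function easier to test and reuse.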

Method Comparison

| Method | Time Complexity | Memory | Best For |
| --- | --- | --- | --- |
| Counter(text) | O(n) | O(k) | Production code (recommended) |
| Dict + .get() | O(n) | O(k) | No-import environments |
| defaultdict(int) | O(n) | O(k) | Frequent increments with extra logic |
| .count() in a loop | O(n * k) | O(k) | Avoid |

n = string length, k = number of unique characters

Conclusion

For counting character frequency in Python, collections.Counter is the clear best choice. It scans the string in a single pass, provides convenient methods like most_common(), and handles missing keys gracefully by returning 0. When imports are not available, a standard dictionary with .get(key, 0) achieves the same O(n) performance. Always avoid using .count() inside a loop, as it rescans the entire string for each unique character, resulting in significantly worse performance that grows with both string length and character variety.