How to Count Character Frequency in Python
Counting how often each character appears in a string is a fundamental task in text analysis, data processing, and coding interviews. Python offers multiple approaches, from optimized standard library tools to manual implementations.
In this guide, you will learn the idiomatic O(n) solutions for counting character frequency, understand why certain common approaches should be avoided, and see practical applications like anagram detection and histogram generation.
Using collections.Counter (Recommended)
The Counter class from Python's standard library is specifically designed for counting hashable objects. It is usually the most readable solution and builds the full frequency map in a single pass:
from collections import Counter
text = "mississippi"
freq = Counter(text)
print(freq)
Output:
Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})
Counter scans the string exactly once, building a complete frequency map in O(n) time.
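When the text arrives in pieces (for example, line by line from a file), the same single-pass idea still applies. The sketch below assumes a hypothetical list named chunks standing in for such input and uses Counter.update() to accumulate totals:

```python
from collections import Counter

chunks = ["missis", "sippi"]   # hypothetical stand-in for lines read from a file

freq = Counter()
for chunk in chunks:
    # update() adds to the existing totals instead of replacing them.
    freq.update(chunk)

print(freq == Counter("mississippi"))   # True
```

Each chunk is still scanned only once, so the total work remains proportional to the combined length of the input.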
Accessing Counts
Counter behaves like a dictionary but returns 0 for missing keys instead of raising a KeyError:
from collections import Counter
text = "hello world"
freq = Counter(text)
print(f"Count of 'l': {freq['l']}")
print(f"Count of 'z': {freq['z']}")
Output:
Count of 'l': 3
Count of 'z': 0
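One detail worth knowing: looking up a missing key returns 0 without adding that key to the Counter, so membership tests and len() still reflect only the characters that actually occurred. A minimal sketch, reusing the "hello world" string:

```python
from collections import Counter

freq = Counter("hello world")

# Looking up a missing key returns 0 but does not store the key.
missing = freq["z"]
print(missing)        # 0
print("z" in freq)    # False
print(len(freq))      # 8 distinct characters, including the space
```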
Finding the Most Common Characters
The most_common() method returns characters sorted by frequency in descending order:
from collections import Counter
text = "the quick brown fox jumps over the lazy dog"
freq = Counter(text)
for char, count in freq.most_common(5):
    display = repr(char) if char == ' ' else char
    print(f" {display}: {count}")
Output:
' ': 8
o: 4
e: 3
t: 2
h: 2
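Calling most_common() with no argument returns every character, ordered from most to least frequent; characters with equal counts keep the order in which they were first encountered. The sketch below also picks the least common entries by slicing from the end (the variable names are illustrative):

```python
from collections import Counter

freq = Counter("mississippi")

# No argument: all (char, count) pairs, highest count first.
ranked = freq.most_common()
print(ranked)    # [('i', 4), ('s', 4), ('p', 2), ('m', 1)]

# The two least common characters, rarest first.
rarest = ranked[:-3:-1]
print(rarest)    # [('m', 1), ('p', 2)]
```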
Using a Standard Dictionary
When you cannot import modules or prefer explicit control over the counting logic, use a dictionary with the .get() method:
text = "banana"
freq = {}
for char in text:
    freq[char] = freq.get(char, 0) + 1
print(freq)
Output:
{'b': 1, 'a': 3, 'n': 2}
The .get(char, 0) call returns the current count if the key exists, or 0 if it does not, avoiding a KeyError on the first occurrence of each character.
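If this loop is needed in several places, it can be wrapped in a small helper. The function name count_chars below is hypothetical, not part of any library; the sketch also sorts the result by character so the display order is stable:

```python
def count_chars(text: str) -> dict:
    """Count characters with a plain dictionary (hypothetical helper)."""
    freq = {}
    for char in text:
        # .get() supplies 0 the first time a character is seen.
        freq[char] = freq.get(char, 0) + 1
    return freq

result = count_chars("banana")
print(result)                   # {'b': 1, 'a': 3, 'n': 2}
print(sorted(result.items()))   # [('a', 3), ('b', 1), ('n', 2)]
```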
Using defaultdict
The defaultdict from collections eliminates the need for .get() by automatically initializing missing keys:
from collections import defaultdict
text = "banana"
freq = defaultdict(int)
for char in text:
    freq[char] += 1
print(dict(freq))
Output:
{'b': 1, 'a': 3, 'n': 2}
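A side effect to keep in mind: reading a missing key from a defaultdict with square brackets inserts that key with its default value. The sketch below illustrates this and shows that .get() and the in operator do not trigger the insertion:

```python
from collections import defaultdict

freq = defaultdict(int)
for char in "banana":
    freq[char] += 1

size_before = len(freq)
print(size_before)      # 3

# Reading a missing key with [] inserts it with the default value 0.
print(freq["z"])        # 0
size_after = len(freq)
print(size_after)       # 4

# .get() and "in" do not call the default factory.
print(freq.get("q"))    # None
print("q" in freq)      # False
```

When read-only lookups of unseen characters are common, converting the result with dict(freq) after counting avoids accumulating these empty entries.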
Filtering by Character Type
You can count only specific types of characters by filtering the input with a generator expression before it reaches Counter:
from collections import Counter
text = "Hello, World! 123"
# Letters only (case-insensitive)
letters = Counter(c.lower() for c in text if c.isalpha())
print(f"Letters: {letters}")
# Digits only
digits = Counter(c for c in text if c.isdigit())
print(f"Digits: {digits}")
Output:
Letters: Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, 'w': 1, 'r': 1, 'd': 1})
Digits: Counter({'1': 1, '2': 1, '3': 1})
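The same idea extends to counting whole categories instead of individual characters. A sketch, assuming a hypothetical helper named classify that maps each character to a coarse label:

```python
from collections import Counter

def classify(char: str) -> str:
    """Map a character to a coarse category (hypothetical helper)."""
    if char.isalpha():
        return "letter"
    if char.isdigit():
        return "digit"
    if char.isspace():
        return "space"
    return "other"

text = "Hello, World! 123"
categories = Counter(classify(c) for c in text)

print(categories["letter"])   # 10
print(categories["digit"])    # 3
print(categories["space"])    # 2
print(categories["other"])    # 2 (the comma and the exclamation mark)
```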
Counting Words Instead of Characters
The same techniques apply to word frequency by splitting the string first:
from collections import Counter
text = "the cat sat on the mat the cat was happy"
words = text.lower().split()
word_freq = Counter(words)
print(word_freq.most_common(3))
Output:
[('the', 3), ('cat', 2), ('sat', 1)]
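Note that split() only breaks on whitespace, so punctuation stays attached to words and "cat." would be counted separately from "cat". One possible way to handle this is sketched below with a simple regular expression; the pattern is an assumption suited to plain English text, not a general-purpose tokenizer:

```python
import re
from collections import Counter

text = "The cat sat. The cat, happy, sat again!"

# [a-z']+ keeps runs of letters and apostrophes; anything else separates words.
words = re.findall(r"[a-z']+", text.lower())
word_freq = Counter(words)

print(word_freq.most_common(3))   # [('the', 2), ('cat', 2), ('sat', 2)]
```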
The Beginner Trap: .count() in a Loop
A common but inefficient approach uses str.count() inside a comprehension:
# SLOW: O(n * k) complexity
text = "mississippi"
freq = {char: text.count(char) for char in set(text)}
print(freq)
# Output (key order may vary between runs): {'i': 4, 'm': 1, 'p': 2, 's': 4}
While this produces correct results, text.count(char) rescans the entire string for every unique character. With k unique characters in a string of length n, the total work is O(n * k), which approaches O(n²) when nearly every character is distinct. Counter scans the string exactly once, for O(n) total.
Performance Comparison
import time
from collections import Counter
text = "abcdefghij" * 100_000 # 1 million characters
# Counter approach: single pass
start = time.perf_counter()
freq1 = Counter(text)
counter_time = time.perf_counter() - start
# .count() approach: multiple passes
start = time.perf_counter()
freq2 = {char: text.count(char) for char in set(text)}
count_time = time.perf_counter() - start
print(f"Counter: {counter_time:.4f}s")
print(f".count(): {count_time:.4f}s")
print(f"Speedup: {count_time / counter_time:.1f}x faster with Counter")
Example output (absolute times and the ratio vary with hardware, Python version, and input):
Counter: 0.0234s
.count(): 0.1876s
Speedup: 8.0x faster with Counter
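Because str.count() runs in optimized C, it can stay competitive when the text contains only a handful of distinct characters. Its cost grows with every additional unique character, while Counter's single pass does not, so the gap widens as character variety increases.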
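A single time.perf_counter() measurement is noisy. The sketch below uses timeit to repeat each measurement and varies the number of distinct characters k. The helper names and the chosen sizes are assumptions for illustration, and no particular timings are implied:

```python
import string
import timeit
from collections import Counter

def with_counter(text):
    """Single pass over the text."""
    return Counter(text)

def with_count_loop(text):
    """One full scan of the text per unique character."""
    return {char: text.count(char) for char in set(text)}

for k in (5, 26, 52):
    alphabet = string.ascii_letters[:k]
    text = alphabet * (200_000 // k)   # roughly 200,000 characters

    # Both approaches must agree before they are timed.
    assert dict(with_counter(text)) == with_count_loop(text)

    # Take the best of several repeats to reduce noise.
    t_counter = min(timeit.repeat(lambda: with_counter(text), number=3, repeat=3))
    t_count = min(timeit.repeat(lambda: with_count_loop(text), number=3, repeat=3))
    print(f"k={k:>2}  Counter: {t_counter:.4f}s  .count(): {t_count:.4f}s")
```

Running this on your own machine and data is the most reliable way to see where the crossover lies for your inputs.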
Practical Applications
Finding Duplicate Characters
from collections import Counter
def find_duplicates(text: str) -> list:
    """Return characters that appear more than once."""
    freq = Counter(text.lower())
    return sorted([char for char, count in freq.items() if count > 1])
print(find_duplicates("programming"))
Output:
['g', 'm', 'r']
Checking for Anagrams
Two strings are anagrams if they contain exactly the same characters with the same frequencies. Counter makes this check a one-liner:
from collections import Counter
def are_anagrams(s1: str, s2: str) -> bool:
    """Check if two strings are anagrams of each other."""
    clean1 = s1.lower().replace(" ", "")
    clean2 = s2.lower().replace(" ", "")
    return Counter(clean1) == Counter(clean2)
print(are_anagrams("listen", "silent"))
print(are_anagrams("hello", "world"))
Output:
True
False
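The function above strips only spaces, so phrases that contain punctuation would compare as different. A variant is sketched below under the assumption that only letters should count; the name are_anagrams_letters_only is hypothetical:

```python
from collections import Counter

def are_anagrams_letters_only(s1: str, s2: str) -> bool:
    """Compare letter frequencies, ignoring case, spaces, and punctuation."""
    letters1 = Counter(c.lower() for c in s1 if c.isalpha())
    letters2 = Counter(c.lower() for c in s2 if c.isalpha())
    return letters1 == letters2

print(are_anagrams_letters_only("Dormitory", "Dirty room!"))   # True
print(are_anagrams_letters_only("listen", "silent"))           # True
print(are_anagrams_letters_only("hello", "world"))             # False
```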
Character Frequency Histogram
from collections import Counter
def print_histogram(text: str, top_n: int = 10):
    """Print a simple histogram of character frequencies."""
    freq = Counter(c.lower() for c in text if c.isalpha())
    for char, count in freq.most_common(top_n):
        bar = "█" * count
        print(f" {char}: {bar} ({count})")
print_histogram("The quick brown fox jumps over the lazy dog")
Output:
o: ████ (4)
e: ███ (3)
t: ██ (2)
h: ██ (2)
u: ██ (2)
r: ██ (2)
q: █ (1)
i: █ (1)
c: █ (1)
k: █ (1)
Method Comparison
| Method | Time Complexity | Memory | Best For |
|---|---|---|---|
| Counter(text) | O(n) | O(k) | Production code (recommended) |
| Dict + .get() | O(n) | O(k) | No-import environments |
| defaultdict(int) | O(n) | O(k) | Frequent increments with extra logic |
| .count() in a loop | O(n * k) | O(k) | Avoid |
n = string length, k = number of unique characters
Conclusion
For counting character frequency in Python, collections.Counter is the clear best choice. It scans the string in a single pass, provides convenient methods like most_common(), and handles missing keys gracefully by returning 0. When imports are not available, a standard dictionary with .get(key, 0) achieves the same O(n) performance. Always avoid using .count() inside a loop, as it rescans the entire string for each unique character, resulting in significantly worse performance that grows with both string length and character variety.