How to Compare Strings Case-Insensitively in Python
Comparing strings like "Apple" and "apple" should typically return True in user-facing applications. Python provides multiple methods for case-insensitive comparison, but not all handle international text correctly.
The Standard: .casefold()
Since Python 3.3, .casefold() is the correct method for case-insensitive ("caseless") string matching.
s1 = "Hello World"
s2 = "hello world"
if s1.casefold() == s2.casefold():
    print("Strings match!")
Output:
Strings match!
Why Not .lower()?
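The .lower() method lowercases Unicode text as well as ASCII, but it applies only simple lowercase mappings, so it misses characters whose caseless form differs from their lowercase form. The German letter ß (sharp S) is equivalent to ss for caseless matching, but .lower() leaves it unchanged: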
s1 = "der Fluß" # German for "the river"
s2 = "DER FLUSS"
# .lower() fails - returns False
print(s1.lower() == s2.lower()) # False
print(s1.lower()) # 'der fluß'
print(s2.lower()) # 'der fluss'
# .casefold() works correctly - returns True
print(s1.casefold() == s2.casefold()) # True
print(s1.casefold()) # 'der fluss'
print(s2.casefold()) # 'der fluss'
Output:
False
der fluß
der fluss
True
der fluss
der fluss
.casefold() performs Unicode case folding, which is more aggressive than simple lowercasing. It converts characters to their "folded" form for comparison, handling special cases like:
- ß → ss
- ﬁ (ligature) → fi
- Other characters with special folding rules, such as Greek final sigma ς → σ
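The foldings listed above can be checked directly in an interpreter. The sketch below also illustrates a related point that .casefold() alone does not cover: the same accented letter can be stored either as one precomposed code point or as a base letter followed by a combining mark. The helper name caseless_equal is hypothetical; it assumes that canonical normalization with unicodedata.normalize around the fold is acceptable for your data.

```python
import unicodedata

# Full case folding can change the length of a string.
print("ß".casefold())        # ss
print("\ufb01".casefold())   # fi  (U+FB01, the "fi" ligature)
print("ς".casefold())        # σ   (Greek final sigma folds to regular sigma)


def caseless_equal(a: str, b: str) -> bool:
    """Compare caselessly after canonical normalization (hypothetical helper)."""
    def nfd(s: str) -> str:
        return unicodedata.normalize("NFD", s)
    return nfd(nfd(a).casefold()) == nfd(nfd(b).casefold())


precomposed = "CAF\u00c9"   # "CAFÉ" with a single code point for É
decomposed = "cafe\u0301"   # "café" as "e" followed by a combining acute accent

print(precomposed.casefold() == decomposed.casefold())  # False
print(caseless_equal(precomposed, decomposed))           # True
```

Whether this extra normalization step is worth it depends on where the text comes from; input typed on different platforms is the usual source of mixed precomposed and decomposed forms.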
Practical Examples
Email Comparison
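Email addresses are usually compared case-insensitively. The domain part is case-insensitive by definition, and most providers treat the part before the @ the same way: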
def emails_match(email1: str, email2: str) -> bool:
    """Compare emails case-insensitively."""
    return email1.casefold() == email2.casefold()

stored_email = "User@Example.COM"
login_email = "user@example.com"
if emails_match(stored_email, login_email):
    print("Welcome back!")
else:
    print("Email not found.")
Username Validation
Prevent duplicate usernames that differ only by case:
def username_exists(new_username: str, existing_users: list) -> bool:
    """Check if username exists (case-insensitive)."""
    new_folded = new_username.casefold()
    return any(user.casefold() == new_folded for user in existing_users)

users = ["Admin", "JohnDoe", "jane_smith"]
print(username_exists("admin", users)) # True
print(username_exists("JOHNDOE", users)) # True
print(username_exists("newuser", users)) # False
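When the list of users is large or checked often, folding every stored name on each call is wasteful. One possible variation, sketched here with a hypothetical UsernameRegistry class, stores each folded name once in a set so that every check is a single hash lookup.

```python
class UsernameRegistry:
    """Tracks taken usernames by their case-folded form (illustrative sketch)."""

    def __init__(self, usernames=()):
        # Fold once at insertion time instead of on every lookup.
        self._folded = {name.casefold() for name in usernames}

    def exists(self, username: str) -> bool:
        return username.casefold() in self._folded

    def add(self, username: str) -> bool:
        """Register a username; return False if it is already taken."""
        folded = username.casefold()
        if folded in self._folded:
            return False
        self._folded.add(folded)
        return True


registry = UsernameRegistry(["Admin", "JohnDoe", "jane_smith"])
print(registry.exists("ADMIN"))    # True
print(registry.add("johndoe"))     # False
print(registry.add("NewUser"))     # True
print(registry.exists("newuser"))  # True
```

The trade-off is that the original spelling is not kept; if it is needed for display, a dict from folded name to original name would serve the same purpose.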
Finding Duplicates in a List
Detect duplicates when ignoring case:
def find_case_duplicates(items: list) -> list:
    """Find items that are duplicates when case is ignored."""
    seen = {}
    duplicates = []
    for item in items:
        folded = item.casefold()
        if folded in seen:
            duplicates.append((seen[folded], item))
        else:
            seen[folded] = item
    return duplicates

usernames = ["Admin", "admin", "User", "USER", "Guest"]
dupes = find_case_duplicates(usernames)
for original, duplicate in dupes:
    print(f"'{duplicate}' duplicates '{original}'")
Output:
'admin' duplicates 'Admin'
'USER' duplicates 'User'
Case-Insensitive Dictionary
Create a dictionary that treats keys case-insensitively:
class CaseInsensitiveDict(dict):
    """Dictionary with case-insensitive string keys."""

    def __setitem__(self, key, value):
        super().__setitem__(key.casefold(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.casefold())

    def __contains__(self, key):
        return super().__contains__(key.casefold())

    def get(self, key, default=None):
        return super().get(key.casefold(), default)

headers = CaseInsensitiveDict()
headers["Content-Type"] = "application/json"
print(headers["content-type"]) # application/json
print(headers["CONTENT-TYPE"]) # application/json
print("Content-TYPE" in headers) # True
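One caveat with subclassing dict directly: the built-in constructor, update(), and setdefault() do not go through an overridden __setitem__, so keys passed that way are stored unfolded. A possible alternative, sketched below under the hypothetical name FoldedKeyDict, builds on collections.abc.MutableMapping, whose mixin methods are all defined in terms of the five methods implemented here.

```python
from collections.abc import MutableMapping


class FoldedKeyDict(MutableMapping):
    """Mapping that folds string keys on every access (illustrative sketch)."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        # MutableMapping.update() assigns through __setitem__, so keys get folded.
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        self._data[key.casefold()] = value

    def __getitem__(self, key):
        return self._data[key.casefold()]

    def __delitem__(self, key):
        del self._data[key.casefold()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


headers = FoldedKeyDict({"Content-Type": "application/json"})
headers.update({"ACCEPT": "text/html"})
print(headers["content-type"])   # application/json
print("accept" in headers)       # True
print(headers.get("Accept"))     # text/html
del headers["CONTENT-TYPE"]
print(len(headers))              # 1
```

This sketch assumes every key is a string; a non-string key raises AttributeError because it has no casefold() method.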
Case-Insensitive Sorting
Sort a list alphabetically without regard to case:
words = ["Apple", "banana", "Cherry", "apricot"]
# Default sort (uppercase comes before lowercase)
print(sorted(words))
# Case-insensitive sort
print(sorted(words, key=str.casefold))
Output:
['Apple', 'Cherry', 'apricot', 'banana']
['Apple', 'apricot', 'banana', 'Cherry']
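Because Python's sort is stable, strings that fold to the same key keep their input order. If the result should not depend on input order, one option is a tuple key that falls back to the original string to break ties. The word list below is made up for illustration.

```python
words = ["banana", "apple", "Banana", "Apple", "APPLE"]

# Stable sort: items with equal folded keys keep their input order.
print(sorted(words, key=str.casefold))
# ['apple', 'Apple', 'APPLE', 'banana', 'Banana']

# Tuple key: fold first, then use the original string to break ties.
print(sorted(words, key=lambda w: (w.casefold(), w)))
# ['APPLE', 'Apple', 'apple', 'Banana', 'banana']
```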
Case-Insensitive Search
Check if a substring exists regardless of case:
def contains_insensitive(text: str, search: str) -> bool:
    """Check if text contains search string (case-insensitive)."""
    return search.casefold() in text.casefold()

document = "The Quick Brown Fox Jumps Over The Lazy Dog"
print(contains_insensitive(document, "quick")) # True
print(contains_insensitive(document, "LAZY")) # True
print(contains_insensitive(document, "cat")) # False
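The same fold-both-sides pattern extends to prefix and suffix checks. The helper names below are hypothetical. The last lines show a caveat worth knowing: folding can change a string's length, so positions found in the folded text do not always map back to the original.

```python
def startswith_insensitive(text: str, prefix: str) -> bool:
    """Check whether text starts with prefix, ignoring case."""
    return text.casefold().startswith(prefix.casefold())


def endswith_insensitive(text: str, suffix: str) -> bool:
    """Check whether text ends with suffix, ignoring case."""
    return text.casefold().endswith(suffix.casefold())


filename = "REPORT_Final.PDF"
print(startswith_insensitive(filename, "report"))  # True
print(endswith_insensitive(filename, ".pdf"))      # True

# Folding "ß" to "ss" shifts every later character by one position.
street = "Straße 5"
print(street.find("e"))             # 5
print(street.casefold().find("e"))  # 6
```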
Finding All Matches
import re

def find_all_insensitive(text: str, pattern: str) -> list:
    """Find all occurrences of pattern (case-insensitive)."""
    return re.findall(pattern, text, re.IGNORECASE)

text = "Python is great. PYTHON is powerful. python is fun."
matches = find_all_insensitive(text, "python")
print(matches) # ['Python', 'PYTHON', 'python']
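Two details are worth keeping in mind with the regex approach. First, the search string is interpreted as a regular expression, so a literal string containing characters such as "." should be passed through re.escape. Second, re.IGNORECASE matches character by character and does not apply multi-character foldings such as ß to ss. The helper name find_literal_insensitive and the sample text are assumptions made for this sketch.

```python
import re


def find_literal_insensitive(text: str, literal: str) -> list:
    """Find all case-insensitive occurrences of a literal string (hypothetical helper)."""
    return re.findall(re.escape(literal), text, flags=re.IGNORECASE)


text = "Versions: A.B, axb, a.b"

# Unescaped, "." matches any character, so "axb" is also returned.
print(re.findall("a.b", text, flags=re.IGNORECASE))  # ['A.B', 'axb', 'a.b']
print(find_literal_insensitive(text, "a.b"))          # ['A.B', 'a.b']

# re.IGNORECASE does not treat "ß" as equivalent to "ss"; .casefold() does.
print(re.search("strasse", "Straße", flags=re.IGNORECASE))  # None
print("strasse" in "Straße".casefold())                      # True
```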
Method Comparison
| Method | Unicode Support | Example | Recommendation |
|---|---|---|---|
| .casefold() | Full case folding | ß → ss | ✅ Always use |
| .lower() | Lowercases Unicode, but no full folding | A → a | ⚠️ Legacy only |
| .upper() | Uppercases Unicode, but no full folding | a → A | ❌ Avoid for comparison |
Performance Note
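For ASCII text, .lower() and .casefold() produce the same result, and both are fast. .casefold() does somewhat more work, so it can be measurably slower, as in the sample timings below (exact numbers vary by machine and Python version):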
import timeit
text = "Hello World" * 1000
lower_time = timeit.timeit(lambda: text.lower(), number=10000)
casefold_time = timeit.timeit(lambda: text.casefold(), number=10000)
print(f".lower(): {lower_time:.4f}s") # .lower(): 0.1449s
print(f".casefold(): {casefold_time:.4f}s") # .casefold(): 0.3974s
The performance difference is negligible for most applications.
Summary
- Use .casefold() for all case-insensitive comparisons. It handles Unicode correctly.
- Avoid .lower() for comparisons. It fails with certain international characters.
- Use key=str.casefold for case-insensitive sorting.
- Use re.IGNORECASE for case-insensitive regex operations.