
How to Compare Strings Case-Insensitively in Python

Comparing strings like "Apple" and "apple" should typically return True in user-facing applications. Python provides multiple methods for case-insensitive comparison, but not all handle international text correctly.

The Standard: .casefold()

Since Python 3.3, .casefold() is the correct method for case-insensitive ("caseless") string matching.

s1 = "Hello World"
s2 = "hello world"

if s1.casefold() == s2.casefold():
    print("Strings match!")

Output:

Strings match!

Why Not .lower()?

The .lower() method is sufficient for ASCII (English) text, but it does not apply full Unicode case folding. For example, the German letter ß (sharp S) is equivalent to ss in caseless matching, yet .lower() leaves it unchanged:

s1 = "der Fluß"   # German for "the river"
s2 = "DER FLUSS"

# .lower() fails - returns False
print(s1.lower() == s2.lower()) # False
print(s1.lower()) # 'der fluß'
print(s2.lower()) # 'der fluss'

# .casefold() works correctly - returns True
print(s1.casefold() == s2.casefold()) # True
print(s1.casefold()) # 'der fluss'
print(s2.casefold()) # 'der fluss'

Output:

False
der fluß
der fluss
True
der fluss
der fluss

Unicode Folding

.casefold() performs Unicode case folding, which is more aggressive than simple lowercasing. It converts characters to their "folded" form for comparison, handling special cases like:

  • ß → ss
  • ﬁ (ligature) → fi
  • Other special cases, such as ς (Greek final sigma) → σ
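
A quick way to see such mappings is to compare .lower() and .casefold() on individual characters. This is a minimal sketch using only built-in string methods; the sample characters and their labels are chosen here for illustration.

```python
# Characters whose case-folded form differs from their .lower() form.
samples = {
    "ß": "sharp s",
    "\ufb01": "fi ligature",
    "ς": "Greek final sigma",
}

for char, name in samples.items():
    print(f"{name}: lower={char.lower()!r} casefold={char.casefold()!r}")

# Folding can change the length of a string.
print(len("ß"), len("ß".casefold()))  # 1 2
```

Because folding can lengthen a string, folded text is best used only for comparison, not for display or storage of the user's original input.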

Practical Examples

Email Comparison

Most applications compare email addresses case-insensitively:

def emails_match(email1: str, email2: str) -> bool:
    """Compare emails case-insensitively."""
    return email1.casefold() == email2.casefold()


stored_email = "User@Example.COM"
login_email = "user@example.com"

if emails_match(stored_email, login_email):
    print("Welcome back!")
else:
    print("Email not found.")

Username Validation

Prevent duplicate usernames that differ only by case:

def username_exists(new_username: str, existing_users: list) -> bool:
    """Check if username exists (case-insensitive)."""
    new_folded = new_username.casefold()
    return any(user.casefold() == new_folded for user in existing_users)


users = ["Admin", "JohnDoe", "jane_smith"]

print(username_exists("admin", users)) # True
print(username_exists("JOHNDOE", users)) # True
print(username_exists("newuser", users)) # False
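
When the user list is large, scanning it on every check can become slow. One possible approach, sketched here, is to keep a set of case-folded names so that each lookup is a single hash probe. The class name UsernameRegistry and its methods are hypothetical, invented for this example.

```python
class UsernameRegistry:
    """Hypothetical helper: tracks usernames by their case-folded form."""

    def __init__(self, usernames=()):
        # Store only folded names; membership tests ignore case.
        self._folded = {name.casefold() for name in usernames}

    def exists(self, username: str) -> bool:
        return username.casefold() in self._folded

    def add(self, username: str) -> bool:
        """Register username; return False if a case variant already exists."""
        folded = username.casefold()
        if folded in self._folded:
            return False
        self._folded.add(folded)
        return True


registry = UsernameRegistry(["Admin", "JohnDoe", "jane_smith"])

print(registry.exists("admin"))  # True
print(registry.add("JOHNDOE"))   # False
print(registry.add("newuser"))   # True
```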

Finding Duplicates in a List

Detect duplicates when ignoring case:

def find_case_duplicates(items: list) -> list:
    """Find items that are duplicates when case is ignored."""
    seen = {}
    duplicates = []

    for item in items:
        folded = item.casefold()
        if folded in seen:
            duplicates.append((seen[folded], item))
        else:
            seen[folded] = item

    return duplicates


usernames = ["Admin", "admin", "User", "USER", "Guest"]
dupes = find_case_duplicates(usernames)

for original, duplicate in dupes:
    print(f"'{duplicate}' duplicates '{original}'")

Output:

'admin' duplicates 'Admin'
'USER' duplicates 'User'

Case-Insensitive Dictionary

Create a dictionary that treats keys case-insensitively:

class CaseInsensitiveDict(dict):
    """Dictionary with case-insensitive string keys."""

    def __setitem__(self, key, value):
        super().__setitem__(key.casefold(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.casefold())

    def __contains__(self, key):
        return super().__contains__(key.casefold())

    def get(self, key, default=None):
        return super().get(key.casefold(), default)


headers = CaseInsensitiveDict()
headers["Content-Type"] = "application/json"

print(headers["content-type"]) # application/json
print(headers["CONTENT-TYPE"]) # application/json
print("Content-TYPE" in headers) # True
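
One caveat with subclassing dict directly: in CPython, constructor arguments, update(), setdefault(), pop(), and del do not go through the overridden methods, so keys passed that way are not folded. A possible alternative, sketched below, builds on collections.abc.MutableMapping, whose mixin methods all route through the five methods defined here. The class name FoldedKeyDict is hypothetical, and the sketch assumes every key is a string.

```python
from collections.abc import MutableMapping


class FoldedKeyDict(MutableMapping):
    """Hypothetical mapping that folds string keys on every access path."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        # MutableMapping.update() sends every item through __setitem__.
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        self._data[key.casefold()] = value

    def __getitem__(self, key):
        return self._data[key.casefold()]

    def __delitem__(self, key):
        del self._data[key.casefold()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


headers = FoldedKeyDict({"Content-Type": "application/json"})
headers.update({"ACCEPT": "*/*"})

print(headers["content-type"])  # application/json
print("accept" in headers)      # True
print(headers.get("Accept"))    # */*
```

Note that this sketch stores only the folded keys, so the original spelling (such as "Content-Type") is not preserved when iterating.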

Case-Insensitive Sorting

Sort a list alphabetically without regard to case:

words = ["Apple", "banana", "Cherry", "apricot"]

# Default sort (uppercase comes before lowercase)
print(sorted(words))

# Case-insensitive sort
print(sorted(words, key=str.casefold))

Output:

['Apple', 'Cherry', 'apricot', 'banana']
['Apple', 'apricot', 'banana', 'Cherry']
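
Because sorted() is stable, strings that differ only by case keep their input order when key=str.casefold is used. If a deterministic order is preferred regardless of input order, one option is a tuple key that falls back to the original string. The sample list below is made up for illustration.

```python
words = ["apple", "Banana", "Apple", "banana", "APPLE"]

# Fold first, then use the original string to break ties.
ordered = sorted(words, key=lambda w: (w.casefold(), w))

print(ordered)  # ['APPLE', 'Apple', 'apple', 'Banana', 'banana']
```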

Substring Search

Check if a substring exists regardless of case:

def contains_insensitive(text: str, search: str) -> bool:
    """Check if text contains search string (case-insensitive)."""
    return search.casefold() in text.casefold()


document = "The Quick Brown Fox Jumps Over The Lazy Dog"

print(contains_insensitive(document, "quick")) # True
print(contains_insensitive(document, "LAZY")) # True
print(contains_insensitive(document, "cat")) # False
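
The same folding idea extends to prefix and suffix checks. The two helper functions below are hypothetical names introduced for this sketch; they simply fold both sides before calling str.startswith() or str.endswith().

```python
def startswith_insensitive(text: str, prefix: str) -> bool:
    """Hypothetical helper: case-insensitive str.startswith()."""
    return text.casefold().startswith(prefix.casefold())


def endswith_insensitive(text: str, suffix: str) -> bool:
    """Hypothetical helper: case-insensitive str.endswith()."""
    return text.casefold().endswith(suffix.casefold())


filename = "Report_FINAL.PDF"

print(endswith_insensitive(filename, ".pdf"))          # True
print(startswith_insensitive(filename, "report"))      # True
print(startswith_insensitive("Straße 12", "STRASSE"))  # True
```

Since folding can change a string's length (ß becomes ss), an index found in the folded text does not necessarily point to the same position in the original text.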

Finding All Matches

import re


def find_all_insensitive(text: str, pattern: str) -> list:
    """Find all occurrences of pattern (case-insensitive)."""
    return re.findall(pattern, text, re.IGNORECASE)


text = "Python is great. PYTHON is powerful. python is fun."
matches = find_all_insensitive(text, "python")

print(matches) # ['Python', 'PYTHON', 'python']
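
Two caveats are worth illustrating. First, when the search term is literal text rather than a regex, characters such as + or . would be interpreted as regex syntax unless escaped with re.escape(). Second, re.IGNORECASE matches character by character and does not expand ß to ss the way .casefold() does. The helper name find_literal_insensitive is hypothetical.

```python
import re


def find_literal_insensitive(text: str, literal: str) -> list:
    """Hypothetical helper: find literal text, ignoring case."""
    return re.findall(re.escape(literal), text, re.IGNORECASE)


text = "Is C++ faster? c++ fans think so."
print(find_literal_insensitive(text, "C++"))  # ['C++', 'c++']

# re.IGNORECASE does not treat ß and ss as equal; .casefold() does.
print(re.findall("fluss", "der Fluß", re.IGNORECASE))  # []
print("fluss" in "der Fluß".casefold())                # True
```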

Method Comparison

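| Method      | Unicode Support           | Example | Recommendation          |
|-------------|---------------------------|---------|-------------------------|
| .casefold() | Full                      | ß → ss  | ✅ Always use           |
| .lower()    | Partial (no full folding) | A → a   | ⚠️ Legacy only          |
| .upper()    | Partial (no full folding) | a → A   | ❌ Avoid for comparison |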

Performance Note

For ASCII text, .lower() and .casefold() return the same result and run in the same order of magnitude. The sample timings below come from a single run and will vary by machine and Python version:

import timeit

text = "Hello World" * 1000

lower_time = timeit.timeit(lambda: text.lower(), number=10000)
casefold_time = timeit.timeit(lambda: text.casefold(), number=10000)

print(f".lower(): {lower_time:.4f}s") # .lower(): 0.1449s
print(f".casefold(): {casefold_time:.4f}s") # .casefold(): 0.3974s

Note: The performance difference is negligible for most applications.

Summary

  • Use .casefold() for all case-insensitive comparisons. It handles Unicode correctly.
  • Avoid .lower() for comparisons. It fails with certain international characters.
  • Use key=str.casefold for case-insensitive sorting.
  • Use re.IGNORECASE for case-insensitive regex operations.