How to Find Uncommon Characters Between Two Strings in Python

Identifying characters that appear in one string but not the other is a useful operation in data validation, text comparison, diff algorithms, and input checking. Python offers multiple approaches to solve this problem, from high-performance set operations that find unique characters to list comprehensions that preserve the original sequence.

In this guide, you will learn how to find uncommon characters using set operations, preserve character order when it matters, handle duplicates and case sensitivity, and build a reusable comparison function.

Using Set Symmetric Difference for Speed

When you need only the unique uncommon characters regardless of order, the symmetric difference operator (^) provides the fastest solution. It returns elements that exist in either set but not in both:

string1 = "apple"
string2 = "pear"

uncommon = set(string1) ^ set(string2)

print(uncommon)  # {'l', 'r'} (element order may vary)
print("".join(sorted(uncommon)))  # lr

The characters l and r each appear in only one of the two strings. All shared characters (a, p, e) are excluded from the result.

You can also use the explicit method name for improved readability:

uncommon = set(string1).symmetric_difference(set(string2))

Set Results Are Unordered

Sets in Python do not maintain insertion order. Each execution may display characters in a different sequence. Use sorted() when you need consistent, reproducible output.
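
As a quick check of what ^ computes, the following sketch compares it with the equivalent "union minus intersection" expression and shows how sorted() produces a stable result. The variable names a, b, and stable are illustrative only.

```python
string1 = "apple"
string2 = "pear"

a, b = set(string1), set(string2)

# Symmetric difference is the union with the shared elements removed.
assert a ^ b == (a | b) - (a & b)

# Sorting turns the unordered set into a reproducible list.
stable = sorted(a ^ b)
print(stable)  # ['l', 'r']
```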

Preserving Original Order with List Comprehension

When the character sequence matters, such as when displaying differences in context or maintaining positional relationships, use a list comprehension with set-based lookups:

string1 = "AACDB"
string2 = "GAFD"

# Find characters common to both strings
common_chars = set(string1) & set(string2)

# Extract uncommon characters while preserving their original order
uncommon_from_s1 = [char for char in string1 if char not in common_chars]
uncommon_from_s2 = [char for char in string2 if char not in common_chars]

result = uncommon_from_s1 + uncommon_from_s2

print("".join(result))

Output:

CBGF

The common characters are A and D (present in both strings). From string1, the uncommon characters C and B appear in their original order. From string2, the uncommon characters G and F are appended in their original order.
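
If you want each uncommon character only once but still in first-seen order, one option is to pass the ordered result through dict.fromkeys(), which keeps insertion order in Python 3.7 and later. This is a minimal sketch; the input strings are made up for illustration and contain repeated uncommon characters on purpose.

```python
string1 = "AACCDB"
string2 = "GGAFD"

common_chars = set(string1) & set(string2)

# Every occurrence of an uncommon character, in original order
ordered = [c for c in string1 + string2 if c not in common_chars]

# dict.fromkeys() drops repeats while keeping the first-seen order
unique_ordered = "".join(dict.fromkeys(ordered))

print("".join(ordered))  # CCBGGF
print(unique_ordered)    # CBGF
```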

Performance Optimization

Always convert the comparison target to a set before checking membership. Set lookups take O(1) time on average, while checking membership in a string or list takes O(n). Across a whole comprehension, that reduces the work from O(n × m) to roughly O(n + m), which becomes noticeable for long strings:

# Slow: O(n) lookup for each character
uncommon = [c for c in string1 if c not in string2]

# Fast: O(1) lookup for each character
chars_in_s2 = set(string2)
uncommon = [c for c in string1 if c not in chars_in_s2]
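
To see the difference on your own machine, you could time both variants with the standard timeit module. The sketch below uses synthetic test data (two 2,000-character strings with no characters in common, an assumption chosen to force the worst case for the string lookup); actual timings depend on your Python version and hardware, so none are shown here.

```python
import timeit

# Synthetic data: two long strings built from disjoint ranges of code points.
string1 = "".join(chr(0x4E00 + i) for i in range(2000))
string2 = "".join(chr(0x5E00 + i) for i in range(2000))


def slow():
    # Membership test scans string2 for every character of string1.
    return [c for c in string1 if c not in string2]


def fast():
    # Build the set once, then each membership test is a hash lookup.
    chars_in_s2 = set(string2)
    return [c for c in string1 if c not in chars_in_s2]


# Both variants must produce the same result.
assert slow() == fast()

slow_time = timeit.timeit(slow, number=20)
fast_time = timeit.timeit(fast, number=20)
print(f"string lookup: {slow_time:.4f}s, set lookup: {fast_time:.4f}s")
```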

Keeping Duplicate Occurrences

The set-based approach eliminates duplicates because sets only store unique elements. If you need to preserve every occurrence of uncommon characters, use list comprehensions with set lookups:

string1 = "aabbcc"
string2 = "bbd"

chars_in_s2 = set(string2)
chars_in_s1 = set(string1)

uncommon_s1 = [c for c in string1 if c not in chars_in_s2]
uncommon_s2 = [c for c in string2 if c not in chars_in_s1]

print("From string1:", "".join(uncommon_s1))
print("From string2:", "".join(uncommon_s2))

Output:

From string1: aacc
From string2: d

Both occurrences of a and both occurrences of c are preserved because they do not appear in string2 at all. The b characters from string1 are excluded because b exists in string2.

Handling Case Sensitivity

By default, Python treats uppercase and lowercase letters as distinct characters. To perform a case-insensitive comparison, normalize both strings before comparing:

string1 = "Hello"
string2 = "WORLD"

# Case-sensitive: no character is shared ('l' != 'L', 'o' != 'O'), so all nine are uncommon
print(set(string1) ^ set(string2))

# Case-insensitive: 'l' and 'o' are now shared and drop out of the result
uncommon = set(string1.casefold()) ^ set(string2.casefold())
print(sorted(uncommon))

Output:

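{'H', 'W', 'R', 'l', 'O', 'D', 'L', 'e', 'o'}
['d', 'e', 'h', 'r', 'w']

The first line is a raw set, so its element order may differ on your machine; the second is sorted and therefore stable.

Why .casefold() Over .lower()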

The .casefold() method applies more aggressive Unicode case folding than .lower(). For example, the German "ß" is converted to "ss" by .casefold(), while .lower() leaves it unchanged. For plain ASCII text both methods behave identically, but .casefold() is the safer default.
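
The effect on an uncommon-character comparison can be sketched with the "ß" example. The two sample strings below are chosen for illustration only.

```python
a = "STRASSE"
b = "straße"

print(b.lower())     # straße
print(b.casefold())  # strasse

# With .lower(), 'ß' survives and is reported as uncommon.
with_lower = set(a.lower()) ^ set(b.lower())

# With .casefold(), both strings reduce to "strasse", so nothing differs.
with_casefold = set(a.casefold()) ^ set(b.casefold())

print(with_lower)     # {'ß'}
print(with_casefold)  # set()
```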

Reusable Comparison Function

Here is a versatile function that handles the most common comparison scenarios:

def find_uncommon_chars(s1, s2, preserve_order=True, case_sensitive=True):
    """
    Find characters that appear in one string but not the other.

    Args:
        s1, s2: Strings to compare.
        preserve_order: Keep original character sequence.
        case_sensitive: Whether 'A' and 'a' are treated as different.

    Returns:
        String of uncommon characters.
    """
    if not case_sensitive:
        s1, s2 = s1.casefold(), s2.casefold()

    if not preserve_order:
        return "".join(sorted(set(s1) ^ set(s2)))

    common = set(s1) & set(s2)
    result = [c for c in s1 if c not in common] + \
             [c for c in s2 if c not in common]

    return "".join(result)

print(find_uncommon_chars("Hello", "World"))
print(find_uncommon_chars("Hello", "World", preserve_order=False))
print(find_uncommon_chars("ABC", "abc", case_sensitive=False))

Output:

HeWrd
HWder

Note

The third call returns an empty string because all characters match when case is ignored.
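
When you need to know which string each uncommon character came from, a variant that returns the two sides separately may be more convenient. The helper below, split_uncommon_chars, is a hypothetical name introduced for this sketch and always preserves order.

```python
def split_uncommon_chars(s1, s2, case_sensitive=True):
    """Return (only_in_s1, only_in_s2), each in original order.

    Hypothetical helper for illustration; not part of any library.
    """
    if not case_sensitive:
        s1, s2 = s1.casefold(), s2.casefold()

    common = set(s1) & set(s2)
    only_in_s1 = "".join(c for c in s1 if c not in common)
    only_in_s2 = "".join(c for c in s2 if c not in common)
    return only_in_s1, only_in_s2


print(split_uncommon_chars("Hello", "World"))  # ('He', 'Wrd')
print(split_uncommon_chars("ABC", "abc", case_sensitive=False))  # ('', '')
```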

Practical Applications

Finding Missing Characters

When only one direction matters, use set difference (-) instead of symmetric difference. The following reports which characters from a reference alphabet never occur in a password:

required = "abcdefghijklmnopqrstuvwxyz0123456789"
password = "mypassword123"

missing = sorted(set(required) - set(password.lower()))
print(f"Unused characters: {''.join(missing)}")

Output:

Unused characters: 0456789bcefghijklnqtuvxz
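
Rather than typing the alphabet by hand, you could build the reference set from the constants in the standard string module. This sketch assumes the same lowercase-plus-digits alphabet as the example above.

```python
import string

# ascii_lowercase is 'a'..'z', digits is '0'..'9'
required = string.ascii_lowercase + string.digits
password = "mypassword123"

missing = sorted(set(required) - set(password.lower()))
print("".join(missing))  # 0456789bcefghijklnqtuvxz
```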

Detecting Typos Between Versions

original = "algorithm"
typed = "algoritm"

common = set(original) & set(typed)
differences = [c for c in original if c not in common]
print(f"Missing characters: {''.join(differences)}")

Output:

Missing characters: h
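
The set-based check only catches a dropped letter when that letter occurs nowhere else in the word. For typos that remove one of several occurrences, a count-based comparison with collections.Counter is one possible alternative. The sketch below uses a made-up word pair to show the difference:

```python
from collections import Counter

original = "committee"
typed = "commitee"

# Set-based check: every distinct character appears in both strings.
print(set(original) ^ set(typed))  # set()

# Count-based check: subtracting counters keeps only positive surpluses.
missing = Counter(original) - Counter(typed)
extra = Counter(typed) - Counter(original)

print("".join(sorted(missing.elements())))  # t
print("".join(sorted(extra.elements())))    # prints an empty line
```

Counter subtraction discards zero and negative counts, so missing and extra each describe one direction of the comparison.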

Method Comparison

| Approach | Preserves Order | Handles Duplicates | Performance |
|---|---|---|---|
| set(s1) ^ set(s2) | No | Removes duplicates | Fastest |
| List comprehension with set lookup | Yes | Preserves all occurrences | Fast |
| Nested loops (avoid) | Yes | Preserves all occurrences | Slow |

Conclusion

  • Use set(s1) ^ set(s2) when you need the fastest result and do not care about character order or duplicates.
  • Switch to list comprehensions with set-based lookups when you need to preserve the original character order or keep duplicate occurrences.
  • Always convert comparison targets to sets before membership checks to maintain O(1) lookup performance.
  • For case-insensitive comparisons, normalize both strings with .casefold() before performing any comparison.