Skip to main content

How to Get All Substrings of a String in Python

A substring is any contiguous sequence of characters within a string. Extracting all possible substrings is a fundamental operation used in text searching, pattern matching, DNA sequence analysis, and string algorithm problems. For example, the string "abc" has the substrings: "a", "ab", "abc", "b", "bc", and "c".

In this guide, you'll learn multiple methods to generate all substrings of a given string in Python, along with explanations of how each approach works and when to use it.

Understanding Substring Generation​

For a string of length n, substrings are defined by a start index i and an end index j where 0 ≤ i < j ≤ n. The total number of non-empty substrings is n Ɨ (n + 1) / 2.

For the string "abc" (length 3):

Start (i)End (j)Substring
01"a"
02"ab"
03"abc"
12"b"
13"bc"
23"c"

Total: 3 Ɨ 4 / 2 = 6 substrings.

List comprehension provides the most concise and Pythonic way to generate all substrings using nested iteration over start and end indices:

s = "abc"

substrings = [s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)]

print(substrings)

Output:

['a', 'ab', 'abc', 'b', 'bc', 'c']

How it works:

  1. The outer loop i represents the starting index of each substring (from 0 to len(s) - 1).
  2. The inner loop j represents the ending index (from i + 1 to len(s)), ensuring every substring has at least one character.
  3. s[i:j] extracts the substring from index i up to (but not including) index j.
Why this is the best approach

List comprehension is concise, idiomatic Python, and performs the same operations as nested loops with less boilerplate. It's the go-to method for most substring generation tasks.

Using Nested for Loops​

If you prefer explicit control over the iteration or need to add extra logic (like filtering or transforming substrings), nested loops provide maximum clarity:

s = "abc"

substrings = []
for i in range(len(s)):
for j in range(i + 1, len(s) + 1):
substrings.append(s[i:j])

print(substrings)

Output:

['a', 'ab', 'abc', 'b', 'bc', 'c']

This is functionally identical to the list comprehension approach but makes it easy to add conditions or side effects inside the loop:

s = "abc"

substrings = []
for i in range(len(s)):
for j in range(i + 1, len(s) + 1):
sub = s[i:j]
if len(sub) >= 2: # Only keep substrings with 2+ characters
substrings.append(sub)

print(substrings)

Output:

['ab', 'abc', 'bc']

Using itertools.combinations()​

The itertools.combinations() function generates all pairs of indices, which naturally correspond to substring boundaries:

from itertools import combinations

s = "abc"

substrings = [s[i:j] for i, j in combinations(range(len(s) + 1), 2)]

print(substrings)

Output:

['a', 'ab', 'abc', 'b', 'bc', 'c']

How it works:

  1. range(len(s) + 1) produces indices [0, 1, 2, 3].
  2. combinations(..., 2) generates all pairs where i < j: (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
  3. Each pair (i, j) maps directly to the substring s[i:j].

This approach is elegant and clearly expresses the mathematical relationship between index pairs and substrings.

Grouping Substrings by Length​

Sometimes you need substrings organized by their length rather than in a flat list. This is useful for pattern analysis or when processing substrings of a specific size:

s = "code"

grouped = [
[s[i:i + length] for i in range(len(s) - length + 1)]
for length in range(1, len(s) + 1)
]

for group in grouped:
print(group)

Output:

['c', 'o', 'd', 'e']
['co', 'od', 'de']
['cod', 'ode']
['code']

Getting Unique Substrings​

If your string contains repeated characters, you may get duplicate substrings. Use a set to keep only unique ones:

s = "abab"

all_substrings = [s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)]
unique_substrings = sorted(set(all_substrings))

print(f"All substrings ({len(all_substrings)}): {all_substrings}")
print(f"Unique substrings ({len(unique_substrings)}): {unique_substrings}")

Output:

All substrings (10):    ['a', 'ab', 'aba', 'abab', 'b', 'ba', 'bab', 'a', 'ab', 'b']
Unique substrings (7): ['a', 'ab', 'aba', 'abab', 'b', 'ba', 'bab']

Common Mistake: Off-by-One Errors in Range Boundaries​

A frequent error is using incorrect range boundaries, resulting in missing substrings or including empty strings.

Wrong approach: missing full-length substring.

s = "abc"

# Using range(len(s)) instead of range(len(s) + 1) for j
substrings = [s[i:j] for i in range(len(s)) for j in range(i + 1, len(s))]

print(substrings)

Output:

['a', 'ab', 'b']

The substrings "abc", "bc", and "c" are missing because j never reaches len(s).

Correct approach: use len(s) + 1.

s = "abc"

substrings = [s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)]

print(substrings)

Output:

['a', 'ab', 'abc', 'b', 'bc', 'c']
Remember the slicing rule

Python's s[i:j] extracts characters from index i up to but not including index j. To include the last character of the string, j must be able to reach len(s), which means the range must go up to len(s) + 1.

Using a Generator for Memory Efficiency​

For long strings, storing all substrings in a list can consume significant memory. A generator yields substrings one at a time:

def all_substrings(s):
"""Yield all non-empty substrings of the given string."""
for i in range(len(s)):
for j in range(i + 1, len(s) + 1):
yield s[i:j]


s = "Python"

# Process substrings without storing all in memory
count = 0
for sub in all_substrings(s):
count += 1

print(f"Total substrings of '{s}': {count}")

Output:

Total substrings of 'Python': 21
When to use generators

For a string of length n, there are n(n+1)/2 substrings. A string of length 1,000 produces 500,500 substrings. If you're only processing them (for example, searching or counting), a generator avoids storing them all in memory simultaneously.

Creating a Reusable Function​

A complete, production-ready function with options for filtering and uniqueness:

def get_substrings(
s: str,
min_length: int = 1,
max_length: int | None = None,
unique: bool = False,
) -> list[str]:
"""Generate all substrings of a string with optional filtering.

Args:
s: The input string.
min_length: Minimum substring length (default: 1).
max_length: Maximum substring length (default: length of s).
unique: If True, return only unique substrings.

Returns:
A list of substrings matching the criteria.
"""
if max_length is None:
max_length = len(s)

substrings = [
s[i:j]
for i in range(len(s))
for j in range(i + 1, len(s) + 1)
if min_length <= j - i <= max_length
]

if unique:
# Preserve order while removing duplicates
seen = set()
substrings = [x for x in substrings if not (x in seen or seen.add(x))]

return substrings


# All substrings
print(get_substrings("abc"))

# Only substrings of length 2
print(get_substrings("abc", min_length=2, max_length=2))

# Unique substrings only
print(get_substrings("abab", unique=True))

Output:

['a', 'ab', 'abc', 'b', 'bc', 'c']
['ab', 'bc']
['a', 'ab', 'aba', 'abab', 'b', 'ba', 'bab']

Performance Considerations​

String LengthTotal SubstringsNotes
1055Instant
1005,050Fast
1,000500,500Use generators if possible
10,00050,005,000Memory-intensive; generators recommended

All methods have O(n²) time complexity since generating all substrings requires visiting every valid (i, j) pair. Space complexity is O(n²) when storing results in a list, or O(1) when using a generator.

Quick Comparison of Methods​

MethodConcisenessFlexibilityBest For
List comprehension⭐⭐⭐ High⭐⭐ MediumMost use cases (recommended)
Nested for loops⭐⭐ Medium⭐⭐⭐ HighAdding extra logic during iteration
itertools.combinations()⭐⭐⭐ High⭐⭐ MediumMathematically expressive code
Generator function⭐⭐ Medium⭐⭐⭐ HighLarge strings, memory efficiency

Conclusion​

Generating all substrings of a string in Python is straightforward with the right approach:

  • List comprehension is the most Pythonic and recommended method for most scenarios.
  • Nested loops offer maximum flexibility for adding filtering or transformation logic.
  • itertools.combinations() elegantly maps index pairs to substrings.
  • Generators are essential for large strings where memory efficiency matters.
  • Always use range(i + 1, len(s) + 1) for the end index to avoid missing substrings: remember that Python slicing excludes the upper bound.
  • Use set() to eliminate duplicates when the string contains repeated characters.