Skip to main content

How to Find the Longest Sequence in a Collection in Python

Finding the longest sequence in a collection of data is a classic algorithmic problem. "Longest sequence" can refer to multiple things: the longest consecutive run of numbers (e.g., 1, 2, 3, 4 in [10, 1, 2, 3, 4, 20]), the longest increasing subsequence, or simply the longest repeating element.

This guide explores the most common interpretation: Finding the longest consecutive sequence in an unsorted array, and also covers finding the longest increasing subsequence.

Scenario 1: Longest Consecutive Sequence (Unsorted)

Problem: Given [100, 4, 200, 1, 3, 2], the longest consecutive elements sequence is [1, 2, 3, 4]. The length is 4.

Efficient Solution (O(N)): Use a set for instant lookups. Iterate through the set, and for every number x that is the start of a sequence (meaning x-1 is not in the set), count upwards.

def longest_consecutive(nums):
if not nums:
return 0

num_set = set(nums)
longest_streak = 0

for num in num_set:
# Only start counting if 'num' is the start of a sequence
if num - 1 not in num_set:
current_num = num
current_streak = 1

# Count upwards
while current_num + 1 in num_set:
current_num += 1
current_streak += 1

longest_streak = max(longest_streak, current_streak)

return longest_streak

# Example
data = [100, 4, 200, 1, 3, 2]
print(f"Longest consecutive length: {longest_consecutive(data)}")

Output:

Longest consecutive length: 4

Scenario 2: Longest Increasing Subsequence

Problem: Given [10, 9, 2, 5, 3, 7, 101, 18], the longest increasing subsequence is [2, 3, 7, 101] (or [2, 5, 7, 18], etc).

Dynamic Programming Solution (O(N^2)): For each element, find the longest subsequence ending at that element by checking all previous elements.

def length_of_lis(nums):
if not nums:
return 0

# dp[i] stores the length of the longest increasing subsequence ending at index i
dp = [1] * len(nums)

for i in range(len(nums)):
for j in range(i):
if nums[i] > nums[j]:
dp[i] = max(dp[i], dp[j] + 1)

return max(dp)

data = [10, 9, 2, 5, 3, 7, 101, 18]
print(f"Longest Increasing Subsequence Length: {length_of_lis(data)}")

Output:

Longest Increasing Subsequence Length: 4
tip

There is a more optimized O(N log N) solution using binary search (bisect module) for extremely large lists.

Scenario 3: Longest Run of Identical Elements

Problem: Given [1, 1, 0, 1, 1, 1, 0], finding the longest run of 1s (length 3). This is common in signal processing.

Solution: itertools.groupby is perfect for this.

from itertools import groupby

data = [1, 1, 0, 1, 1, 1, 0, 1]

# Find the max length of consecutive identical items
max_len = 0
max_val = None

for key, group in groupby(data):
# Convert iterator to list to get length
length = len(list(group))
if length > max_len:
max_len = length
max_val = key

print(f"Longest run is {max_len} consecutive '{max_val}'s")

Output:

Longest run is 3 consecutive '1's

Conclusion

To find the longest sequence in Python:

  1. Consecutive Integers (Unsorted): Use a set and iterate logic (if num-1 not in set).
  2. Increasing Subsequence: Use Dynamic Programming (dp array).
  3. Consecutive Identical Items: Use itertools.groupby().