How to Calculate Cumulative Sum (Running Total) in Python

A cumulative sum (or running total) is a sequence in which each element is the sum of all elements of the original sequence up to and including that position. For the list [1, 2, 3], the cumulative sum is [1, 3, 6]. This calculation is fundamental in data science, financial analysis (for example, cumulative stock returns), and signal processing.

This guide explores the most efficient ways to calculate cumulative sums in Python, ranging from the standard library to high-performance numerical computing.

Understanding Cumulative Sum

Mathematically, for a sequence A, the cumulative sum C at index n is: C[n] = A[0] + A[1] + ... + A[n]

Visual Representation:

Original:   [1, 2, 3,  4,  5]
Cumulative: [1, 3, 6, 10, 15]

Each cumulative value adds one more element: 1, then 1+2 = 3, then 1+2+3 = 6, then 1+2+3+4 = 10, then 1+2+3+4+5 = 15.

Method 1: Using itertools.accumulate (Standard Library)

For pure Python (without external libraries like NumPy), the most efficient and "Pythonic" method is itertools.accumulate. It returns an iterator that yields each running total lazily, so the full result is only held in memory if you convert it to a list.

from itertools import accumulate

data = [1, 2, 3, 4, 5]

# ✅ Efficient: Returns an iterator
running_total_iter = accumulate(data)

# Convert to list to view results
result = list(running_total_iter)

print(f"Original: {data}")
print(f"Cumulative: {result}")

Output:

Original: [1, 2, 3, 4, 5]
Cumulative: [1, 3, 6, 10, 15]

Tip: itertools.accumulate is versatile. By default, it sums, but you can pass other two-argument functions as the second argument (like max for a running maximum or operator.mul for a running product).
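
As a minimal sketch of the running maximum and running product mentioned above, the example below passes max and operator.mul as the second argument to accumulate. The sample list is made up for illustration.

```python
from itertools import accumulate
import operator

# Illustrative sample data (not from any real dataset)
data = [3, 1, 4, 1, 5]

# Running maximum: each element is the largest value seen so far
running_max = list(accumulate(data, max))

# Running product: each element is the product of all values so far
running_product = list(accumulate(data, operator.mul))

print(running_max)      # [3, 3, 4, 4, 5]
print(running_product)  # [3, 3, 12, 12, 60]
```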

Method 2: Using NumPy cumsum (High Performance)

When working with large datasets (thousands or millions of records), NumPy is the industry standard. The np.cumsum() function is implemented in C and is significantly faster than Python loops.

import numpy as np

# Define a NumPy array
arr = np.array([10, 20, 30, 40, 50])

# ✅ Efficient: Vectorized cumulative sum
numpy_result = np.cumsum(arr)

print(f"NumPy Result: {numpy_result}")
print(f"Type: {type(numpy_result)}")

Output:

NumPy Result: [ 10  30  60 100 150]
Type: <class 'numpy.ndarray'>

Note: If you are doing financial analysis (e.g., calculating cumulative stock returns), NumPy or pandas (which is built on NumPy) is the preferred choice because of its speed and convenient array operations.
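
One possible way to apply np.cumsum to stock returns is sketched below. It assumes you work with log returns, which add up over time, and converts the result back to simple cumulative returns. The prices are hypothetical values chosen only for illustration, and this is one common approach rather than the only one.

```python
import numpy as np

# Hypothetical closing prices, for illustration only
prices = np.array([100.0, 102.0, 101.0, 105.0])

# Daily log returns: log(P_t / P_{t-1})
log_returns = np.diff(np.log(prices))

# Log returns are additive, so cumsum gives the cumulative log return
cumulative_log_returns = np.cumsum(log_returns)

# Convert back to simple cumulative returns: exp(x) - 1
cumulative_returns = np.expm1(cumulative_log_returns)

print(cumulative_returns)
```

Each value should match the return measured directly against the first price, that is, prices[t] / prices[0] - 1, up to floating-point rounding.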

Method 3: Iterative Approach (Educational)

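If you cannot import modules or want to understand the logic, you can write a manual loop. It visits each element once, so it runs in O(N) (linear) time, the same complexity as itertools.accumulate, although a pure-Python loop is slower in practice than NumPy on large arrays.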

def manual_cumsum(arr):
    # Keep a running total and record it after each element
    cumsum = []
    total = 0
    for num in arr:
        total += num
        cumsum.append(total)
    return cumsum

data = [1, 2, 3, 4, 5]
result = manual_cumsum(data)

print(f"Manual Loop: {result}")

Output:

Manual Loop: [1, 3, 6, 10, 15]
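
A manual loop is most useful when the rule is not a plain sum. The sketch below is a hypothetical variation: the function name cumsum_with_reset and the reset rule (restart the total whenever a sentinel value appears) are assumptions made up for illustration, not part of any library.

```python
def cumsum_with_reset(values, reset_on=0):
    """Running total that restarts whenever `reset_on` is seen (hypothetical rule)."""
    totals = []
    total = 0
    for value in values:
        if value == reset_on:
            total = 0          # sentinel found: start a new running total
        else:
            total += value
        totals.append(total)
    return totals

print(cumsum_with_reset([1, 2, 0, 4, 5]))  # [1, 3, 0, 4, 9]
```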

Common Pitfall: Inefficient List Comprehension

A common mistake is trying to force a one-liner using list comprehension and slicing. While concise, this approach is extremely inefficient for large lists.

Problem: calling sum(data[:i+1]) inside the comprehension copies a slice and re-adds it from scratch for every element. The total work is 1 + 2 + ... + N = N(N+1)/2 additions, which gives a time complexity of O(N^2) (quadratic).

data = [1, 2, 3, 4, 5]

# ⛔️ Inefficient: Slicing and re-summing repeatedly
# DO NOT USE for large datasets
bad_performance = [sum(data[:i+1]) for i in range(len(data))]

print(bad_performance)

Output:

[1, 3, 6, 10, 15]

Warning: Avoid the slicing method [sum(arr[:i+1])...] for data processing. As the list grows, the execution time grows quadratically: doubling the list length roughly quadruples the work. Use itertools or numpy instead.
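
To make the quadratic cost concrete, the sketch below counts how many elements the slicing approach has to add up for a list of length n, instead of timing it (timings vary by machine). The helper names are hypothetical and exist only for this illustration.

```python
from itertools import accumulate

def cumsum_slicing(values):
    # Quadratic approach: re-sums a growing slice for every index
    return [sum(values[:i + 1]) for i in range(len(values))]

def elements_summed_by_slicing(n):
    # The slice at index i has i + 1 elements, so the total is n * (n + 1) / 2
    return sum(i + 1 for i in range(n))

# Both approaches give the same answer; only the amount of work differs
sample = [1, 2, 3, 4, 5]
print(cumsum_slicing(sample) == list(accumulate(sample)))  # True

for n in (10, 100, 1000):
    # accumulate touches each element once (n), slicing touches n * (n + 1) / 2
    print(n, elements_summed_by_slicing(n))
```

For n = 1000 the slicing version adds up 500500 elements, while a single pass touches only 1000.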

Conclusion

To calculate cumulative sums in Python:

  1. Use numpy.cumsum(arr) for large numeric datasets, scientific computing, or data science tasks.
  2. Use itertools.accumulate(arr) for standard Python scripts where you want an efficient, memory-friendly iterator without external dependencies.
  3. Use a for loop if you need simple, custom logic and want to avoid imports.
  4. Avoid slicing within list comprehensions due to poor performance.