How to Calculate the Average of a List of Floats in Python

Finding the arithmetic mean (average) is one of the most common operations in programming. Python offers several approaches depending on your needs: built-in functions for simplicity, the statistics module for correctness, and NumPy for performance.

Using Built-in sum() and len()

For standard lists, combining sum() and len() is the most straightforward native approach.

prices = [19.99, 5.50, 4.25, 10.00]

average = sum(prices) / len(prices)

print(f"Average Price: ${average:.2f}")

Output:

Average Price: $9.93

Empty List Protection

Dividing by zero raises a ZeroDivisionError. Always validate your input:

def calculate_average(numbers: list[float]) -> float | None:
    # An empty list would make len(numbers) zero, so return None instead
    if not numbers:
        return None
    return sum(numbers) / len(numbers)


result = calculate_average([])
print(result) # None

Using a Conditional Expression

prices = [19.99, 5.50, 4.25, 10.00]

average = sum(prices) / len(prices) if prices else 0.0

print(f"Average: {average:.2f}") # Output: Average: 9.93

Using the statistics Module

Python's standard library includes a statistics module designed for statistical calculations. Its mean() function raises a descriptive StatisticsError on empty input instead of a ZeroDivisionError, and it states the intent clearly at the call site.

import statistics

data = [1.5, 2.5, 3.5, 4.5, 5.5]

average = statistics.mean(data)

print(f"Mean: {average}")

Output:

Mean: 3.5
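Because statistics.mean() raises statistics.StatisticsError when given an empty list, the empty-input case can be handled with an exception handler rather than a manual length check. The sketch below assumes a fallback value is acceptable for your use case; the helper name safe_mean and its default parameter are hypothetical.

```python
import statistics


def safe_mean(data: list[float], default: float = 0.0) -> float:
    """Return the mean of data, or `default` when data is empty."""
    try:
        return statistics.mean(data)
    except statistics.StatisticsError:
        # Raised by statistics.mean() when data has no elements
        return default


print(safe_mean([1.5, 2.5, 3.5]))  # 2.5
print(safe_mean([]))               # 0.0
```

Returning a default can hide missing data, so re-raising or returning None may be the better choice when an empty list indicates a problem upstream.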

Additional Statistical Functions

The statistics module offers more than just the mean:

import statistics

data = [2.5, 3.0, 3.5, 4.0, 4.5, 100.0] # Note the outlier

print(f"Mean: {statistics.mean(data):.2f}")
print(f"Median: {statistics.median(data):.2f}")
print(f"Stdev: {statistics.stdev(data):.2f}")

Output:

Mean: 19.58
Median: 3.75
Stdev: 39.40

When to Use Median

When your data contains outliers (like 100.0 above), the median often represents the "typical" value better than the mean.
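To make the effect concrete, the following sketch computes both measures on the same data with and without the 100.0 outlier. The variable names are illustrative only.

```python
import statistics

typical = [2.5, 3.0, 3.5, 4.0, 4.5]
with_outlier = typical + [100.0]

for label, data in [("without outlier", typical), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{label}: mean={mean:.2f}, median={median:.2f}")

# without outlier: mean=3.50, median=3.50
# with outlier: mean=19.58, median=3.75
```

A single extreme value moves the mean from 3.50 to 19.58, while the median only shifts from 3.50 to 3.75.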

Using NumPy for Large Datasets

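For lists with millions of elements, pure-Python approaches become comparatively slow because every element is a separate Python object. NumPy stores values in a contiguous, typed array and performs the reduction in compiled C code.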

import numpy as np

data = np.array([1.1, 2.2, 3.3, 4.4, 5.5])

average = np.mean(data)

print(f"Average: {average}")

Output:

Average: 3.3

Working with Large Datasets

import numpy as np

# Generate 1 million random values
large_dataset = np.random.uniform(0, 100, size=1_000_000)

average = np.mean(large_dataset)
print(f"Average of 1M values: {average:.4f}")

Output (your value will differ because the data is random):

Average of 1M values: 49.9872

NumPy with Multi-dimensional Data

import numpy as np

# Sales data: rows = months, columns = products
sales = np.array([
    [100.5, 200.3, 150.2],
    [110.2, 190.5, 160.8],
    [105.8, 210.1, 155.5],
])

# Average across all values
total_avg = np.mean(sales)

# Average per product (column)
product_avg = np.mean(sales, axis=0)

# Average per month (row)
monthly_avg = np.mean(sales, axis=1)

print(f"Overall average: {total_avg:.2f}")
print(f"Per product: {product_avg}")
print(f"Per month: {monthly_avg}")

Output:

Overall average: 153.77
Per product: [105.5 200.3 155.5]
Per month: [150.33333333 153.83333333 157.13333333]

Performance Comparison

import time
import statistics
import numpy as np

# Create test data
data_list = [float(i) for i in range(1_000_000)]
data_array = np.array(data_list)


def benchmark(name, func):
    # Time a single call with a high-resolution clock
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.4f}s (result: {result:.2f})")


benchmark("sum/len", lambda: sum(data_list) / len(data_list))
benchmark("statistics.mean", lambda: statistics.mean(data_list))
benchmark("numpy.mean", lambda: np.mean(data_array))

Typical output (exact timings depend on your hardware and on your Python and NumPy versions):

sum/len: 0.0080s (result: 499999.50)
statistics.mean: 0.3006s (result: 499999.50)
numpy.mean: 0.0007s (result: 499999.50)
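A single timed call can be noisy. As a hedged alternative, the standard library's timeit module can repeat each measurement and keep the best run. The sketch below also includes statistics.fmean(), a float-only variant of mean() available since Python 3.8 that the comparison above does not cover. The list size and repeat counts are arbitrary choices, and no timings are shown because they vary by machine.

```python
import statistics
import timeit

data = [float(i) for i in range(100_000)]

candidates = {
    "sum/len": lambda: sum(data) / len(data),
    "statistics.mean": lambda: statistics.mean(data),
    "statistics.fmean": lambda: statistics.fmean(data),
}

timings = {}
for name, func in candidates.items():
    # repeat() returns one total time per repetition; the minimum is the least noisy
    best = min(timeit.repeat(func, number=5, repeat=3))
    timings[name] = best / 5  # average seconds per call within the best repetition
    print(f"{name}: {timings[name]:.6f}s per call")
```

If the data is already a list of floats and NumPy is not available, fmean() is worth measuring as a middle ground between sum()/len() and statistics.mean().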

Handling Special Cases

Filtering Before Averaging

import statistics

scores = [85.5, 90.0, None, 78.5, None, 92.0]

# Filter out None values
valid_scores = [s for s in scores if s is not None]

if valid_scores:
    average = statistics.mean(valid_scores)
    print(f"Average score: {average:.1f}")

Output:

Average score: 86.5
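Missing values sometimes arrive as float("nan") rather than None, and a single NaN makes sum()/len() return nan. The same filtering idea can be extended with math.isnan(), as in this sketch. The readings list is made-up sample data.

```python
import math
import statistics

readings = [85.5, float("nan"), 90.0, None, 78.5, 92.0]

# Keep only real numbers: drop None first, then drop NaN
valid = [r for r in readings if r is not None and not math.isnan(r)]

if valid:
    print(f"Average reading: {statistics.mean(valid):.1f}")  # Average reading: 86.5
```

The None check comes first because math.isnan(None) would raise a TypeError.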

Weighted Average

def weighted_average(values: list[float], weights: list[float]) -> float:
    """Calculate weighted average."""
    if len(values) != len(weights):
        raise ValueError("Values and weights must have same length")

    # Sum of value * weight, divided by the total weight
    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)


grades = [85.0, 90.0, 78.0]
weights = [0.3, 0.5, 0.2] # 30%, 50%, 20%

result = weighted_average(grades, weights)
print(f"Weighted average: {result:.1f}")

Output:

Weighted average: 86.1
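If NumPy is already a dependency, its np.average() function accepts a weights argument and computes the same quantity. This sketch reuses the grades and weights from the example above and assumes NumPy is installed.

```python
import numpy as np

grades = [85.0, 90.0, 78.0]
weights = [0.3, 0.5, 0.2]  # 30%, 50%, 20%

# np.average divides the weighted sum by the sum of the weights
result = np.average(grades, weights=weights)
print(f"Weighted average: {result:.1f}")  # Weighted average: 86.1
```

The weights do not need to sum to 1, because the result is normalized by their total.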

Method Comparison

Method              Speed       Dependencies    Best For
sum()/len()         Fast        None            Simple scripts, small data
statistics.mean()   Moderate    None (stdlib)   Readability, additional stats
numpy.mean()        Ultra Fast  NumPy           Large datasets, data science

Summary

  • Use sum()/len() for simple cases with small lists, but remember to check for empty lists first.
  • Use statistics.mean() when readability matters or you need additional statistical functions.
  • Use numpy.mean() for large datasets or when already working in a data science context.