How to Calculate the Average of a List of Floats in Python

Finding the arithmetic mean (average) is one of the most common operations in programming. Python offers several approaches depending on your needs: built-in functions for simplicity, the statistics module for correctness, and NumPy for performance.

Using Built-in `sum()` and `len()`

For standard lists, combining sum() and len() is the most straightforward native approach.

prices = [19.99, 5.50, 4.25, 10.00]

average = sum(prices) / len(prices)

print(f"Average Price: ${average:.2f}")

Output:

Average Price: $9.93
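
When a list is long or mixes very large and very small values, rounding error in the running total can start to matter. As a minimal sketch (the `readings` list is made-up illustration data, not from the example above), the standard library's `math.fsum()` can be swapped in for `sum()` to keep the total accurately rounded:

```python
import math

# Hypothetical data: ten copies of 0.1, a value with no exact binary representation.
readings = [0.1] * 10

# math.fsum() tracks intermediate rounding error, so the total comes out as exactly 1.0.
total = math.fsum(readings)
average = total / len(readings)

print(total)    # 1.0
print(average)  # 0.1
```

For short lists of prices like the one above, plain `sum()` is usually accurate enough; `math.fsum()` is a drop-in option when precision matters more than speed.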

Empty List Protection

An empty list has a length of zero, so sum(numbers) / len(numbers) raises a ZeroDivisionError. Check for an empty list before dividing:

def calculate_average(numbers: list[float]) -> float | None:
    if not numbers:
        return None
    return sum(numbers) / len(numbers)


result = calculate_average([])
print(result)  # None
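
Returning None pushes the check onto every caller. One possible alternative, sketched here under the assumption that an empty list is a programming error in your application, is to raise an explicit exception instead. The name `average_or_raise` is hypothetical and used only for illustration:

```python
def average_or_raise(numbers: list[float]) -> float:
    """Hypothetical variant: reject empty input with an explicit error."""
    if not numbers:
        raise ValueError("cannot average an empty list")
    return sum(numbers) / len(numbers)


try:
    average_or_raise([])
except ValueError as exc:
    print(f"Error: {exc}")  # Error: cannot average an empty list

print(average_or_raise([2.0, 4.0]))  # 3.0
```

Which behavior fits better depends on whether "no data" is an expected situation (return None or a default) or a bug (raise).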

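Using a Conditional Expression

For a one-line guard, a conditional expression can supply a default value (here 0.0) when the list is empty: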

prices = [19.99, 5.50, 4.25, 10.00]

average = sum(prices) / len(prices) if prices else 0.0

print(f"Average: {average:.2f}")  # Output: Average: 9.93

Using the `statistics` Module

Python's standard library includes a statistics module designed for statistical calculations. Its mean() function raises a StatisticsError on empty input rather than a ZeroDivisionError, and it makes the intent of the code obvious to readers.

import statistics

data = [1.5, 2.5, 3.5, 4.5, 5.5]

average = statistics.mean(data)

print(f"Mean: {average}")

Output:

Mean: 3.5
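
Two details of the module are worth seeing in code. The sketch below reuses the same `data` list and shows `statistics.fmean()` (available since Python 3.8), which converts values to float and is generally faster than `mean()`, and the `StatisticsError` raised for empty input:

```python
import statistics

data = [1.5, 2.5, 3.5, 4.5, 5.5]

# fmean() always works in floating point and returns a float.
fast_average = statistics.fmean(data)
print(f"fmean: {fast_average}")  # fmean: 3.5

# Both mean() and fmean() raise StatisticsError when given no data.
try:
    statistics.mean([])
except statistics.StatisticsError as exc:
    print(f"Empty input rejected: {exc}")
```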

Additional Statistical Functions

The statistics module offers more than just the mean:

import statistics

data = [2.5, 3.0, 3.5, 4.0, 4.5, 100.0]  # Note the outlier

print(f"Mean:   {statistics.mean(data):.2f}")
print(f"Median: {statistics.median(data):.2f}")
print(f"Stdev:  {statistics.stdev(data):.2f}")

Output:

Mean:   19.58
Median: 3.75
Stdev:  39.40

When to Use Median

When your data contains outliers (like 100.0 above), the median often represents the "typical" value better than the mean.
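
To make the difference concrete, this small sketch computes both measures with and without the outlier, using the same values as the example above. The variable names are illustrative:

```python
import statistics

typical = [2.5, 3.0, 3.5, 4.0, 4.5]
with_outlier = typical + [100.0]

# How far does each measure move when the single outlier is added?
mean_shift = statistics.mean(with_outlier) - statistics.mean(typical)
median_shift = statistics.median(with_outlier) - statistics.median(typical)

print(f"Mean shift:   {mean_shift:.2f}")    # Mean shift:   16.08
print(f"Median shift: {median_shift:.2f}")  # Median shift: 0.25
```

One extreme value moves the mean by about 16 but the median by only 0.25.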

Using NumPy for Large Datasets

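For datasets with millions of elements, working on Python lists becomes comparatively slow because every element is a separate float object. NumPy stores values in a contiguous typed array and runs its numerical operations in compiled C code.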

import numpy as np

data = np.array([1.1, 2.2, 3.3, 4.4, 5.5])

average = np.mean(data)

print(f"Average: {average}")

Output:

Average: 3.3

Working with Large Datasets

import numpy as np

# Generate 1 million random values
large_dataset = np.random.uniform(0, 100, size=1_000_000)

average = np.mean(large_dataset)
print(f"Average of 1M values: {average:.4f}")

Output (the data is random, so your value will differ slightly):

Average of 1M values: 49.9872

NumPy with Multi-dimensional Data

import numpy as np

# Sales data: rows = months, columns = products
sales = np.array([
    [100.5, 200.3, 150.2],
    [110.2, 190.5, 160.8],
    [105.8, 210.1, 155.5]
])

# Average across all values
total_avg = np.mean(sales)

# Average per product (column)
product_avg = np.mean(sales, axis=0)

# Average per month (row)
monthly_avg = np.mean(sales, axis=1)

print(f"Overall average: {total_avg:.2f}")
print(f"Per product: {product_avg}")
print(f"Per month: {monthly_avg}")

Output:

Overall average: 153.77
Per product: [105.5 200.3 155.5]
Per month: [150.33333333 153.83333333 157.13333333]
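
If the axis argument feels abstract, it may help to see the same reductions written without NumPy. The following is only an illustrative pure-Python equivalent (the `sales_rows` name is an assumption for this sketch), not how NumPy computes the result internally:

```python
import statistics

# Same layout as the sales array: rows = months, columns = products.
sales_rows = [
    [100.5, 200.3, 150.2],
    [110.2, 190.5, 160.8],
    [105.8, 210.1, 155.5],
]

# axis=0 collapses the rows: one average per column (product).
per_product = [statistics.fmean(column) for column in zip(*sales_rows)]

# axis=1 collapses the columns: one average per row (month).
per_month = [statistics.fmean(row) for row in sales_rows]

print([round(v, 2) for v in per_product])
print([round(v, 2) for v in per_month])
```

The axis number names the dimension that disappears: averaging over axis 0 removes the month dimension and leaves one value per product.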

Performance Comparison

import time
import statistics
import numpy as np

# Create test data
data_list = [float(i) for i in range(1_000_000)]
data_array = np.array(data_list)


def benchmark(name, func):
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.4f}s (result: {result:.2f})")


benchmark("sum/len", lambda: sum(data_list) / len(data_list))
benchmark("statistics.mean", lambda: statistics.mean(data_list))
benchmark("numpy.mean", lambda: np.mean(data_array))

Typical output (timings vary by machine and Python version):

sum/len: 0.0080s (result: 499999.50)
statistics.mean: 0.3006s (result: 499999.50)
numpy.mean: 0.0007s (result: 499999.50)
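
A single timed run can be noisy. As a hedged alternative, the sketch below uses the standard library's `timeit.repeat()` and keeps the fastest of several runs. It covers only the standard-library options (including `statistics.fmean()`), and it uses a smaller list so it finishes quickly. Note that the NumPy timing above measures `np.mean` on an array that already exists, so it does not include the cost of converting a list to an array.

```python
import statistics
import timeit

data_list = [float(i) for i in range(100_000)]

candidates = {
    "sum/len": lambda: sum(data_list) / len(data_list),
    "statistics.mean": lambda: statistics.mean(data_list),
    "statistics.fmean": lambda: statistics.fmean(data_list),
}

results = {}
for name, func in candidates.items():
    # repeat=5, number=1: five independent single runs; keep the fastest.
    best = min(timeit.repeat(func, repeat=5, number=1))
    results[name] = func()
    print(f"{name}: best of 5 = {best:.5f}s (result: {results[name]:.2f})")
```

Taking the minimum rather than the average of the runs reduces the influence of other processes competing for the CPU.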

Handling Special Cases

Filtering Before Averaging

import statistics

scores = [85.5, 90.0, None, 78.5, None, 92.0]

# Filter out None values
valid_scores = [s for s in scores if s is not None]

if valid_scores:
    average = statistics.mean(valid_scores)
    print(f"Average score: {average:.1f}")

Output:

Average score: 86.5
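
Missing values sometimes arrive as NaN rather than None, and a single NaN turns the mean into NaN. Assuming your data may contain both, a sketch of a combined filter using `math.isnan()` could look like this (the extra NaN entry is added purely for illustration):

```python
import math
import statistics

# Hypothetical readings with two kinds of gaps: None and NaN.
scores = [85.5, 90.0, None, float("nan"), 78.5, 92.0]

# Check for None first, because math.isnan(None) would raise a TypeError.
valid_scores = [s for s in scores if s is not None and not math.isnan(s)]

if valid_scores:
    average = statistics.mean(valid_scores)
    print(f"Average score: {average:.1f}")  # Average score: 86.5
```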

Weighted Average

def weighted_average(values: list[float], weights: list[float]) -> float:
    """Calculate weighted average."""
    if len(values) != len(weights):
        raise ValueError("Values and weights must have same length")

    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)


grades = [85.0, 90.0, 78.0]
weights = [0.3, 0.5, 0.2]  # 30%, 50%, 20%

result = weighted_average(grades, weights)
print(f"Weighted average: {result:.1f}")

Output:

Weighted average: 86.1
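
The function above divides by sum(weights), so weights that add up to zero (or two empty lists) would raise a ZeroDivisionError. A possible hardened variant, with the hypothetical name `safe_weighted_average`, adds that check and uses `math.fsum()` for the totals:

```python
import math


def safe_weighted_average(values: list[float], weights: list[float]) -> float:
    """Hypothetical variant of weighted_average with an extra input check."""
    if len(values) != len(weights):
        raise ValueError("Values and weights must have same length")
    total_weight = math.fsum(weights)
    if total_weight == 0:
        # Also covers the case where both lists are empty.
        raise ValueError("Weights must not sum to zero")
    return math.fsum(v * w for v, w in zip(values, weights)) / total_weight


# Weights do not need to sum to 1; here they are given as 3 : 5 : 2.
print(f"{safe_weighted_average([85.0, 90.0, 78.0], [3, 5, 2]):.1f}")  # 86.1
```

On Python 3.11 and later, `statistics.fmean()` also accepts an optional `weights` argument, which may remove the need for a hand-written helper.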

Method Comparison

Method	Speed	Dependencies	Best For
`sum()/len()`	Fast	None	Simple scripts, small data
`statistics.mean()`	Slowest on large lists	None (stdlib)	Readability, additional stats
`numpy.mean()`	Fastest on large arrays	NumPy	Large datasets, data science

Summary

Use sum()/len() for simple cases with small lists, and remember to guard against empty lists.
Use statistics.mean() when readability matters or you need additional statistical functions.
Use numpy.mean() for large datasets or when already working in a data science context.
