How to Compute Weighted Averages in Python
A weighted average is a statistical measure in which different values in a dataset carry different levels of importance, or weights. Unlike a simple average, where every value counts equally, a weighted average is the right tool for tasks like calculating a GPA (where a 4-credit course counts more than a 1-credit course) or the return of a financial portfolio.
The mathematical formula is: Weighted Average = Sum(Value × Weight) / Sum(Weights)
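As a worked illustration of the formula, take three grades of 90, 85, and 88 with credit weights of 3, 4, and 2. The symbols x_i (values) and w_i (weights) are notation introduced here for convenience:

```latex
\bar{x}_w = \frac{\sum_i w_i x_i}{\sum_i w_i}
          = \frac{90 \cdot 3 + 85 \cdot 4 + 88 \cdot 2}{3 + 4 + 2}
          = \frac{786}{9} \approx 87.33
```

The simple average of the same three grades is about 87.67, so the heavier 4-credit course pulls the weighted result slightly toward 85.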
This guide explains how to implement this in Python, first in plain Python with no dependencies and then with the NumPy library.
Method 1: Manual Calculation (Pure Python)
For simple scripts without external dependencies, you can compute the weighted average using basic lists and loops (or generator expressions).
Step-by-Step Implementation:
- Verify the two lists (values and weights) have the same length.
- Multiply each value by its corresponding weight.
- Sum these products.
- Divide by the sum of the weights.
def weighted_average(values, weights):
    if len(values) != len(weights):
        raise ValueError("Values and weights must have equal length")
    if sum(weights) == 0:
        raise ZeroDivisionError("Sum of weights cannot be zero")
    # 1. Multiply corresponding elements
    # 2. Sum the products
    weighted_sum = sum(v * w for v, w in zip(values, weights))
    # 3. Sum the weights
    total_weight = sum(weights)
    return weighted_sum / total_weight

# Example: Grades and Credits
grades = [90, 85, 88]
credits = [3, 4, 2]
result = weighted_average(grades, credits)
print(f"Weighted Average: {result:.2f}")
Output:
Weighted Average: 87.33
Method 2: Using NumPy (Recommended)
For data analysis or large datasets, numpy.average is the standard tool. It is concise, fast, and supports multi-dimensional arrays through its axis parameter.
import numpy as np
grades = np.array([90, 85, 88])
credits = np.array([3, 4, 2])
# ✅ Correct: Using np.average
result = np.average(grades, weights=credits)
print(f"NumPy Weighted Average: {result:.2f}")
Output:
NumPy Weighted Average: 87.33
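For multi-dimensional data, np.average takes an axis argument that selects the dimension to average over. The sketch below uses made-up scores for two students (the array contents are illustrative assumptions, not data from this guide) and applies one weight per course:

```python
import numpy as np

# Hypothetical data: each row is one student, each column is one course.
scores = np.array([
    [90, 85, 88],
    [70, 95, 80],
])
credits = np.array([3, 4, 2])  # one weight per course (column)

# axis=1 averages across the columns of each row: one result per student.
per_student = np.average(scores, axis=1, weights=credits)
print(np.round(per_student, 2))  # [87.33 83.33]

# returned=True additionally returns the sum of the weights that was used.
avg, weight_sum = np.average(scores[0], weights=credits, returned=True)
print(weight_sum)  # 9.0
```

When axis is given and the weights are one-dimensional, their length has to match the size of the array along that axis.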
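Because np.average runs in vectorized, compiled code, it is typically much faster than a pure Python loop on large arrays, often by a factor of 10 to 100. The exact speedup depends on array size, data types, and hardware, and for very short lists the overhead of creating arrays can outweigh the gain.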
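Speed differences are easiest to judge by measuring them on your own machine. The following is a minimal timing sketch using the standard timeit module; the array size, the random data, and the function names pure_python and with_numpy are arbitrary choices made for illustration, and the printed numbers will vary from system to system:

```python
import timeit

import numpy as np

# Arbitrary test data: 200,000 random values with strictly positive weights.
rng = np.random.default_rng(seed=0)
n = 200_000
values_arr = rng.random(n)
weights_arr = rng.random(n) + 0.1
values_list = values_arr.tolist()
weights_list = weights_arr.tolist()


def pure_python():
    # Same logic as the manual method: sum of products over sum of weights.
    return sum(v * w for v, w in zip(values_list, weights_list)) / sum(weights_list)


def with_numpy():
    return np.average(values_arr, weights=weights_arr)


# Run each version a few times and compare the total elapsed time.
t_py = timeit.timeit(pure_python, number=5)
t_np = timeit.timeit(with_numpy, number=5)
print(f"pure Python: {t_py:.4f} s, NumPy: {t_np:.4f} s, ratio: {t_py / t_np:.1f}x")
```

Note that the lists are converted to arrays before timing starts; if your data begins as Python lists, the conversion cost belongs in the comparison too.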
Practical Example: Calculating GPA
Let's apply this to a real-world scenario: calculating a semester GPA.
def weighted_average(values, weights):
    if len(values) != len(weights):
        raise ValueError("Values and weights must have equal length")
    if sum(weights) == 0:
        raise ZeroDivisionError("Sum of weights cannot be zero")
    # 1. Multiply corresponding elements
    # 2. Sum the products
    weighted_sum = sum(v * w for v, w in zip(values, weights))
    # 3. Sum the weights
    total_weight = sum(weights)
    return weighted_sum / total_weight

courses = [
    {'name': 'Mathematics', 'grade': 4.0, 'credits': 3},
    {'name': 'Physics', 'grade': 3.7, 'credits': 4},
    {'name': 'History', 'grade': 3.5, 'credits': 3}
]
# Extract lists
grades = [c['grade'] for c in courses]
credits = [c['credits'] for c in courses]
# Calculate
gpa = weighted_average(grades, credits) # Using our manual function
print(f"Semester GPA: {gpa:.2f}")
Output:
Semester GPA: 3.73
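If NumPy is already part of the project, the same GPA can be computed without the helper function. This sketch reuses the course data from the example above and only swaps the calculation step for np.average:

```python
import numpy as np

courses = [
    {'name': 'Mathematics', 'grade': 4.0, 'credits': 3},
    {'name': 'Physics', 'grade': 3.7, 'credits': 4},
    {'name': 'History', 'grade': 3.5, 'credits': 3},
]

# Build one array of grade points and one array of credit weights.
grades = np.array([c['grade'] for c in courses])
credits = np.array([c['credits'] for c in courses])

gpa = np.average(grades, weights=credits)
print(f"Semester GPA (NumPy): {gpa:.2f}")  # Semester GPA (NumPy): 3.73
```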
Handling Errors and Edge Cases
Robust code must handle invalid inputs.
- Mismatched Lengths: Values and weights must align.
- Zero Weights: The sum of weights cannot be zero (division by zero).
- Non-Numeric Types: Inputs should be numbers.
def robust_weighted_average(values, weights):
    try:
        if not values or not weights:
            raise ValueError("Input lists cannot be empty")
        if len(values) != len(weights):
            raise ValueError("List lengths mismatch")
        total_weight = sum(weights)
        if total_weight == 0:
            return 0.0  # Or raise error depending on requirements
        weighted_sum = sum(v * w for v, w in zip(values, weights))
        return weighted_sum / total_weight
    except (ValueError, TypeError) as e:
        print(f"Error: {e}")
        return None

# Test Error Case
robust_weighted_average([10, 20], [1])
Output:
Error: List lengths mismatch
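np.average performs some of these checks itself, which is worth knowing before wrapping it in extra validation. The sketch below uses a hypothetical helper, try_np_average (not part of NumPy), that returns either the result or the name of the exception NumPy raised. The exception type for mismatched lengths is left unstated because it may differ between NumPy versions:

```python
import numpy as np


def try_np_average(values, weights):
    """Return the weighted average, or the name of the exception NumPy raised."""
    try:
        return float(np.average(values, weights=weights))
    except (ZeroDivisionError, TypeError, ValueError) as exc:
        return type(exc).__name__


print(try_np_average([10, 20], [1, 3]))  # 17.5
print(try_np_average([10, 20], [0, 0]))  # ZeroDivisionError
print(try_np_average([10, 20], [1]))     # an exception name (mismatched lengths)
```

Unlike robust_weighted_average, which returns 0.0 when the weights sum to zero, np.average raises an error in that case, so pick one behavior and apply it consistently across your code.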
Conclusion
To compute weighted averages in Python:
- Use numpy.average(values, weights=weights) for performance and simplicity in data science contexts.
- Use sum(v * w for v, w in zip(values, weights)) / sum(weights) for lightweight, dependency-free scripts.
- Always validate inputs to ensure that list lengths match and that the weights do not sum to zero.