Python NumPy: How to Calculate a Weighted Average
Weighted averages assign different levels of importance to different values, making them essential for tasks like grade calculations, financial portfolio analysis, and statistical modeling. Unlike a simple average where every data point contributes equally, a weighted average lets you control how much each value influences the final result.
In this guide, you will learn how to calculate weighted averages using NumPy's built-in np.average() function, handle multidimensional arrays, and apply weighted averages to real-world scenarios.
Basic Weighted Average with np.average()
The simplest way to compute a weighted average in NumPy is by using the np.average() function with the weights parameter:
import numpy as np
scores = np.array([90, 80, 70]) # Homework, Midterm, Final
weights = np.array([0.1, 0.3, 0.6]) # 10%, 30%, 60%
grade = np.average(scores, weights=weights)
print(f"Weighted grade: {grade}")
Output:
Weighted grade: 75.0
How the Formula Works
The weighted average formula is:
Weighted Average = Σ(value × weight) / Σ(weights)
= (90 × 0.1 + 80 × 0.3 + 70 × 0.6) / (0.1 + 0.3 + 0.6)
= (9 + 24 + 42) / 1.0
= 75.0
Each value is multiplied by its corresponding weight, and the sum of those products is divided by the total sum of the weights. This ensures that values with higher weights have a greater impact on the result.
Manual Calculation for Verification
Understanding the underlying formula is useful for debugging and verifying results. Here is how you can compute a weighted average manually using basic NumPy operations:
import numpy as np
scores = np.array([90, 80, 70])
weights = np.array([0.1, 0.3, 0.6])
# Manual approach
weighted_sum = np.sum(scores * weights)
weight_total = np.sum(weights)
manual_avg = weighted_sum / weight_total
print(f"Manual: {manual_avg}")
print(f"np.average: {np.average(scores, weights=weights)}")
Output:
Manual: 75.0
np.average: 75.0
Both approaches produce identical results. The manual method is helpful when you need to customize the calculation or integrate it into a larger pipeline.
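As one illustration of customizing the manual formula, the sketch below skips missing scores marked as NaN. The helper name nan_weighted_average and the choice to drop the weight of a missing score (rather than count it as zero) are assumptions made for this example, not something NumPy does for you.

```python
import numpy as np

def nan_weighted_average(values, weights):
    """Weighted average that ignores NaN values (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    valid = ~np.isnan(values)  # True where a score is present
    # Same formula as the manual approach, restricted to the valid entries
    return np.sum(values[valid] * weights[valid]) / np.sum(weights[valid])

scores = np.array([90, np.nan, 70])   # Midterm score is missing
weights = np.array([0.1, 0.3, 0.6])

adjusted = nan_weighted_average(scores, weights)
print(f"Average over available scores: {adjusted:.2f}")
```

Because the missing entry's weight is removed from the denominator as well, the remaining weights are effectively rescaled to cover the whole grade.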
Weights Do Not Need to Sum to 1
A common misconception is that weights must add up to 1. NumPy normalizes the weights automatically by dividing by their sum, so only the relative proportions between weights matter:
import numpy as np
values = np.array([10, 20, 30])
# Weights as percentages
weights1 = np.array([10, 30, 60]) # Sum = 100
# Weights as simple counts
weights2 = np.array([1, 3, 6]) # Sum = 10
# Both produce the same result
print(np.average(values, weights=weights1))
print(np.average(values, weights=weights2))
Output:
25.0
25.0
You can use any positive numbers as weights. NumPy divides by the sum of weights internally, so proportions are what matter, not absolute values.
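To make the normalization step concrete, here is a small sketch that rescales the weights by hand and compares the results. The variable names are illustrative only.

```python
import numpy as np

values = np.array([10, 20, 30])
raw_weights = np.array([1, 3, 6])

# Explicit normalization: divide by the total so the weights sum to 1
normalized = raw_weights / raw_weights.sum()

avg_raw = np.average(values, weights=raw_weights)
avg_normalized = np.average(values, weights=normalized)
# With normalized weights, the weighted average reduces to a dot product
avg_dot = np.dot(values, normalized)

print(avg_raw, avg_normalized, avg_dot)
```

All three values should agree up to floating-point rounding, which is why pre-normalizing weights before calling np.average() is optional.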
Weighted Average vs. Simple Average
To understand the effect of weighting, compare the weighted average with the simple (unweighted) average:
import numpy as np
scores = np.array([90, 80, 70])
weights = np.array([0.1, 0.3, 0.6])
# Simple (unweighted) average
simple_avg = np.mean(scores)
print(f"Simple average: {simple_avg}")
# Weighted average
weighted_avg = np.average(scores, weights=weights)
print(f"Weighted average: {weighted_avg}")
Output:
Simple average: 80.0
Weighted average: 75.0
Because the final exam score (70) carries 60% of the total weight, it pulls the weighted average down compared to the simple average. This is what makes weighted averages useful: the result reflects the importance you assign to each value instead of treating every value alike.
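The relationship also runs the other way: with equal weights, or with no weights at all, the weighted average should match the simple mean. A minimal check, using illustrative variable names:

```python
import numpy as np

scores = np.array([90, 80, 70])

equal_weights = np.ones(len(scores))      # Every score counts the same
avg_equal = np.average(scores, weights=equal_weights)
avg_default = np.average(scores)          # weights omitted
avg_mean = np.mean(scores)

print(avg_equal, avg_default, avg_mean)
```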
Weighted Average with 2D Arrays
NumPy supports weighted averages along specific axes of multidimensional arrays, which is especially useful when working with tabular data.
Weighted Average Along Rows
import numpy as np
# 3 students, 3 assignments each
data = np.array([
    [90, 85, 80],  # Student 1
    [70, 75, 80],  # Student 2
    [95, 90, 85],  # Student 3
])
weights = np.array([0.2, 0.3, 0.5]) # Assignment weights
# Weighted average per student (across columns)
student_grades = np.average(data, axis=1, weights=weights)
print(f"Student grades: {student_grades}")
Output:
Student grades: [83.5 76.5 88.5]
By setting axis=1, the function computes the weighted average across columns for each row (student).
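For 1-D weights, the row-wise calculation can be read as a matrix-vector product divided by the weight total. The sketch below writes that out by hand and compares it with np.average(); it is a verification aid, not a replacement for the built-in function.

```python
import numpy as np

data = np.array([
    [90, 85, 80],
    [70, 75, 80],
    [95, 90, 85],
])
weights = np.array([0.2, 0.3, 0.5])

# Each row dotted with the weights, then divided by the weight total
manual = data @ weights / weights.sum()
builtin = np.average(data, axis=1, weights=weights)

print(np.allclose(manual, builtin))
```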
Weighted Average Along Columns
import numpy as np
data = np.array([
    [100, 200],
    [150, 250],
    [200, 300],
])
# Weight by row (e.g., more recent time periods weighted higher)
row_weights = np.array([1, 2, 3])
# Weighted average per column
col_avgs = np.average(data, axis=0, weights=row_weights)
print(f"Column averages: {col_avgs}")
Output:
Column averages: [166.66666667 266.66666667]
Setting axis=0 computes the weighted average down each column, applying the row-level weights.
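np.average() also accepts a weights array with the same shape as the data, so every cell can carry its own weight. The per-cell weights below are made up for illustration.

```python
import numpy as np

data = np.array([
    [100, 200],
    [150, 250],
    [200, 300],
])
# Hypothetical per-cell weights, one for every entry of `data`
cell_weights = np.array([
    [1, 3],
    [2, 2],
    [3, 1],
])

col_avgs = np.average(data, axis=0, weights=cell_weights)
# The same computation written out by hand
manual = (data * cell_weights).sum(axis=0) / cell_weights.sum(axis=0)

print(col_avgs)
print(np.allclose(col_avgs, manual))
```

Here each column is normalized by its own weight total, so the two columns no longer share the same weighting scheme.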
Practical Examples
Grade Calculation
import numpy as np
def calculate_grade(homework, midterm, final):
    """Calculate weighted course grade."""
    scores = np.array([homework, midterm, final])
    weights = np.array([0.20, 0.30, 0.50])  # 20%, 30%, 50%
    return np.average(scores, weights=weights)
grade = calculate_grade(homework=85, midterm=78, final=92)
print(f"Final grade: {grade:.1f}")
Output:
Final grade: 86.4
Financial Portfolio Returns
import numpy as np
# Individual stock returns
returns = np.array([0.12, 0.08, -0.05, 0.15]) # 12%, 8%, -5%, 15%
# Portfolio allocation percentages
allocation = np.array([0.40, 0.30, 0.20, 0.10]) # 40%, 30%, 20%, 10%
portfolio_return = np.average(returns, weights=allocation)
print(f"Portfolio return: {portfolio_return:.2%}")
Output:
Portfolio return: 7.70%
Weighted Moving Average
A weighted moving average (WMA) gives more influence to recent data points, making it popular in time-series analysis and financial charting:
import numpy as np
def weighted_moving_average(data, window=3):
    """Calculate WMA where recent values have more weight."""
    weights = np.arange(1, window + 1)  # [1, 2, 3] for window=3
    result = []
    for i in range(window - 1, len(data)):
        window_data = data[i - window + 1:i + 1]
        wma = np.average(window_data, weights=weights)
        result.append(wma)
    return np.array(result)
prices = np.array([10, 12, 11, 13, 15, 14, 16])
wma = weighted_moving_average(prices, window=3)
print(f"WMA: {np.round(wma, 2)}")
Output:
WMA: [11.17 12.17 13.67 14.17 15.17]
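For longer series, the Python loop can be replaced by a convolution. The following is one possible loop-free sketch using np.convolve; the function name weighted_moving_average_conv is hypothetical, and the approach assumes the same linearly increasing weights as the loop version.

```python
import numpy as np

def weighted_moving_average_conv(data, window=3):
    """Loop-free WMA sketch using np.convolve (hypothetical helper)."""
    weights = np.arange(1, window + 1)
    # np.convolve flips its second argument, so reverse the weights first
    # to keep the largest weight on the most recent value in each window.
    weighted_sums = np.convolve(data, weights[::-1], mode="valid")
    return weighted_sums / weights.sum()

prices = np.array([10, 12, 11, 13, 15, 14, 16])
wma_conv = weighted_moving_average_conv(prices, window=3)
print(f"WMA (convolve): {np.round(wma_conv, 2)}")
```

With mode="valid", only windows that fit entirely inside the data are returned, so the output has len(data) - window + 1 entries, the same as the loop version.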
Survey Ratings with Confidence Weights
When survey respondents report different levels of confidence, you can weight their ratings accordingly:
import numpy as np
# Survey ratings on a 1-5 scale
ratings = np.array([5, 3, 4, 2, 5, 4])
# Respondent confidence (self-reported, 1-10 scale)
confidence = np.array([8, 5, 7, 3, 9, 6])
weighted_rating = np.average(ratings, weights=confidence)
simple_rating = np.mean(ratings)
print(f"Simple average: {simple_rating:.2f}")
print(f"Confidence-weighted: {weighted_rating:.2f}")
Output:
Simple average: 3.83
Confidence-weighted: 4.16
High-confidence positive ratings pull the weighted average upward. Whether this is a more reliable summary depends on how well self-reported confidence tracks the quality of each response.
Handling Edge Cases
Zero Weights
Setting a weight to zero effectively excludes that value from the calculation:
import numpy as np
values = np.array([10, 20, 30])
weights = np.array([1, 0, 1]) # Middle value excluded
avg = np.average(values, weights=weights)
print(avg)
Output:
20.0
Only the first and last values contribute to the result.
Negative Weights
NumPy does not prevent you from using negative weights, but the results can be misleading:
import numpy as np
values = np.array([10, 20, 30])
weights = np.array([1, -1, 1]) # Negative weight on the second value
avg = np.average(values, weights=weights)
print(avg)
Output:
20.0
NumPy accepts negative weights, but they rarely represent anything meaningful. In this example the weights sum to 1, so the result is (10 - 20 + 30) / 1 = 20.0. With other combinations the result can fall outside the range of the input values, and if the weights sum to zero NumPy raises a ZeroDivisionError. Make sure your weights represent actual importance, frequency, or allocation.
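If negative weights would indicate a bug in your data, you can guard against them before averaging. The wrapper below is a hypothetical helper, not part of NumPy; the decision to raise ValueError is an assumption made for this sketch.

```python
import numpy as np

def safe_weighted_average(values, weights):
    """Hypothetical wrapper: reject negative weights and a zero total."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")
    if weights.sum() == 0:
        raise ValueError("weights must not sum to zero")
    return np.average(values, weights=weights)

print(safe_weighted_average([10, 20, 30], [1, 0, 1]))

try:
    safe_weighted_average([10, 20, 30], [1, -1, 1])
except ValueError as e:
    print(f"Rejected: {e}")
```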
Mismatched Array Shapes
A common mistake is passing weight and value arrays of different lengths:
import numpy as np
values = np.array([10, 20, 30])
weights = np.array([1, 2]) # Only 2 weights for 3 values
try:
    avg = np.average(values, weights=weights)
except TypeError as e:
    print(f"Error: {e}")
Output:
Error: Axis must be specified when shapes of a and weights differ.
Always make sure the weights array matches the shape of the data array along the relevant axis.
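With 2D data, the "relevant axis" determines how many weights are needed. The sketch below uses a hypothetical helper, weights_fit, to check 1-D weights against a chosen axis before calling np.average().

```python
import numpy as np

def weights_fit(data, weights, axis):
    """Hypothetical check: do 1-D `weights` match `data` along `axis`?"""
    data = np.asarray(data)
    weights = np.asarray(weights)
    return weights.ndim == 1 and weights.shape[0] == data.shape[axis]

data = np.array([[100, 200], [150, 250], [200, 300]])  # shape (3, 2)
row_weights = np.array([1, 2, 3])                       # One weight per row

fits_rows = weights_fit(data, row_weights, axis=0)  # 3 weights, 3 rows
fits_cols = weights_fit(data, row_weights, axis=1)  # 3 weights, 2 columns
print(fits_rows, fits_cols)
```

Three weights fit axis=0 of this array (three rows) but not axis=1 (two columns), so the same weights array can be valid for one axis and invalid for the other.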
Returning the Sum of Weights
The returned=True parameter makes np.average() return both the weighted average and the total weight. This is useful for combining weighted averages from different datasets:
import numpy as np
values = np.array([10, 20, 30])
weights = np.array([2, 3, 5])
avg, weight_sum = np.average(values, weights=weights, returned=True)
print(f"Weighted average: {avg}")
print(f"Total weight: {weight_sum}")
Output:
Weighted average: 23.0
Total weight: 10.0
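As a sketch of that use case, two partial results can be merged by treating each average as a value and each weight total as its weight. The second dataset here is invented for illustration.

```python
import numpy as np

values_a = np.array([10, 20, 30])
weights_a = np.array([2, 3, 5])
values_b = np.array([40, 50])      # Hypothetical second dataset
weights_b = np.array([4, 6])

avg_a, total_a = np.average(values_a, weights=weights_a, returned=True)
avg_b, total_b = np.average(values_b, weights=weights_b, returned=True)

# Combine the partial averages, weighting each by its total weight
combined = np.average([avg_a, avg_b], weights=[total_a, total_b])

# Cross-check against averaging all the data in one call
direct = np.average(
    np.concatenate([values_a, values_b]),
    weights=np.concatenate([weights_a, weights_b]),
)
print(combined, direct)
```

This only requires keeping one average and one weight total per dataset, which is convenient when the raw values are processed in batches.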
Quick Reference
| Parameter | Description | Example |
|---|---|---|
| a | Input array | np.array([90, 80, 70]) |
| weights | Weight array (same shape as a, or 1-D with one weight per element along axis) | np.array([0.1, 0.3, 0.6]) |
| axis | Axis along which to average | axis=1 for one average per row |
| returned | Also return the sum of weights | returned=True |

| Function | Purpose |
|---|---|
| np.average(a, weights=w) | Weighted average |
| np.mean(a) | Simple (unweighted) average |
Use np.average(data, weights=weights) whenever your data points carry different levels of importance. NumPy handles the normalization for you, so you can pass any positive numbers as weights and focus on expressing the correct relative proportions.