Python NumPy: How to Calculate a Weighted Average

Weighted averages assign different levels of importance to different values, making them essential for tasks like grade calculations, financial portfolio analysis, and statistical modeling. Unlike a simple average where every data point contributes equally, a weighted average lets you control how much each value influences the final result.

In this guide, you will learn how to calculate weighted averages using NumPy's built-in np.average() function, handle multidimensional arrays, and apply weighted averages to real-world scenarios.

Basic Weighted Average with np.average()

The simplest way to compute a weighted average in NumPy is by using the np.average() function with the weights parameter:

import numpy as np

scores = np.array([90, 80, 70]) # Homework, Midterm, Final
weights = np.array([0.1, 0.3, 0.6]) # 10%, 30%, 60%

grade = np.average(scores, weights=weights)
print(f"Weighted grade: {grade}")

Output:

Weighted grade: 75.0

How the Formula Works

The weighted average formula is:

Weighted Average = Σ(value × weight) / Σ(weights)

= (90 × 0.1 + 80 × 0.3 + 70 × 0.6) / (0.1 + 0.3 + 0.6)
= (9 + 24 + 42) / 1.0
= 75.0

Each value is multiplied by its corresponding weight, and the sum of those products is divided by the total sum of the weights. This ensures that values with higher weights have a greater impact on the result.

Manual Calculation for Verification

Understanding the underlying formula is useful for debugging and verifying results. Here is how you can compute a weighted average manually using basic NumPy operations:

import numpy as np

scores = np.array([90, 80, 70])
weights = np.array([0.1, 0.3, 0.6])

# Manual approach
weighted_sum = np.sum(scores * weights)
weight_total = np.sum(weights)
manual_avg = weighted_sum / weight_total

print(f"Manual: {manual_avg}")
print(f"np.average: {np.average(scores, weights=weights)}")

Output:

Manual: 75.0
np.average: 75.0

Both approaches produce identical results. The manual method is helpful when you need to customize the calculation or integrate it into a larger pipeline.
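
The sum of products in the manual approach is a dot product, so the same calculation can also be sketched with np.dot. This is a minimal illustration that reuses the scores and weights from above; the variable name dot_avg is introduced here only for the example:

```python
import numpy as np

scores = np.array([90, 80, 70])
weights = np.array([0.1, 0.3, 0.6])

# Dot product = sum of the element-wise products
dot_avg = np.dot(scores, weights) / weights.sum()

# Compare with np.average, allowing for floating-point rounding
print(np.isclose(dot_avg, np.average(scores, weights=weights)))
```

This form can be convenient when the weighted sum is already part of a larger linear-algebra expression.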

Weights Do Not Need to Sum to 1

A common misconception is that weights must add up to 1. NumPy normalizes the weights automatically by dividing by their sum, so only the relative proportions between weights matter:

import numpy as np

values = np.array([10, 20, 30])

# Weights as percentages
weights1 = np.array([10, 30, 60]) # Sum = 100

# Weights as simple counts
weights2 = np.array([1, 3, 6]) # Sum = 10

# Both produce the same result
print(np.average(values, weights=weights1))
print(np.average(values, weights=weights2))

Output:

25.0
25.0

Tip: You can use any positive numbers as weights. NumPy divides by the sum of weights internally, so proportions are what matter, not absolute values.
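
To see the normalization explicitly, the sketch below rescales a set of raw weights so they sum to 1 and checks that the weighted average does not change. The names raw_weights and normalized are assumptions made for this illustration:

```python
import numpy as np

values = np.array([10, 20, 30])
raw_weights = np.array([1, 3, 6])

# Rescale so the weights sum to 1 (relative proportions are unchanged)
normalized = raw_weights / raw_weights.sum()

avg_raw = np.average(values, weights=raw_weights)
avg_normalized = np.average(values, weights=normalized)

print(normalized)
print(np.isclose(avg_raw, avg_normalized))
```

Normalizing by hand is therefore optional; it mainly helps when you want to display the weights as proportions.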

Weighted Average vs. Simple Average

To understand the effect of weighting, compare the weighted average with the simple (unweighted) average:

import numpy as np

scores = np.array([90, 80, 70])
weights = np.array([0.1, 0.3, 0.6])

# Simple (unweighted) average
simple_avg = np.mean(scores)
print(f"Simple average: {simple_avg}")

# Weighted average
weighted_avg = np.average(scores, weights=weights)
print(f"Weighted average: {weighted_avg}")

Output:

Simple average: 80.0
Weighted average: 75.0

Because the final exam score (70) carries 60% of the total weight, it pulls the weighted average down compared to the simple average. This behavior is exactly what makes weighted averages useful: they reflect the importance you assign to each value.

Weighted Average with 2D Arrays

NumPy supports weighted averages along specific axes of multidimensional arrays, which is especially useful when working with tabular data.

Weighted Average Along Rows

import numpy as np

# 3 students, 3 assignments each
data = np.array([
    [90, 85, 80],  # Student 1
    [70, 75, 80],  # Student 2
    [95, 90, 85]   # Student 3
])

weights = np.array([0.2, 0.3, 0.5]) # Assignment weights

# Weighted average per student (across columns)
student_grades = np.average(data, axis=1, weights=weights)
print(f"Student grades: {student_grades}")

Output:

Student grades: [83.5 76.5 88.5]

By setting axis=1, the function computes the weighted average across columns for each row (student).
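
For 1-D weights applied along axis=1, the row-wise calculation is equivalent to a matrix-vector product divided by the weight total. The following is a minimal sketch of that equivalence, with manual_grades as an illustrative name:

```python
import numpy as np

data = np.array([
    [90, 85, 80],
    [70, 75, 80],
    [95, 90, 85],
])
weights = np.array([0.2, 0.3, 0.5])

# Each row is dotted with the weights, then divided by the weight total
manual_grades = data @ weights / weights.sum()

print(np.allclose(manual_grades, np.average(data, axis=1, weights=weights)))
```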

Weighted Average Along Columns

import numpy as np

data = np.array([
    [100, 200],
    [150, 250],
    [200, 300]
])

# Weight by row (e.g., more recent time periods weighted higher)
row_weights = np.array([1, 2, 3])

# Weighted average per column
col_avgs = np.average(data, axis=0, weights=row_weights)
print(f"Column averages: {col_avgs}")

Output:

Column averages: [166.66666667 266.66666667]

Setting axis=0 computes the weighted average down each column, applying the row-level weights.
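
One way to picture how row-level weights are applied is to reproduce the column-wise result with broadcasting. This sketch assumes the same data and row_weights as above; manual_col_avgs is a name chosen for the example:

```python
import numpy as np

data = np.array([
    [100, 200],
    [150, 250],
    [200, 300],
])
row_weights = np.array([1, 2, 3])

# Reshape the weights to a column of shape (3, 1) so that each
# row weight is multiplied into every column of that row
weighted = data * row_weights[:, np.newaxis]
manual_col_avgs = weighted.sum(axis=0) / row_weights.sum()

print(np.allclose(manual_col_avgs, np.average(data, axis=0, weights=row_weights)))
```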

Practical Examples

Grade Calculation

import numpy as np

def calculate_grade(homework, midterm, final):
    """Calculate weighted course grade."""
    scores = np.array([homework, midterm, final])
    weights = np.array([0.20, 0.30, 0.50])  # 20%, 30%, 50%
    return np.average(scores, weights=weights)

grade = calculate_grade(homework=85, midterm=78, final=92)
print(f"Final grade: {grade:.1f}")

Output:

Final grade: 86.4

Financial Portfolio Returns

import numpy as np

# Individual stock returns
returns = np.array([0.12, 0.08, -0.05, 0.15]) # 12%, 8%, -5%, 15%

# Portfolio allocation percentages
allocation = np.array([0.40, 0.30, 0.20, 0.10]) # 40%, 30%, 20%, 10%

portfolio_return = np.average(returns, weights=allocation)
print(f"Portfolio return: {portfolio_return:.2%}")

Output:

Portfolio return: 7.70%

Weighted Moving Average

A weighted moving average (WMA) gives more influence to recent data points, making it popular in time-series analysis and financial charting:

import numpy as np

def weighted_moving_average(data, window=3):
    """Calculate WMA where recent values have more weight."""
    weights = np.arange(1, window + 1)  # [1, 2, 3] for window=3

    result = []
    for i in range(window - 1, len(data)):
        window_data = data[i - window + 1:i + 1]
        wma = np.average(window_data, weights=weights)
        result.append(wma)

    return np.array(result)

prices = np.array([10, 12, 11, 13, 15, 14, 16])
wma = weighted_moving_average(prices, window=3)
print(f"WMA: {np.round(wma, 2)}")

Output:

WMA: [11.17 12.17 13.67 14.17 15.17]
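
For longer series, the Python loop can be replaced by a single np.convolve call. The sketch below is one possible vectorized variant, not the only way to do it; wma_convolve is a hypothetical helper name, and it assumes the same linearly increasing weights as the loop version:

```python
import numpy as np

def wma_convolve(data, window=3):
    """Vectorized WMA sketch using linearly increasing weights."""
    weights = np.arange(1, window + 1)
    # np.convolve flips its second argument, so reverse the weights first
    # to keep the largest weight on the most recent value in each window.
    kernel = weights[::-1] / weights.sum()
    return np.convolve(data, kernel, mode="valid")

prices = np.array([10, 12, 11, 13, 15, 14, 16])
print(np.round(wma_convolve(prices, window=3), 2))
```

With mode="valid", only positions where the full window fits are returned, so the output has len(data) - window + 1 entries, the same as the loop version.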

Survey Ratings with Confidence Weights

When survey respondents report different levels of confidence, you can weight their ratings accordingly:

import numpy as np

# Survey ratings on a 1-5 scale
ratings = np.array([5, 3, 4, 2, 5, 4])

# Respondent confidence (self-reported, 1-10 scale)
confidence = np.array([8, 5, 7, 3, 9, 6])

weighted_rating = np.average(ratings, weights=confidence)
simple_rating = np.mean(ratings)

print(f"Simple average: {simple_rating:.2f}")
print(f"Confidence-weighted: {weighted_rating:.2f}")

Output:

Simple average: 3.83
Confidence-weighted: 4.16

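In this sample, the respondents who gave high ratings also reported high confidence, so the weighted average (4.16) comes out above the simple average (3.83). How much more reliable that figure is depends on how well self-reported confidence tracks actual reliability.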

Handling Edge Cases

Zero Weights

Setting a weight to zero effectively excludes that value from the calculation:

import numpy as np

values = np.array([10, 20, 30])
weights = np.array([1, 0, 1]) # Middle value excluded

avg = np.average(values, weights=weights)
print(avg)

Output:

20.0

Only the first and last values contribute to the result.
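
The sketch below checks that a zero weight gives the same result as dropping the value, and shows what happens in the limiting case where every weight is zero: np.average() raises a ZeroDivisionError because there is no total to divide by. The variable kept is introduced only for this illustration:

```python
import numpy as np

values = np.array([10, 20, 30])
weights = np.array([1, 0, 1])

# A zero weight behaves like removing the value before averaging
kept = values[weights > 0]
print(np.isclose(np.average(values, weights=weights), np.mean(kept)))

# If all weights are zero, the weight total is zero and NumPy raises an error
try:
    np.average(values, weights=np.zeros(3))
except ZeroDivisionError as e:
    print(f"Error: {e}")
```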

Negative Weights

NumPy does not prevent you from using negative weights, but the results can be misleading:

import numpy as np

values = np.array([10, 20, 30])
weights = np.array([1, -1, 1]) # Negative weight on the second value

avg = np.average(values, weights=weights)
print(avg)

Output:

20.0

Warning: NumPy accepts negative weights, but they are rarely meaningful. With a negative weight, the result is no longer guaranteed to lie between the smallest and largest value, and if the weights sum to zero NumPy raises a ZeroDivisionError. Make sure your weights represent actual importance, frequency, or allocation.
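
One concrete way negative weights can mislead is by producing an "average" that lies outside the range of the data. The sketch below demonstrates this and adds a small guard; check_weights is a hypothetical helper written for this example, not part of NumPy:

```python
import numpy as np

values = np.array([10, 20, 30])
weights = np.array([-1, 0, 2])  # Sums to 1, but one weight is negative

avg = np.average(values, weights=weights)
print(avg)
print(values.min() <= avg <= values.max())

def check_weights(weights):
    """Hypothetical guard: reject negative weights before averaging."""
    weights = np.asarray(weights)
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")
    return weights
```

Calling check_weights(weights) before np.average() turns a silent surprise into an explicit error.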

Mismatched Array Shapes

A common mistake is passing weight and value arrays of different lengths:

import numpy as np

values = np.array([10, 20, 30])
weights = np.array([1, 2]) # Only 2 weights for 3 values

try:
    avg = np.average(values, weights=weights)
except TypeError as e:
    print(f"Error: {e}")

Output:

Error: Axis must be specified when shapes of a and weights differ.

Always make sure the weights array matches the shape of the data array along the relevant axis.
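
With 2D data, "the relevant axis" means the axis passed to np.average(). A simple pre-check along these lines can catch the mismatch before the call; the arrays here are made up for the illustration:

```python
import numpy as np

data = np.array([
    [90, 85, 80],
    [70, 75, 80],
])
weights = np.array([0.5, 0.5])  # 2 weights, but each row has 3 values

axis = 1
# Compare the number of weights with the data's length along the chosen axis
if weights.shape[0] != data.shape[axis]:
    print(f"Expected {data.shape[axis]} weights along axis {axis}, "
          f"got {weights.shape[0]}")

# The same weights are valid along axis 0, which has length 2
print(np.average(data, axis=0, weights=weights))
```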

Returning the Sum of Weights

The returned=True parameter makes np.average() return both the weighted average and the total weight. This is useful for combining weighted averages from different datasets:

import numpy as np

values = np.array([10, 20, 30])
weights = np.array([2, 3, 5])

avg, weight_sum = np.average(values, weights=weights, returned=True)
print(f"Weighted average: {avg}")
print(f"Total weight: {weight_sum}")

Output:

Weighted average: 23.0
Total weight: 10.0
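
As a sketch of how the returned weight totals help when combining datasets, the example below averages two hypothetical batches separately and then merges the results. The batch values and weights are invented for the illustration:

```python
import numpy as np

# Two hypothetical batches of values, each with its own weights
values_a, weights_a = np.array([10, 20, 30]), np.array([2, 3, 5])
values_b, weights_b = np.array([40, 50]), np.array([4, 6])

avg_a, total_a = np.average(values_a, weights=weights_a, returned=True)
avg_b, total_b = np.average(values_b, weights=weights_b, returned=True)

# Weight each batch average by that batch's total weight
combined = np.average([avg_a, avg_b], weights=[total_a, total_b])

# Same result as averaging all values in a single pass
all_at_once = np.average(
    np.concatenate([values_a, values_b]),
    weights=np.concatenate([weights_a, weights_b]),
)
print(np.isclose(combined, all_at_once))
```

This works because each batch's weighted sum can be recovered as its average times its total weight, so the raw values do not need to be kept around.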

Quick Reference

Parameter | Description | Example
a | Input array | np.array([90, 80, 70])
weights | Weight array (same shape as a, or 1-D matching the length of a along axis) | np.array([0.1, 0.3, 0.6])
axis | Axis for multidimensional arrays | axis=1 for row-wise
returned | Also return sum of weights | returned=True

Function | Purpose
np.average(a, weights=w) | Weighted average
np.mean(a) | Simple (unweighted) average

Use np.average(data, weights=weights) whenever your data points carry different levels of importance. NumPy handles the normalization for you, so you can pass any positive numbers as weights and focus on expressing the correct relative proportions.