
Python Pandas: How to Calculate Weighted Average in Pandas

A weighted average assigns different levels of importance (weights) to each value before computing the average. Unlike a simple average where all values contribute equally, a weighted average gives more influence to values with higher weights.

Formula:

Weighted Average = Σ(value × weight) / Σ(weight)

Weighted averages are essential in finance (portfolio returns), education (grade calculation), surveys (population-weighted responses), and data analysis (importance-adjusted metrics).
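
As a quick sanity check before involving Pandas, the formula can be applied to plain Python lists (the numbers here are purely illustrative):

```python
# Illustrative values and weights
values = [80, 90]
weights = [1, 3]

# Σ(value × weight) / Σ(weight)
weighted = sum(v * w for v, w in zip(values, weights)) / sum(weights)
print(weighted)  # (80*1 + 90*3) / 4 = 87.5
```

The second value dominates because it carries three times the weight of the first.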

Basic Weighted Average Calculation

The simplest approach is to implement the formula directly using Pandas operations:

import pandas as pd

df = pd.DataFrame({
    'item': ['A', 'B', 'C', 'D'],
    'price': [100, 200, 150, 300],
    'quantity': [10, 5, 8, 3]
})

# Weighted average price (weighted by quantity)
weighted_avg = (df['price'] * df['quantity']).sum() / df['quantity'].sum()

print(f"Simple average: {df['price'].mean():.2f}")
print(f"Weighted average: {weighted_avg:.2f}")

Output:

Simple average: 187.50
Weighted average: 157.69

The weighted average (157.69) is lower than the simple average (187.50) because the cheaper items (A and C) have higher quantities (weights), pulling the average down.
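
To see that pull directly, you can inspect each item's share of the total weight (a quick diagnostic reusing the same DataFrame; the column name weight_share is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'item': ['A', 'B', 'C', 'D'],
    'price': [100, 200, 150, 300],
    'quantity': [10, 5, 8, 3]
})

# Each item's fraction of the total weight
df['weight_share'] = df['quantity'] / df['quantity'].sum()
print(df[['item', 'price', 'weight_share']])
```

Items A and C together carry roughly 69% of the total weight, so the weighted average lands much closer to their prices.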

Creating a Reusable Function

Wrap the calculation in a function for reusability:

import pandas as pd

def weighted_average(df, value_col, weight_col):
    """Calculate the weighted average of a column."""
    values = df[value_col]
    weights = df[weight_col]
    return (values * weights).sum() / weights.sum()

df = pd.DataFrame({
    'item': ['A', 'B', 'C', 'D'],
    'price': [100, 200, 150, 300],
    'quantity': [10, 5, 8, 3]
})

result = weighted_average(df, 'price', 'quantity')
print(f"Weighted average price: {result:.2f}")

Output:

Weighted average price: 157.69

Weighted Average by Group

A common use case is computing the weighted average for each group separately. Use groupby() with .apply():

import pandas as pd

def weighted_average(group, value_col, weight_col):
    values = group[value_col]
    weights = group[weight_col]
    return (values * weights).sum() / weights.sum()

df = pd.DataFrame({
    'store': ['North', 'North', 'North', 'South', 'South', 'South'],
    'product': ['Chocolate', 'Biscuit', 'IceCream',
                'Chocolate', 'Biscuit', 'IceCream'],
    'price': [90, 42, 68, 86, 48, 102],
    'quantity': [4, 6, 3, 3, 5, 5]
})

# Weighted average price per store
result = df.groupby('store').apply(
    weighted_average, 'price', 'quantity'
)

print(result)

Output:

store
North    62.769231
South    77.538462
dtype: float64

Weighted Average by Product

import pandas as pd

def weighted_average(group, value_col, weight_col):
    values = group[value_col]
    weights = group[weight_col]
    return (values * weights).sum() / weights.sum()

df = pd.DataFrame({
    'store': ['North', 'North', 'North', 'South', 'South', 'South'],
    'product': ['Chocolate', 'Biscuit', 'IceCream',
                'Chocolate', 'Biscuit', 'IceCream'],
    'price': [90, 42, 68, 86, 48, 102],
    'quantity': [4, 6, 3, 3, 5, 5]
})

# Weighted average price per product (weighted by quantity)
result = df.groupby('product').apply(
    weighted_average, 'price', 'quantity'
)

print(result)

Output:

product
Biscuit      44.727273
Chocolate    88.285714
IceCream     89.250000
dtype: float64

Using numpy.average() with Pandas

NumPy's average() function has a built-in weights parameter that makes this even simpler:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'item': ['A', 'B', 'C', 'D'],
    'price': [100, 200, 150, 300],
    'quantity': [10, 5, 8, 3]
})

# Single weighted average
result = np.average(df['price'], weights=df['quantity'])
print(f"Weighted average: {result:.2f}")

Output:

Weighted average: 157.69

With groupby()

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'store': ['North', 'North', 'South', 'South'],
    'price': [100, 150, 200, 120],
    'quantity': [10, 5, 3, 8]
})

result = df.groupby('store').apply(
    lambda g: np.average(g['price'], weights=g['quantity'])
)

print(result)

Output:

store
North    116.666667
South    141.818182
dtype: float64

Alternative: Using groupby() Without apply()

For better performance on large DataFrames, avoid apply() and use vectorized operations:

import pandas as pd

df = pd.DataFrame({
    'product': ['Chocolate', 'Chocolate', 'Chocolate',
                'Biscuit', 'Biscuit', 'Biscuit'],
    'price': [90, 50, 86, 87, 42, 48],
    'weight': [4, 2, 3, 5, 6, 5]
})

# Compute weighted sum and total weight per group
df['weighted_value'] = df['price'] * df['weight']

result = (
    df.groupby('product')['weighted_value'].sum()
    / df.groupby('product')['weight'].sum()
)

print(result)

Output:

product
Biscuit      57.937500
Chocolate    79.777778
dtype: float64

Performance tip

This vectorized approach is significantly faster than using apply() for large DataFrames because it avoids Python-level function calls for each group.
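
If you prefer not to mutate the DataFrame with a helper column, the same vectorized pattern can be written with assign(), which builds the intermediate column on a copy (a sketch reusing the data above; the column name wv is arbitrary):

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Chocolate', 'Chocolate', 'Chocolate',
                'Biscuit', 'Biscuit', 'Biscuit'],
    'price': [90, 50, 86, 87, 42, 48],
    'weight': [4, 2, 3, 5, 6, 5]
})

# assign() adds the weighted column on a copy, leaving df untouched
grouped = df.assign(wv=df['price'] * df['weight']).groupby('product')
result = grouped['wv'].sum() / grouped['weight'].sum()
print(result)
```

This produces the same per-group results while keeping the original DataFrame free of temporary columns.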

Practical Example: Student Grade Calculation

A classic weighted average use case is calculating a student's final grade where different assignments carry different weights:

import pandas as pd
import numpy as np

grades = pd.DataFrame({
    'student': ['Alice', 'Alice', 'Alice', 'Bob', 'Bob', 'Bob'],
    'category': ['Homework', 'Midterm', 'Final',
                 'Homework', 'Midterm', 'Final'],
    'score': [92, 85, 78, 75, 88, 95],
    'weight': [20, 30, 50, 20, 30, 50]
})

# Calculate weighted average grade per student
result = grades.groupby('student').apply(
    lambda g: np.average(g['score'], weights=g['weight'])
)

print("Weighted Final Grades:")
print(result)

# Compare with simple average
simple_avg = grades.groupby('student')['score'].mean()
print("\nSimple Average Grades:")
print(simple_avg)

Output:

Weighted Final Grades:
student
Alice    82.9
Bob      88.9
dtype: float64

Simple Average Grades:
student
Alice    85.0
Bob      86.0
Name: score, dtype: float64

Alice's weighted grade (82.9) is lower than her simple average (85.0) because her weakest score was on the Final, which carries the highest weight (50%). Bob's weighted grade is higher because he performed best on the Final.
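
Because the weights sum to 100, you can verify Alice's grade by hand with the weights expressed as fractions:

```python
# Alice: Homework 92 (20%), Midterm 85 (30%), Final 78 (50%)
alice = 0.20 * 92 + 0.30 * 85 + 0.50 * 78
print(alice)
```

The 50%-weighted Final score of 78 drags the result well below her simple average.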

Practical Example: Portfolio Weighted Return

import pandas as pd
import numpy as np

portfolio = pd.DataFrame({
    'stock': ['AAPL', 'GOOGL', 'MSFT', 'AMZN'],
    'return_pct': [12.5, 8.3, 15.2, -3.1],
    'investment': [50000, 30000, 40000, 20000]
})

# Portfolio weighted return
weighted_return = np.average(
    portfolio['return_pct'],
    weights=portfolio['investment']
)

simple_return = portfolio['return_pct'].mean()

print(f"Portfolio weighted return: {weighted_return:.2f}%")
print(f"Simple average return: {simple_return:.2f}%")

Output:

Portfolio weighted return: 10.14%
Simple average return: 8.22%
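
Note that only the relative sizes of the weights matter: dividing each investment by the total (so the weights become portfolio fractions summing to 1) yields the same result, since the scaling factor cancels in the formula. A quick check reusing the portfolio above:

```python
import pandas as pd
import numpy as np

portfolio = pd.DataFrame({
    'stock': ['AAPL', 'GOOGL', 'MSFT', 'AMZN'],
    'return_pct': [12.5, 8.3, 15.2, -3.1],
    'investment': [50000, 30000, 40000, 20000]
})

# Normalize investments into fractions of the total portfolio
fractions = portfolio['investment'] / portfolio['investment'].sum()

# Both weightings give the same weighted return
r1 = np.average(portfolio['return_pct'], weights=portfolio['investment'])
r2 = np.average(portfolio['return_pct'], weights=fractions)
print(f"{r1:.4f} == {r2:.4f}")
```

This is why you can use raw dollar amounts directly as weights without converting them to percentages first.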

Handling Edge Cases

Watch out for zero weights

If all weights in a group sum to zero, the denominator is zero and the division produces NaN (with a RuntimeWarning) instead of a meaningful result:

import pandas as pd

df = pd.DataFrame({
    'value': [10, 20],
    'weight': [0, 0]  # Both weights are zero!
})

# With Pandas/NumPy sums, 0/0 yields NaN rather than raising an error
result = (df['value'] * df['weight']).sum() / df['weight'].sum()
# RuntimeWarning: invalid value encountered in scalar divide

Fix: add a check

def safe_weighted_average(df, value_col, weight_col):
    weights_sum = df[weight_col].sum()
    if weights_sum == 0:
        return float('nan')
    return (df[value_col] * df[weight_col]).sum() / weights_sum
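
The guarded function works per group as well: a zero-weight group simply comes back as NaN. A minimal sketch (the group labels here are made up for illustration):

```python
import pandas as pd

def safe_weighted_average(df, value_col, weight_col):
    weights_sum = df[weight_col].sum()
    if weights_sum == 0:
        return float('nan')
    return (df[value_col] * df[weight_col]).sum() / weights_sum

df = pd.DataFrame({
    'group': ['ok', 'ok', 'bad', 'bad'],
    'value': [10, 20, 30, 40],
    'weight': [1, 3, 0, 0]  # 'bad' has zero total weight
})

# 'ok' gets a real weighted average; 'bad' comes back as NaN
result = df.groupby('group').apply(
    safe_weighted_average, 'value', 'weight'
)
print(result)
```

By contrast, numpy.average() raises ZeroDivisionError ("Weights sum to zero, can't be normalized") when the weights sum to zero, so the explicit check is worth keeping.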

Comparison of Methods

Method               | Grouped Support | Performance | Readability
---------------------|-----------------|-------------|------------
Manual formula       | With apply()    | ⭐⭐⭐       | ⭐⭐⭐⭐⭐
numpy.average()      | With apply()    | ⭐⭐⭐       | ⭐⭐⭐⭐⭐
Vectorized groupby() | Built-in        | ⭐⭐⭐⭐⭐    | ⭐⭐⭐

Conclusion

Calculating weighted averages in Pandas is straightforward using the formula (values × weights).sum() / weights.sum().

  • For single calculations, numpy.average() with the weights parameter is the cleanest approach.
  • For grouped weighted averages, combine groupby() with either apply() for clarity or vectorized operations for performance.
  • Always handle the edge case of zero total weights to prevent division errors.
  • Weighted averages give you a more accurate picture of your data by accounting for the relative importance of each observation.