How to Calculate Standard Deviation of a Matrix in Python

Understanding data variability is fundamental to statistical analysis and machine learning. The standard deviation of a matrix measures how far its values spread out from the mean, which supports feature engineering, outlier detection, and data normalization. In this guide, you'll learn how to compute the standard deviation across an entire matrix, per row, and per column using NumPy, Python's core numerical computing library.

Global Standard Deviation

By default, the np.std() function treats the entire matrix as a single list of numbers and returns a single "global" standard deviation value.

import numpy as np

# Define a 3x3 matrix
matrix = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])

# Calculate SD for the entire dataset
global_sd = np.std(matrix)

print(f"Global Standard Deviation: {global_sd:.2f}")

Output:

Global Standard Deviation: 25.82
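
To see where this number comes from, the result can be reproduced by hand from the population formula: the square root of the mean squared deviation from the mean. The following is a minimal sketch; the variable names (mean, variance, manual_sd) are illustrative and not part of NumPy.

```python
import numpy as np

matrix = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])

# Mean of all nine values
mean = matrix.mean()

# Population variance: average of the squared deviations from the mean
variance = ((matrix - mean) ** 2).mean()

# Standard deviation is the square root of the variance
manual_sd = np.sqrt(variance)

print(f"Manual Standard Deviation: {manual_sd:.2f}")  # 25.82
print(np.isclose(manual_sd, np.std(matrix)))          # True
```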

Row-wise and Column-wise Calculations

For feature engineering, you often need to measure the spread along a specific dimension rather than across the whole matrix. You can control this using the axis parameter:

axis=0: Calculates the standard deviation of each column
axis=1: Calculates the standard deviation of each row

import numpy as np

# Define a 3x3 matrix
matrix = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])

# Calculate dispersion for each vertical column
col_sd = np.std(matrix, axis=0)

# Calculate dispersion for each horizontal row
row_sd = np.std(matrix, axis=1)

print(f"Column Standard Deviations: {col_sd}")
print(f"Row Standard Deviations:    {row_sd}")

Output:

Column Standard Deviations: [24.49489743 24.49489743 24.49489743]
Row Standard Deviations:    [8.16496581 8.16496581 8.16496581]

Sample vs. Population Standard Deviation

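NumPy defaults to the population standard deviation formula, which divides by N, the number of values. If your matrix represents a sample (subset) of a larger dataset, set ddof=1 (Delta Degrees of Freedom) so that the divisor becomes N - 1. This applies Bessel's correction, which gives an unbiased estimate of the variance and a less biased estimate of the standard deviation: np.std(matrix, ddof=1).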
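
As a short sketch of the difference, the example below computes both versions for the same 3x3 matrix used earlier. The variable names population_sd and sample_sd are illustrative.

```python
import numpy as np

matrix = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])

# Default (ddof=0): divide the sum of squared deviations by N
population_sd = np.std(matrix)

# ddof=1: divide by N - 1 instead (Bessel's correction)
sample_sd = np.std(matrix, ddof=1)

print(f"Population SD: {population_sd:.2f}")  # 25.82
print(f"Sample SD:     {sample_sd:.2f}")      # 27.39
```

The ddof argument can be combined with axis, for example np.std(matrix, axis=0, ddof=1) for per-column sample standard deviations.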

Understanding the Axis Parameter

Thinking about axes can be confusing at first. Use this mental model:

Axis 0: The operation runs down the rows, collapsing them so that you get one result per column
Axis 1: The operation runs across the columns, collapsing them so that you get one result per row

Goal	Command	Output Shape
All elements	`np.std(matrix)`	Scalar
Per column	`np.std(matrix, axis=0)`	1D array (length = number of columns)
Per row	`np.std(matrix, axis=1)`	1D array (length = number of rows)
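
The output shapes in the table are easier to see with a non-square matrix, where the two axes give results of different lengths. The sketch below uses a made-up 2x3 array named data. It also shows the keepdims argument of np.std(), which keeps the collapsed axis as a dimension of size 1; that argument is not covered elsewhere in this guide and is included only to illustrate the shapes.

```python
import numpy as np

# A 2x3 matrix: 2 rows, 3 columns (illustrative values)
data = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 6.0, 8.0]
])

all_sd = np.std(data)           # every element -> scalar
col_sd = np.std(data, axis=0)   # collapse rows -> one value per column
row_sd = np.std(data, axis=1)   # collapse columns -> one value per row

print(all_sd.shape)  # ()
print(col_sd.shape)  # (3,)
print(row_sd.shape)  # (2,)

# keepdims=True keeps the collapsed axis with length 1
row_sd_2d = np.std(data, axis=1, keepdims=True)
print(row_sd_2d.shape)  # (2, 1)
```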

Handling Missing Data

If your matrix contains missing values (NaN), the standard np.std() function will return NaN. Use np.nanstd() instead to calculate the standard deviation while ignoring missing values:

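import numpy as np

# One value is missing (NaN); np.nanstd() skips it
matrix_with_nan = np.array([[1, 2, np.nan], [4, 5, 6]])
clean_sd = np.nanstd(matrix_with_nan)
print(f"Standard deviation ignoring NaN: {clean_sd:.2f}")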
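
np.nanstd() accepts the same axis parameter as np.std(), so missing values can be ignored per column or per row as well. The following is a minimal sketch using the same small matrix; the names col_sd and row_sd are illustrative. Note that a column or row with only one remaining value has a population standard deviation of 0.

```python
import numpy as np

matrix_with_nan = np.array([[1, 2, np.nan], [4, 5, 6]])

# np.std() propagates the missing value
print(np.isnan(np.std(matrix_with_nan)))  # True

# Per column: the third column has only one valid value (6), so its spread is 0
col_sd = np.nanstd(matrix_with_nan, axis=0)

# Per row: the first row is computed from [1, 2] only
row_sd = np.nanstd(matrix_with_nan, axis=1)

print(f"Column Standard Deviations: {col_sd}")
print(f"Row Standard Deviations:    {row_sd}")
```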

By mastering axis-aware calculations in NumPy, you can extract precise variability metrics from any multi-dimensional dataset with minimal code.
