Python NumPy: How to Generate Random Numbers from the Normal Distribution

The normal distribution (also called the Gaussian distribution or bell curve) is one of the most important probability distributions in statistics and data science. It describes data that clusters around a mean value, with values becoming less frequent the further they are from the center, creating the characteristic bell-shaped curve.

NumPy's numpy.random.normal() function makes it easy to generate random numbers that follow this distribution, which is essential for simulations, statistical testing, machine learning model initialization, and synthetic data generation.

Basic Usage

The simplest way to generate random numbers from a normal distribution is to call np.random.normal() with the desired size:

import numpy as np

# Generate 5 random numbers from the standard normal distribution
values = np.random.normal(size=5)
print(values)

Output (varies each run):

[ 0.86153799  1.87815094 -0.49538872  0.21429525 -0.11574017]

By default, this generates numbers from the standard normal distribution - a normal distribution with a mean of 0 and a standard deviation of 1.

Syntax and Parameters

numpy.random.normal(loc=0.0, scale=1.0, size=None)

Parameter | Type         | Default | Description
--------- | ------------ | ------- | -----------
loc       | float        | 0.0     | Mean (center) of the distribution
scale     | float        | 1.0     | Standard deviation (spread) of the distribution
size      | int or tuple | None    | Output shape. None returns a single value

Returns: A float (if size=None) or a NumPy array of random samples drawn from the specified normal distribution.
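
As a quick illustration of how size affects the return value, the sketch below calls the function with no size, an int, and a tuple. It also shows that a negative scale is rejected with a ValueError; the variable names are only illustrative.

```python
import numpy as np

# size=None (the default) returns a single scalar
single = np.random.normal(loc=0.0, scale=1.0)

# An int gives a 1-D array; a tuple gives an array of that shape
vector = np.random.normal(size=3)
grid = np.random.normal(size=(2, 3))

print(isinstance(single, float))  # True
print(vector.shape)               # (3,)
print(grid.shape)                 # (2, 3)

# scale is a standard deviation, so it must not be negative
try:
    np.random.normal(loc=0.0, scale=-1.0)
    rejected = False
except ValueError:
    rejected = True
print(rejected)                   # True
```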

Specifying Mean and Standard Deviation

Customize the distribution by setting loc (mean) and scale (standard deviation):

import numpy as np

# Mean = 50, Standard deviation = 10
values = np.random.normal(loc=50, scale=10, size=5)
print("Random values:", np.round(values, 2))

Output (varies each run):

Random values: [49.84 33.51 58.29 39.35 45.46]

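The generated numbers cluster around 50 (the mean). Over many draws, roughly two thirds of the values fall within 10 units (one standard deviation) of the mean, although a sample of only five can easily deviate from that proportion.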

Understanding loc and scale
  • loc (mean) - shifts the center of the distribution. Changing it from 0 to 100 shifts all values to center around 100.
  • scale (standard deviation) - controls the spread. A larger scale produces values that are more spread out; a smaller scale keeps values tighter around the mean.
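
One way to see what loc and scale do is to shift and stretch standard normal draws by hand and compare the result with a direct call. This is a minimal sketch; the values 100 and 15 are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw from the standard normal distribution (mean 0, std 1)
z = rng.standard_normal(100_000)

# Stretch by the desired std, then shift by the desired mean
manual = 100 + 15 * z

# Draw directly with loc and scale for comparison
direct = rng.normal(loc=100, scale=15, size=100_000)

print(f"manual: mean={manual.mean():.1f}, std={manual.std():.1f}")
print(f"direct: mean={direct.mean():.1f}, std={direct.std():.1f}")
```

Both lines should report a mean close to 100 and a standard deviation close to 15.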

In a normal distribution:

  • ~68% of values fall within 1 standard deviation of the mean
  • ~95% fall within 2 standard deviations
  • ~99.7% fall within 3 standard deviations
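
These percentages can be computed exactly rather than memorized. For any normal distribution, the probability of landing within k standard deviations of the mean is erf(k / sqrt(2)). The helper name prob_within below is made up for this sketch.

```python
import math

def prob_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Within {k} std: {prob_within(k) * 100:.2f}%")
# Within 1 std: 68.27%
# Within 2 std: 95.45%
# Within 3 std: 99.73%
```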

Generating Multi-Dimensional Arrays

Pass a tuple to size to generate 2D or higher-dimensional arrays:

2D Array (Matrix)

import numpy as np

# 3 rows × 4 columns of normally distributed values
matrix = np.random.normal(loc=0, scale=1, size=(3, 4))
print(np.round(matrix, 3))

Output (varies each run):

[[-0.344 -0.155  0.924 -0.905]
 [ 1.576  1.588  0.464  0.309]
 [ 0.403  0.172  1.531  1.002]]

3D Array

import numpy as np

# 2 layers × 3 rows × 4 columns
arr_3d = np.random.normal(loc=0, scale=1, size=(2, 3, 4))
print("Shape:", arr_3d.shape)

Output:

Shape: (2, 3, 4)
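
Besides single numbers, loc and scale also accept array-like values, which NumPy broadcasts against size. The sketch below uses that to give each column of a 2D array its own mean and standard deviation; the specific values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)

# One mean and one standard deviation per column
means = np.array([0.0, 50.0, 100.0])
stds = np.array([1.0, 5.0, 10.0])

# The length-3 arrays broadcast across the 10,000 rows
data = rng.normal(loc=means, scale=stds, size=(10_000, 3))

print("Shape:", data.shape)
print("Column means:", np.round(data.mean(axis=0), 1))
print("Column stds: ", np.round(data.std(axis=0), 1))
```

The column means and standard deviations should land close to the requested values.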

Reproducible Results with Random Seeds

For consistent, reproducible results (important in testing and research), set a random seed:

import numpy as np

np.random.seed(42)
values = np.random.normal(loc=0, scale=1, size=5)
print("Run 1:", np.round(values, 4))

# Same seed produces identical output
np.random.seed(42)
values = np.random.normal(loc=0, scale=1, size=5)
print("Run 2:", np.round(values, 4))

Output:

Run 1: [ 0.4967 -0.1383  0.6477  1.523  -0.2342]
Run 2: [ 0.4967 -0.1383  0.6477  1.523  -0.2342]

Use the modern Generator API

NumPy's newer Generator API (NumPy 1.17+) is recommended over the legacy np.random functions:

import numpy as np

rng = np.random.default_rng(seed=42)
values = rng.normal(loc=0, scale=1, size=5)
print(np.round(values, 4))

Output:

[ 0.3047 -1.04    0.7505  0.9406 -1.951 ]

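The Generator API uses the PCG64 bit generator by default, which has better statistical properties than the Mersenne Twister behind the legacy functions. It also keeps its state in an explicit object instead of hidden global state, so each thread or process can be given its own independent generator rather than sharing one.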
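
Because a Generator carries its own state, reproducibility and independence are both explicit. The sketch below shows two generators built from the same seed producing identical draws, and then uses SeedSequence.spawn() to derive separate streams, for example one per worker. The two-worker setup is an assumption made for illustration.

```python
import numpy as np

# Same seed, separate Generator objects: identical sequences
a = np.random.default_rng(42).normal(size=3)
b = np.random.default_rng(42).normal(size=3)
print(np.array_equal(a, b))  # True

# Derive child seeds from one parent seed for independent streams
children = np.random.SeedSequence(42).spawn(2)
rng_1, rng_2 = (np.random.default_rng(child) for child in children)

x = rng_1.normal(size=3)
y = rng_2.normal(size=3)
print(np.array_equal(x, y))  # False
```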

Verifying the Distribution

You can sanity-check that the generated numbers are consistent with a normal distribution by comparing sample statistics against their expected values:

import numpy as np

# Generate a large sample
rng = np.random.default_rng(42)
samples = rng.normal(loc=50, scale=10, size=100000)

print(f"Expected mean: 50, Sample mean: {samples.mean():.2f}")
print(f"Expected std: 10, Sample std: {samples.std():.2f}")
print(f"Min: {samples.min():.2f}")
print(f"Max: {samples.max():.2f}")

# Verify the 68-95-99.7 rule
within_1_std = np.sum(np.abs(samples - 50) <= 10) / len(samples) * 100
within_2_std = np.sum(np.abs(samples - 50) <= 20) / len(samples) * 100
within_3_std = np.sum(np.abs(samples - 50) <= 30) / len(samples) * 100

print(f"\nWithin 1 std: {within_1_std:.1f}% (expected ~68.3%)")
print(f"Within 2 std: {within_2_std:.1f}% (expected ~95.4%)")
print(f"Within 3 std: {within_3_std:.1f}% (expected ~99.7%)")

Output:

Expected mean: 50, Sample mean: 49.96
Expected std: 10, Sample std: 10.04
Min: 6.11
Max: 100.07

Within 1 std: 68.2% (expected ~68.3%)
Within 2 std: 95.4% (expected ~95.4%)
Within 3 std: 99.7% (expected ~99.7%)
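
Summary statistics check the center and spread, but not the shape. A complementary check, sketched below, compares a normalized histogram of the samples with the analytic normal density. The bin count and the range of four standard deviations are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma = 50, 10
samples = rng.normal(loc=mu, scale=sigma, size=100_000)

# Normalized histogram: bar heights approximate the probability density
density, edges = np.histogram(
    samples, bins=50, range=(mu - 4 * sigma, mu + 4 * sigma), density=True
)
centers = (edges[:-1] + edges[1:]) / 2

# Analytic normal density evaluated at the bin centers
pdf = np.exp(-0.5 * ((centers - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

max_gap = np.max(np.abs(density - pdf))
print(f"Peak of the PDF: {pdf.max():.4f}")
print(f"Largest gap between histogram and PDF: {max_gap:.5f}")
```

The largest gap should be small compared with the peak of the density, which is about 0.04 for a standard deviation of 10.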

Practical Examples

Simulating Test Scores

import numpy as np

rng = np.random.default_rng(42)

# Simulate test scores: mean = 75, std = 12
scores = rng.normal(loc=75, scale=12, size=10)

# Clip to valid range [0, 100]
scores = np.clip(scores, 0, 100)

print("Simulated test scores:", np.round(scores, 1))
print(f"Class average: {scores.mean():.1f}")

Output:

Simulated test scores: [78.7 62.5 84.  86.3 51.6 59.4 76.5 71.2 74.8 64.8]
Class average: 71.0
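
Note that np.clip() moves every out-of-range value onto the boundary, so a large sample ends up with a pile of scores at exactly 100. If that matters, one alternative is to redraw the out-of-range values instead. The helper sample_truncated_normal below is a hypothetical function written for this sketch, not part of NumPy.

```python
import numpy as np

def sample_truncated_normal(rng, loc, scale, low, high, size):
    """Hypothetical helper: redraw out-of-range values instead of clipping them."""
    out = rng.normal(loc=loc, scale=scale, size=size)
    bad = (out < low) | (out > high)
    while bad.any():
        # Replace only the values that fell outside [low, high]
        out[bad] = rng.normal(loc=loc, scale=scale, size=bad.sum())
        bad = (out < low) | (out > high)
    return out

rng = np.random.default_rng(42)
raw = rng.normal(loc=75, scale=12, size=100_000)

clipped = np.clip(raw, 0, 100)
resampled = sample_truncated_normal(rng, 75, 12, 0, 100, 100_000)

print(f"Clipped scores at exactly 100:   {(clipped == 100).mean() * 100:.1f}%")
print(f"Resampled scores at exactly 100: {(resampled == 100).mean() * 100:.1f}%")
```

Redrawing keeps every score inside the valid range without creating an artificial spike at the boundary, at the cost of a loop that runs until no out-of-range values remain.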

Generating Synthetic Height Data

import numpy as np

rng = np.random.default_rng(42)

# Adult male heights in cm: mean = 175.3, std = 7.1
heights = rng.normal(loc=175.3, scale=7.1, size=5)

for i, h in enumerate(heights, 1):
    print(f"Person {i}: {h:.1f} cm")

Output:

Person 1: 177.5 cm
Person 2: 167.9 cm
Person 3: 180.6 cm
Person 4: 182.0 cm
Person 5: 161.4 cm

Normal Distribution vs Other Distributions

Function                | Distribution      | Shape        | Use Case
----------------------- | ----------------- | ------------ | --------
np.random.normal()      | Normal (Gaussian) | Bell curve   | Measurements that cluster around a mean (heights, measurement error)
np.random.uniform()     | Uniform           | Flat         | Equal probability across a range
np.random.exponential() | Exponential       | Right-skewed | Time between events
np.random.poisson()     | Poisson           | Discrete     | Count of events in a fixed interval
np.random.binomial()    | Binomial          | Discrete     | Success/failure trials
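
To make the comparison concrete, the sketch below draws from each of these distributions through the Generator API and prints basic statistics. The parameter values (lam=3, n=10, p=0.5, and so on) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 100_000

draws = {
    "normal": rng.normal(loc=0, scale=1, size=n_samples),
    "uniform": rng.uniform(low=0, high=1, size=n_samples),
    "exponential": rng.exponential(scale=1.0, size=n_samples),
    "poisson": rng.poisson(lam=3, size=n_samples),
    "binomial": rng.binomial(n=10, p=0.5, size=n_samples),
}

for name, x in draws.items():
    print(f"{name:<12} mean={x.mean():6.3f}  std={x.std():6.3f}  dtype={x.dtype}")
```

The continuous distributions return floats, while poisson and binomial return integer counts.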

Conclusion

numpy.random.normal() is a versatile function for generating random numbers that follow the bell curve distribution.

Use the loc parameter to set the mean, scale to control the spread, and size to define the output shape.

For reproducible results, always set a random seed, preferably using the modern np.random.default_rng() API.

Whether you're running Monte Carlo simulations, initializing neural network weights, or generating synthetic datasets, understanding how to generate normally distributed data is a fundamental skill in scientific Python programming.