Python PyTorch: How to Perform Element-Wise Division on Tensors
Element-wise division is a core tensor operation in deep learning and scientific computing. It's used in tasks like normalizing feature maps, computing attention weights, scaling gradients, and implementing custom loss functions. PyTorch provides the torch.div() function to divide corresponding elements of two tensors, producing a new tensor with the results.
In this guide, you'll learn how to perform element-wise division using torch.div(), understand rounding modes, handle broadcasting with different tensor shapes, and avoid common pitfalls like division by zero.
Understanding torch.div()
The torch.div() function divides each element of the first tensor (dividend) by the corresponding element of the second tensor (divisor):
```python
torch.div(input, other, *, rounding_mode=None, out=None)
```
Parameters:
| Parameter | Description |
|---|---|
| input | The dividend tensor |
| other | The divisor tensor or scalar |
| rounding_mode | Optional rounding: None (true division), 'trunc' (truncation), or 'floor' (floor division) |
| out | Optional pre-allocated output tensor |
Returns: A new tensor containing the element-wise quotient.
Dividing Two 1D Tensors
The simplest case divides corresponding elements of two tensors with the same shape:
```python
import torch

A = torch.tensor([10.0, 25.0, 30.0, -15.0])
B = torch.tensor([2.0, -5.0, 10.0, 3.0])

result = torch.div(A, B)

print("Tensor A:", A)
print("Tensor B:", B)
print("A / B: ", result)
```
Output:
```
Tensor A: tensor([ 10.,  25.,  30., -15.])
Tensor B: tensor([ 2., -5., 10.,  3.])
A / B:  tensor([ 5., -5.,  3., -5.])
```
You can also use the / operator, which produces identical results:
```python
result = A / B
print("A / B:", result)
```
Output:
```
A / B: tensor([ 5., -5.,  3., -5.])
```
Dividing Two 2D Tensors
Element-wise division works the same way with multi-dimensional tensors - each element at position [i][j] in the dividend is divided by the element at the same position in the divisor:
```python
import torch

a = torch.tensor([[10.0, 20.0],
                  [30.0, 40.0]])
b = torch.tensor([[2.0, 5.0],
                  [6.0, 8.0]])

result = torch.div(a, b)

print("Tensor a:")
print(a)
print("\nTensor b:")
print(b)
print("\nResult (a / b):")
print(result)
```
Output:
```
Tensor a:
tensor([[10., 20.],
        [30., 40.]])

Tensor b:
tensor([[2., 5.],
        [6., 8.]])

Result (a / b):
tensor([[5., 4.],
        [5., 5.]])
```
Understanding Rounding Modes
By default, torch.div() performs true division (returns floating-point results). You can control the rounding behavior with the rounding_mode parameter:
| Rounding Mode | Behavior | Example: -7 / 2 |
|---|---|---|
| None (default) | True division, no rounding | -3.5 |
| 'trunc' | Rounds toward zero (like C integer division) | -3.0 |
| 'floor' | Rounds toward negative infinity (like Python //) | -4.0 |
```python
import torch

a = torch.tensor([7.0, -7.0, 7.0, -7.0])
b = torch.tensor([2.0, 2.0, -2.0, -2.0])

true_div = torch.div(a, b)
trunc_div = torch.div(a, b, rounding_mode='trunc')
floor_div = torch.div(a, b, rounding_mode='floor')

print("a: ", a)
print("b: ", b)
print("True div: ", true_div)
print("Trunc div: ", trunc_div)
print("Floor div: ", floor_div)
```
Output:
```
a:  tensor([ 7., -7.,  7., -7.])
b:  tensor([ 2.,  2., -2., -2.])
True div:  tensor([ 3.5000, -3.5000, -3.5000,  3.5000])
Trunc div:  tensor([ 3., -3., -3.,  3.])
Floor div:  tensor([ 3., -4., -4.,  3.])
```
- None (default): Use for general-purpose division where you need precise floating-point results.
- 'trunc': Use when you want integer-like division that rounds toward zero, matching C/C++ behavior.
- 'floor': Use when you want Python-style floor division (//), which always rounds down.
Notice the difference with negative numbers: trunc rounds -3.5 to -3, while floor rounds it to -4.
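Rounding modes also affect the output dtype when both inputs are integer tensors: true division promotes the result to floating point, while 'trunc' and 'floor' keep the integer dtype. A quick sketch:

```python
import torch

a = torch.tensor([7, -7])
b = torch.tensor([2, 2])

# True division promotes integer inputs to floating point
print(torch.div(a, b))        # tensor([ 3.5000, -3.5000])
print(torch.div(a, b).dtype)  # torch.float32

# 'trunc' and 'floor' preserve the integer dtype
print(torch.div(a, b, rounding_mode='trunc'))        # tensor([ 3, -3])
print(torch.div(a, b, rounding_mode='floor'))        # tensor([ 3, -4])
print(torch.div(a, b, rounding_mode='trunc').dtype)  # torch.int64
```

This is why a rounding mode is required if you want the old integer-division behavior on integer tensors.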
Broadcasting: Dividing Tensors of Different Shapes
PyTorch supports broadcasting, allowing division between tensors of different dimensions. The smaller tensor is automatically expanded to match the larger one:
3D Tensor Divided by a 1D Tensor
```python
import torch

# 3D tensor with shape (2, 2, 3)
a = torch.tensor([[[6.0, 12.0, 18.0],
                   [24.0, 30.0, 36.0]],
                  [[9.0, 15.0, 21.0],
                   [27.0, 33.0, 39.0]]])

# 1D tensor with shape (3,)
b = torch.tensor([3.0, 6.0, 9.0])

result = torch.div(a, b)

print("Tensor a shape:", a.shape)
print("Tensor b shape:", b.shape)
print("Result shape: ", result.shape)
print("\nResult:")
print(result)
```
Output:
```
Tensor a shape: torch.Size([2, 2, 3])
Tensor b shape: torch.Size([3])
Result shape:  torch.Size([2, 2, 3])

Result:
tensor([[[2.0000, 2.0000, 2.0000],
         [8.0000, 5.0000, 4.0000]],

        [[3.0000, 2.5000, 2.3333],
         [9.0000, 5.5000, 4.3333]]])
```
The 1D tensor [3, 6, 9] is broadcast across all rows and all "layers" of the 3D tensor.
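Broadcasting is not unconditional: shapes are compared from the trailing dimension backward, and each pair of dimensions must be equal or one of them must be 1. When neither holds, PyTorch raises a RuntimeError. A minimal sketch of the failure case and its fix:

```python
import torch

a = torch.ones(2, 3)  # shape (2, 3)
b = torch.ones(2)     # shape (2,) - trailing dims are 3 vs 2

try:
    torch.div(a, b)
except RuntimeError as e:
    print("Broadcast failed:", e)

# Reshaping b to a (2, 1) column vector makes the shapes compatible
print(torch.div(a, b.reshape(2, 1)).shape)  # torch.Size([2, 3])
```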
2D Tensor Divided by a Column Vector
```python
import torch

a = torch.tensor([[10.0, 20.0, 30.0],
                  [40.0, 50.0, 60.0]])

# Column vector with shape (2, 1)
b = torch.tensor([[10.0],
                  [20.0]])

result = torch.div(a, b)

print("Tensor a shape:", a.shape)
print("Tensor b shape:", b.shape)
print("\nResult:")
print(result)
```
Output:
```
Tensor a shape: torch.Size([2, 3])
Tensor b shape: torch.Size([2, 1])

Result:
tensor([[1.0000, 2.0000, 3.0000],
        [2.0000, 2.5000, 3.0000]])
```
Dividing by a Scalar
You can divide every element of a tensor by a single number:
```python
import torch

tens = torch.tensor([[10.0, 20.0],
                     [30.0, 40.0]])

result = torch.div(tens, 5)

print("Tensor / 5:")
print(result)
```
Output:
```
Tensor / 5:
tensor([[2., 4.],
        [6., 8.]])
```
Common Mistake: Division by Zero
Dividing by zero doesn't raise an error in PyTorch - instead, it produces inf (infinity) or nan (not a number), which can silently corrupt your calculations downstream:
```python
import torch

a = torch.tensor([10.0, 0.0, -5.0])
b = torch.tensor([0.0, 0.0, 0.0])

# ❌ No error, but results are problematic
result = torch.div(a, b)
print("Result:", result)
```
Output:
```
Result: tensor([inf, nan, -inf])
```
These values propagate through subsequent operations and can ruin model training. Always check for zeros before dividing:
```python
import torch

a = torch.tensor([10.0, 0.0, -5.0])
b = torch.tensor([0.0, 2.0, 0.0])

# ✅ Replace zeros with a small epsilon to avoid inf/nan
epsilon = 1e-8
safe_b = torch.where(b == 0, torch.tensor(epsilon), b)

result = torch.div(a, safe_b)
print("Safe result:", result)
```
Output:
```
Safe result: tensor([ 1.0000e+09,  0.0000e+00, -5.0000e+08])
```
In neural network training, division by zero often occurs when normalizing by a standard deviation that happens to be zero, or when computing ratios with sparse tensors. Always add a small epsilon value (e.g., 1e-8) to the divisor to prevent inf and nan values from corrupting your model.
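In practice, the epsilon is often added directly to the denominator rather than swapped in for exact zeros, as in feature standardization. A minimal sketch of that pattern (the value 1e-8 is a typical choice, not a fixed rule):

```python
import torch

def standardize(x: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Zero-mean, unit-variance scaling that stays finite
    even when the standard deviation is exactly zero."""
    return (x - x.mean()) / (x.std() + eps)

constant = torch.tensor([5.0, 5.0, 5.0])  # std is exactly 0
result = standardize(constant)

print(result)                        # tensor([0., 0., 0.]) - no nan
print(torch.isfinite(result).all())  # tensor(True)
```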
In-Place Division with div_()
To modify a tensor in place without allocating new memory, use the div_() method:
```python
import torch

tens = torch.tensor([10.0, 20.0, 30.0])
print("Before:", tens)

tens.div_(5)
print("After: ", tens)
```
Output:
```
Before: tensor([10., 20., 30.])
After:  tensor([2., 4., 6.])
```
Avoid in-place operations on tensors that participate in autograd computation graphs. They can cause errors during backpropagation:
```python
x = torch.tensor([10.0], requires_grad=True)
x.div_(2)  # RuntimeError: a leaf Variable that requires grad is being used in an in-place operation
```
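When gradients are needed, the out-of-place version is the safe alternative: it records the division in the autograd graph instead of overwriting the leaf tensor. A sketch:

```python
import torch

x = torch.tensor([10.0], requires_grad=True)

# ✅ Out-of-place division creates a new tensor and is autograd-safe
y = x / 2
y.backward()

print(x.grad)  # tensor([0.5000]) - d(x/2)/dx = 0.5
```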
torch.div() vs / Operator
| Feature | torch.div() | / Operator |
|---|---|---|
| Basic division | ✅ | ✅ |
| rounding_mode parameter | ✅ | ❌ |
| out parameter (pre-allocated output) | ✅ | ❌ |
| Readability | Explicit | Concise |
Use torch.div() when you need rounding modes or an output tensor. Use / for simple, readable division.
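For example, the out parameter lets you reuse a pre-allocated tensor across repeated divisions instead of allocating a fresh result each time. A quick sketch:

```python
import torch

a = torch.tensor([10.0, 20.0, 30.0])
b = torch.tensor([2.0, 4.0, 5.0])

# Pre-allocate the output buffer once
result = torch.empty(3)

# torch.div writes into `result` instead of allocating a new tensor
torch.div(a, b, out=result)
print(result)  # tensor([5., 5., 6.])
```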
Summary
Element-wise division in PyTorch is performed with torch.div() or the / operator. Both support tensors of any dimension and leverage broadcasting for tensors with different shapes.
Use the rounding_mode parameter ('trunc' or 'floor') when you need integer-like division behavior.
Always guard against division by zero by adding a small epsilon to the divisor, and avoid in-place operations (div_()) on tensors that require gradient computation.