Python PyTorch: How to Perform Element-Wise Addition on Tensors
Element-wise addition is one of the most fundamental operations in deep learning and numerical computing. When working with PyTorch tensors, you frequently need to add corresponding elements of two tensors together - whether for combining feature maps in neural networks, applying residual connections, or performing basic mathematical transformations.
In this guide, you'll learn how to perform element-wise addition on PyTorch tensors using torch.add() and the + operator, handle tensors of different dimensions through broadcasting, and add scalar values to tensors.
Understanding torch.add()
PyTorch provides the torch.add() function for element-wise addition:
torch.add(input, other, *, alpha=1, out=None)
Parameters:
| Parameter | Description |
|---|---|
| input | The first input tensor |
| other | The second tensor or a scalar value to add |
| alpha | Optional multiplier applied to other before adding (default: 1) |
| out | Optional output tensor to store the result |
Returns: A new tensor containing the element-wise sum.
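The out parameter from the table above can be worth a quick look: it writes the sum into a tensor you have already allocated instead of creating a new one. A minimal sketch:

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([10.0, 20.0, 30.0])

# Pre-allocate an output tensor and write the sum into it,
# avoiding a fresh allocation on each call.
buf = torch.empty(3)
torch.add(a, b, out=buf)
print(buf)  # tensor([11., 22., 33.])
```

Reusing a pre-allocated buffer like this can reduce memory churn in tight loops, though for most code the plain return value is simpler.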
Adding Two 1D Tensors
The simplest case is adding two tensors of the same shape. Each element in the first tensor is added to the corresponding element in the second:
import torch
tens_1 = torch.Tensor([10, 20, 30, 40, 50])
tens_2 = torch.Tensor([1, 2, 3, 4, 5])
result = torch.add(tens_1, tens_2)
print("Tensor 1:", tens_1)
print("Tensor 2:", tens_2)
print("Result: ", result)
Output:
Tensor 1: tensor([10., 20., 30., 40., 50.])
Tensor 2: tensor([1., 2., 3., 4., 5.])
Result: tensor([11., 22., 33., 44., 55.])
You can also use the + operator, which produces identical results:
result = tens_1 + tens_2
print("Result:", result)
Output:
Result: tensor([11., 22., 33., 44., 55.])
Adding Two 2D Tensors
Element-wise addition works the same way with multi-dimensional tensors - each element at position [i][j] in the first tensor is added to the element at the same position in the second:
import torch
tens_1 = torch.Tensor([[1, 2], [3, 4]])
tens_2 = torch.Tensor([[10, 20], [30, 40]])
result = torch.add(tens_1, tens_2)
print("Tensor 1:")
print(tens_1)
print("\nTensor 2:")
print(tens_2)
print("\nResult:")
print(result)
Output:
Tensor 1:
tensor([[1., 2.],
        [3., 4.]])
Tensor 2:
tensor([[10., 20.],
        [30., 40.]])
Result:
tensor([[11., 22.],
        [33., 44.]])
Adding Tensors of Different Dimensions (Broadcasting)
PyTorch supports broadcasting, which allows element-wise operations on tensors with different shapes. When dimensions don't match, PyTorch automatically expands the smaller tensor to match the larger one, following specific broadcasting rules.
import torch
# 1D tensor with shape (2,)
tens_1 = torch.Tensor([1, 2])
# 2D tensor with shape (2, 2)
tens_2 = torch.Tensor([[10, 20], [30, 40]])
result = torch.add(tens_1, tens_2)
print("Tensor 1 (1D):", tens_1)
print("Tensor 1 shape:", tens_1.shape)
print("\nTensor 2 (2D):")
print(tens_2)
print("Tensor 2 shape:", tens_2.shape)
print("\nResult:")
print(result)
print("Result shape:", result.shape)
Output:
Tensor 1 (1D): tensor([1., 2.])
Tensor 1 shape: torch.Size([2])
Tensor 2 (2D):
tensor([[10., 20.],
        [30., 40.]])
Tensor 2 shape: torch.Size([2, 2])
Result:
tensor([[11., 22.],
        [31., 42.]])
Result shape: torch.Size([2, 2])
Here, the 1D tensor [1, 2] is broadcast (repeated) across each row of the 2D tensor before addition. The result takes the broadcast shape, which in this case matches the 2D tensor; in general, each result dimension is the larger of the two tensors' sizes along that dimension.
Broadcasting follows these rules:
- If tensors have different numbers of dimensions, the shape of the smaller tensor is padded with 1s on the left.
- Dimensions with size 1 are stretched to match the other tensor's size.
- If two dimensions are different and neither is 1, broadcasting fails with an error.
For example, shapes (3,) and (2, 3) are compatible, but (3,) and (2, 4) are not.
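The rules above can be verified directly. This short sketch checks the compatible and incompatible shape pairs just mentioned, and also shows that singleton dimensions stretch in both directions:

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])  # shape (3,)
b = torch.ones(2, 3)               # shape (2, 3)

# (3,) is padded on the left to (1, 3), then stretched to (2, 3).
result = a + b
print(result.shape)  # torch.Size([2, 3])

# Singleton dimensions broadcast in both directions:
col = torch.tensor([[1.0], [2.0]])            # shape (2, 1)
row = torch.tensor([[10.0, 20.0, 30.0, 40.0]])  # shape (1, 4)
print((col + row).shape)  # torch.Size([2, 4])

# Incompatible: sizes 3 and 4 differ and neither is 1.
try:
    a + torch.ones(2, 4)
except RuntimeError as e:
    print("Broadcasting failed:", e)
```

Note that the (2, 1) + (1, 4) case produces a (2, 4) result that matches neither input's shape, which is why the result shape is best described as the combined broadcast shape.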
Adding a Scalar to a Tensor
You can add a single number (scalar) to every element of a tensor:
import torch
tens_1d = torch.Tensor([1, 2, 3, 4])
tens_2d = torch.Tensor([[10, 20], [30, 40]])
result_1d = torch.add(tens_1d, 10)
result_2d = torch.add(tens_2d, 20)
print("1D tensor + 10:", result_1d)
print("2D tensor + 20:")
print(result_2d)
Output:
1D tensor + 10: tensor([11., 12., 13., 14.])
2D tensor + 20:
tensor([[30., 40.],
        [50., 60.]])
The scalar is broadcast to match the tensor's shape, effectively adding it to every element.
Using the alpha Parameter for Scaled Addition
The alpha parameter multiplies the second tensor before adding it. This computes input + alpha * other, which is useful for operations like weighted sums and gradient updates:
import torch
tens_1 = torch.Tensor([10, 20, 30])
tens_2 = torch.Tensor([1, 2, 3])
# Computes: tens_1 + 5 * tens_2
result = torch.add(tens_1, tens_2, alpha=5)
print("Result (input + 5 * other):", result)
Output:
Result (input + 5 * other): tensor([15., 30., 45.])
This is equivalent to tens_1 + 5 * tens_2 but can be more efficient since it avoids creating an intermediate tensor.
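The equivalence is easy to confirm. A quick sketch comparing the fused call against the two-step expression:

```python
import torch

tens_1 = torch.tensor([10.0, 20.0, 30.0])
tens_2 = torch.tensor([1.0, 2.0, 3.0])

# Fused scaled addition: input + alpha * other in one call
fused = torch.add(tens_1, tens_2, alpha=5)

# Two-step equivalent: allocates an intermediate tensor for 5 * tens_2
two_step = tens_1 + 5 * tens_2

print(torch.equal(fused, two_step))  # True
```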
In-Place Addition with add_()
If you want to modify a tensor in place (without creating a new tensor), use the add_() method. In-place operations in PyTorch are denoted by a trailing underscore:
import torch
tens = torch.Tensor([1, 2, 3, 4])
print("Before:", tens)
tens.add_(10)
print("After in-place addition:", tens)
Output:
Before: tensor([1., 2., 3., 4.])
After in-place addition: tensor([11., 12., 13., 14.])
In-place operations overwrite the original data and can cause issues with PyTorch's autograd system. Avoid using in-place operations on tensors that require gradient computation:
# ❌ This will cause an error during backpropagation
x = torch.tensor([1.0, 2.0], requires_grad=True)
x.add_(5) # RuntimeError: a leaf Variable that requires grad is being used in an in-place operation
Use the standard torch.add() or + operator instead when working with tensors in a computation graph.
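For completeness, here is a small sketch of both gradient-safe patterns: out-of-place addition inside the computation graph, and in-place addition wrapped in torch.no_grad() (the usual pattern for manual parameter updates):

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# Out-of-place addition keeps the autograd graph intact:
y = x + 5
y.sum().backward()
print(x.grad)  # tensor([1., 1.])

# To mutate a leaf tensor without autograd tracking
# (e.g. a manual gradient-descent step), use torch.no_grad():
with torch.no_grad():
    x.add_(5)
print(x)  # x is now tensor([6., 7.], ...)
```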
Common Mistake: Shape Mismatch
Element-wise addition requires tensors to have compatible shapes. If the shapes can't be broadcast, PyTorch raises an error:
import torch
tens_1 = torch.Tensor([1, 2, 3]) # Shape: (3,)
tens_2 = torch.Tensor([10, 20, 30, 40]) # Shape: (4,)
# ❌ Shapes (3,) and (4,) are not compatible
try:
    result = torch.add(tens_1, tens_2)
except RuntimeError as e:
    print(f"Error: {e}")
Output:
Error: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 0
Fix: Ensure your tensors have matching or broadcastable shapes before performing addition. Reshape or pad tensors as needed:
import torch
tens_1 = torch.Tensor([1, 2, 3])
tens_2 = torch.Tensor([10, 20, 30, 40])
# ✅ Option 1: Slice to matching size
result = torch.add(tens_1, tens_2[:3])
print("Sliced result:", result)
# ✅ Option 2: Pad the smaller tensor
tens_1_padded = torch.nn.functional.pad(tens_1, (0, 1), value=0)
result = torch.add(tens_1_padded, tens_2)
print("Padded result:", result)
Output:
Sliced result: tensor([11., 22., 33.])
Padded result: tensor([11., 22., 33., 40.])
torch.add() vs + Operator
| Feature | torch.add() | + Operator |
|---|---|---|
| Basic addition | ✅ | ✅ |
| alpha parameter (scaled addition) | ✅ | ❌ |
| out parameter (pre-allocated output) | ✅ | ❌ |
| Readability | Explicit | Concise |
Use torch.add() when you need the alpha or out parameters. Use + for simple, readable addition.
Summary
Element-wise addition in PyTorch is straightforward with either torch.add() or the + operator.
Both support tensors of any dimension and leverage broadcasting to handle tensors of different shapes automatically.
Use the alpha parameter for scaled addition, avoid in-place operations (add_()) on tensors that require gradients, and always verify that your tensor shapes are compatible to prevent runtime errors.
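As a closing illustration, the residual connections mentioned at the start are just element-wise addition in practice. A minimal sketch (the ResidualBlock class and its single Linear layer are illustrative, not a standard PyTorch module):

```python
import torch
import torch.nn as nn

# A minimal residual block: output = x + f(x), where f is a small
# sub-network. The + is ordinary element-wise tensor addition,
# valid because x and f(x) share the same shape.
class ResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Linear(dim, dim)

    def forward(self, x):
        return x + self.f(x)

block = ResidualBlock(4)
x = torch.randn(2, 4)
out = block(x)
print(out.shape)  # torch.Size([2, 4])
```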