
Python PyTorch: How to Use a DataLoader in PyTorch

When training deep learning models, loading an entire dataset into memory at once is often impractical - datasets can be gigabytes in size, and processing them sequentially is slow. PyTorch's DataLoader solves both problems by automatically batching, shuffling, and parallelizing the data loading process.

This guide explains how to create custom datasets, configure DataLoaders, and use them effectively in training loops.

What a DataLoader Does

A DataLoader wraps a dataset and provides an iterable that yields batches of data. Instead of manually slicing your data into batches and shuffling between epochs, the DataLoader handles this automatically:

Full Dataset (10,000 samples)
↓ DataLoader(batch_size=32, shuffle=True)

Epoch 1: [Batch 1: 32 samples] → [Batch 2: 32 samples] → ... → [Batch 313: 16 samples]
Epoch 2: [Batch 1: 32 samples (different order)] → ...
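The batch counts in the diagram above follow from ceiling division; a quick sketch (plain Python, no PyTorch required) makes the arithmetic explicit:

```python
import math

num_samples = 10_000
batch_size = 32

# Without drop_last, the final partial batch is kept: ceil(10000 / 32) = 313
num_batches = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (num_batches - 1) * batch_size

print(num_batches)      # 313
print(last_batch_size)  # 16
```

With drop_last=True the partial batch is discarded instead, giving floor(10000 / 32) = 312 full batches.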

DataLoader Syntax

from torch.utils.data import DataLoader

DataLoader(
    dataset,
    batch_size=1,
    shuffle=False,
    num_workers=0,
    drop_last=False,
    pin_memory=False
)
| Parameter | Description | Default |
| --- | --- | --- |
| dataset | The dataset to load (required) | Required |
| batch_size | Number of samples per batch | 1 |
| shuffle | Whether to randomize order each epoch | False |
| num_workers | Number of subprocesses for parallel loading | 0 (main process) |
| drop_last | Drop the last incomplete batch if the dataset isn't evenly divisible | False |
| pin_memory | Copy tensors to CUDA pinned memory for faster GPU transfer | False |

Creating a Custom Dataset

To use a DataLoader, you first need a dataset. Custom datasets extend torch.utils.data.Dataset and must implement two methods:

  • __len__() - returns the total number of samples
  • __getitem__(index) - returns a single sample at the given index
import torch
from torch.utils.data import Dataset, DataLoader


class NumberDataset(Dataset):
    """A simple dataset containing numbers 0 to 99."""

    def __init__(self):
        self.data = list(range(100))

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index]


dataset = NumberDataset()
print(f"Dataset size: {len(dataset)}")
print(f"Sample at index 5: {dataset[5]}")

Output:

Dataset size: 100
Sample at index 5: 5

Using the DataLoader

Wrap the dataset in a DataLoader and iterate over it to get batches:

import torch
from torch.utils.data import Dataset, DataLoader


class NumberDataset(Dataset):
    def __init__(self):
        self.data = list(range(100))

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index]


dataset = NumberDataset()
dataloader = DataLoader(dataset, batch_size=10, shuffle=True)

# Print the first 3 batches
for i, batch in enumerate(dataloader):
    if i >= 3:
        break
    print(f"Batch {i}: {batch}")

print(f"\nTotal batches: {len(dataloader)}")

Output (varies due to shuffling):

Batch 0: tensor([56, 84, 42,  4, 66, 27, 99, 18, 20, 89])
Batch 1: tensor([ 7, 30, 74, 57, 10,  6, 28, 77,  0, 50])
Batch 2: tensor([32, 22, 73, 97, 26, 98, 85, 17, 8, 16])

Total batches: 10

The DataLoader automatically divides the 100 samples into 10 batches of 10, shuffled randomly.

Dataset with Features and Labels

Most real-world datasets have both input features and target labels. Return them as a tuple from __getitem__():

import torch
from torch.utils.data import Dataset, DataLoader


class RegressionDataset(Dataset):
    """Simple dataset with input features and target values."""

    def __init__(self, num_samples=200):
        self.X = torch.randn(num_samples, 3)  # 3 features
        self.y = self.X.sum(dim=1) + torch.randn(num_samples) * 0.1  # Target with noise

    def __len__(self):
        return len(self.X)

    def __getitem__(self, index):
        return self.X[index], self.y[index]


dataset = RegressionDataset(200)
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# Get one batch
features, targets = next(iter(dataloader))
print(f"Features shape: {features.shape}")
print(f"Targets shape: {targets.shape}")

Output:

Features shape: torch.Size([32, 3])
Targets shape: torch.Size([32])

Each batch contains 32 samples with 3 features each, along with their corresponding target values.

Using DataLoader with Built-in Datasets

PyTorch and related libraries provide many ready-to-use datasets. Here's how to use a DataLoader with TensorDataset:

import torch
from torch.utils.data import DataLoader, TensorDataset

# Create tensors from data
features = torch.randn(150, 4) # 150 samples, 4 features
labels = torch.randint(0, 3, (150,)) # 3 classes

# Wrap in TensorDataset
dataset = TensorDataset(features, labels)

# Create DataLoader
dataloader = DataLoader(dataset, batch_size=16, shuffle=True)

for batch_features, batch_labels in dataloader:
    print(f"Features: {batch_features.shape}, Labels: {batch_labels.shape}")
    break  # Just show the first batch

Output:

Features: torch.Size([16, 4]), Labels: torch.Size([16])
tip

TensorDataset is a convenient wrapper when your data is already in tensor form. It automatically pairs corresponding elements from multiple tensors when batching.
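Two properties of TensorDataset are worth seeing directly: indexing returns one element from each wrapped tensor as a tuple, and all wrapped tensors must share the same first dimension. A small sketch:

```python
import torch
from torch.utils.data import TensorDataset

features = torch.randn(150, 4)
labels = torch.randint(0, 3, (150,))

dataset = TensorDataset(features, labels)

# Indexing yields a tuple: (features[i], labels[i])
x, y = dataset[0]
print(x.shape)  # torch.Size([4])

# Tensors with mismatched first dimensions are rejected at construction
try:
    TensorDataset(features, labels[:100])
except AssertionError:
    print("Size mismatch between tensors")
```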

Using DataLoader in a Training Loop

Here's how DataLoaders are typically used in a model training loop:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Create a simple dataset
X = torch.randn(500, 10)
y = torch.randint(0, 2, (500,)).float()
dataset = TensorDataset(X, y)

# Split into train and validation sets
train_size = int(0.8 * len(dataset))
val_size = len(dataset) - train_size
train_dataset, val_dataset = torch.utils.data.random_split(dataset, [train_size, val_size])

# Create DataLoaders
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)

# Simple model
model = nn.Linear(10, 1)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Training loop
for epoch in range(3):
    model.train()
    total_loss = 0

    for batch_X, batch_y in train_loader:
        optimizer.zero_grad()
        predictions = model(batch_X).squeeze()
        loss = criterion(predictions, batch_y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    avg_loss = total_loss / len(train_loader)
    print(f"Epoch {epoch + 1}, Average Loss: {avg_loss:.4f}")

Output:

Epoch 1, Average Loss: 0.7414
Epoch 2, Average Loss: 0.7155
Epoch 3, Average Loss: 0.7027
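The example creates a val_loader but never uses it. A typical evaluation pass switches the model to eval mode and disables gradient tracking; here is a minimal, self-contained sketch (using smaller tensors than the example above):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Minimal setup mirroring the validation split above
X = torch.randn(100, 10)
y = torch.randint(0, 2, (100,)).float()
val_loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=False)

model = nn.Linear(10, 1)
criterion = nn.BCEWithLogitsLoss()

model.eval()                  # layers like dropout/batchnorm switch to eval behavior
val_loss = 0.0
with torch.no_grad():         # no gradients needed: faster, uses less memory
    for batch_X, batch_y in val_loader:
        predictions = model(batch_X).squeeze()
        val_loss += criterion(predictions, batch_y).item()

print(f"Validation loss: {val_loss / len(val_loader):.4f}")
```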

Key DataLoader Configuration Options

Shuffling

Always shuffle training data to prevent the model from learning the order of samples:

# Training: shuffle to randomize order each epoch
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

# Validation/Testing: no need to shuffle
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)

Parallel Data Loading with num_workers

Speed up data loading by using multiple subprocesses:

dataloader = DataLoader(dataset, batch_size=32, num_workers=4)
warning

On Windows and macOS, Python's default multiprocessing start method is spawn, which re-imports your script in each worker process. Create multi-worker DataLoaders inside an if __name__ == '__main__': block so the workers don't recursively re-execute the DataLoader setup.
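On spawn-based platforms, the guard pattern looks like this (a sketch, with a throwaway TensorDataset standing in for real data):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def main():
    dataset = TensorDataset(torch.randn(100, 4), torch.randint(0, 2, (100,)))
    # Worker subprocesses re-import this module on spawn platforms; the
    # __main__ guard below keeps them from re-running this setup code.
    loader = DataLoader(dataset, batch_size=16, num_workers=2)
    for batch_X, batch_y in loader:
        pass  # training/processing happens here


if __name__ == "__main__":
    main()
```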

Dropping the Last Incomplete Batch

If your dataset size isn't evenly divisible by the batch size, the last batch will be smaller. Use drop_last=True to discard it:

import torch
from torch.utils.data import Dataset, DataLoader


class NumberDataset(Dataset):
    def __init__(self):
        self.data = list(range(100))

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index]


dataset = NumberDataset() # 100 samples
dataloader = DataLoader(dataset, batch_size=32, drop_last=True)
print(f"Batches: {len(dataloader)}") # 3 batches of 32, last 4 samples dropped

Output:

Batches: 3
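For comparison, the two settings side by side (using TensorDataset for brevity rather than the custom class above):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(100))  # 100 samples

kept = DataLoader(dataset, batch_size=32)                    # keeps the partial batch
dropped = DataLoader(dataset, batch_size=32, drop_last=True)  # discards it

print(len(kept))     # 4  (32, 32, 32, 4)
print(len(dropped))  # 3  (the trailing 4 samples are skipped each epoch)
```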

GPU Memory Optimization with pin_memory

When training on GPU, enable pin_memory for faster host-to-device transfers:

dataloader = DataLoader(dataset, batch_size=32, pin_memory=True)

for batch_X, batch_y in dataloader:
    batch_X = batch_X.to('cuda', non_blocking=True)
    batch_y = batch_y.to('cuda', non_blocking=True)
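Pinned memory only pays off for CPU-to-GPU copies, and hard-coding 'cuda' fails on CPU-only machines. A device-agnostic sketch that gates both on availability:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

dataset = TensorDataset(torch.randn(64, 4), torch.randint(0, 2, (64,)))
# Only pin memory when a GPU will actually consume the batches
loader = DataLoader(dataset, batch_size=16, pin_memory=(device.type == "cuda"))

for batch_X, batch_y in loader:
    # non_blocking=True overlaps the copy with computation when memory is pinned
    batch_X = batch_X.to(device, non_blocking=True)
    batch_y = batch_y.to(device, non_blocking=True)
```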

Common Mistake: Forgetting to Set shuffle=True for Training

Training without shuffling can cause the model to learn patterns from the data ordering rather than the data itself:

# WRONG: no shuffling during training, model may learn order patterns
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=False)

# CORRECT: always shuffle training data
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
info

Always set shuffle=True for training DataLoaders. For validation and test DataLoaders, shuffling is unnecessary since you're only evaluating, not learning from the data order.
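Shuffling makes runs non-deterministic, which complicates debugging. If you need reproducible shuffle order, DataLoader accepts a seeded torch.Generator; a sketch:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(10))

# A seeded generator makes the shuffle order identical across runs
g = torch.Generator()
g.manual_seed(0)
loader = DataLoader(dataset, batch_size=5, shuffle=True, generator=g)

for (batch,) in loader:
    print(batch)  # same order every time the script runs
```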

Quick Reference

| Configuration | Training | Validation/Testing |
| --- | --- | --- |
| shuffle | True | False |
| batch_size | 16–128 (experiment) | Same or larger |
| num_workers | 2–8 (depends on CPU) | Same as training |
| drop_last | Optional (True for BatchNorm stability) | False |
| pin_memory | True (if using GPU) | True (if using GPU) |

The DataLoader is the backbone of efficient data handling in PyTorch. By configuring batch size, shuffling, and parallel workers, you can significantly speed up training while keeping memory usage under control. Combined with a well-structured Dataset class, it provides a clean and scalable pipeline for feeding data to your models.