What data structure should I use in Python? Dataclass vs NamedTuple vs Class

Choosing the right data structure for storing related values reduces boilerplate code and improves clarity. Python offers several options, each with distinct characteristics.

Quick Comparison

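| Feature | Regular Class | NamedTuple | Dataclass |
| --- | --- | --- | --- |
| Mutability | Mutable | Immutable | Mutable (configurable) |
| Boilerplate | High | Low | Low |
| Default values | Manual | Limited | Easy |
| Type hints | Optional | Required | Required |
| Inheritance | Full support | Limited | Full support |
| Hashable | By identity (lost if __eq__ is defined without __hash__) | Yes | Configurable (unhashable by default, hashable with frozen=True) |
| Memory | Higher | Lower | Same as a regular class (lower with slots=True) |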

Regular Class

Traditional classes require manual implementation of common methods:

class User:
    def __init__(self, id: int, name: str, email: str = ""):
        self.id = id
        self.name = name
        self.email = email

    def __repr__(self):
        # !r quotes the string fields so the output matches the comment below
        return f"User(id={self.id}, name={self.name!r}, email={self.email!r})"

    def __eq__(self, other):
        if not isinstance(other, User):
            return NotImplemented
        return (self.id, self.name, self.email) == (other.id, other.name, other.email)

user = User(1, "Alice", "alice@example.com")
print(user)  # User(id=1, name='Alice', email='alice@example.com')

Note: Best for objects with significant behavior beyond data storage.

NamedTuple

Immutable, memory-efficient records with tuple compatibility:

from typing import NamedTuple

class User(NamedTuple):
    id: int
    name: str
    email: str = ""

user = User(1, "Alice", "alice@example.com")

# Tuple-like access
print(user[0])    # 1
print(user.name)  # Alice

# Unpacking works
user_id, name, email = user

# Immutable - this raises an error
# user.name = "Bob"  # AttributeError

Tip: NamedTuples are hashable by default (as long as every field value is itself hashable), making them suitable as dictionary keys or set members.
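As a small illustration of the tip above, the sketch below uses a NamedTuple as a dictionary key and derives a changed copy with `_replace`. The `GridCell` class and the `visits` dictionary are hypothetical names chosen only for this example.

```python
from typing import NamedTuple

class GridCell(NamedTuple):  # hypothetical record, for illustration only
    row: int
    col: int

# Instances hash by value, so equal cells map to the same dictionary entry
visits = {GridCell(0, 0): 1}
visits[GridCell(0, 0)] += 1

# _replace returns a modified copy; the original is left untouched
origin = GridCell(0, 0)
moved = origin._replace(col=3)

print(visits[origin])  # 2
print(moved)           # GridCell(row=0, col=3)
```

Because instances cannot be modified in place, `_replace` is the usual way to derive an updated record.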

Dataclass

The modern standard for data containers with sensible defaults:

from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str
    email: str = ""

user = User(1, "Alice")
user.email = "alice@example.com" # Mutable by default

print(user) # User(id=1, name='Alice', email='alice@example.com')

The @dataclass decorator automatically generates the __init__, __repr__, and __eq__ methods.
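A quick way to see the generated methods at work is to compare two instances built from the same values. This is a minimal sketch; the `Product` class and its fields are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Product:  # hypothetical example class
    sku: str
    price: float = 0.0

a = Product("A-1", 9.99)
b = Product("A-1", 9.99)

print(a == b)   # True: the generated __eq__ compares field values
print(a is b)   # False: they are still two distinct objects
print(repr(a))  # Product(sku='A-1', price=9.99)
```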

Dataclass Configuration Options

Customize behavior with decorator parameters:

from dataclasses import dataclass, field

@dataclass(frozen=True)  # Makes it immutable
class Point:
    x: float
    y: float

@dataclass(order=True)  # Adds comparison methods
class Priority:
    level: int
    task: str

@dataclass
class Config:
    name: str
    values: list = field(default_factory=list)  # Mutable default

    def __post_init__(self):
        # Called after __init__
        self.name = self.name.upper()

Available options:

  • frozen=True: Makes instances immutable and, with the default eq=True, hashable
  • order=True: Generates __lt__, __le__, __gt__, __ge__
  • slots=True (Python 3.10+): Uses __slots__ for memory efficiency
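The effect of the first two options can be checked directly. The sketch below reuses the `Point` and `Priority` classes from this section; the task names are placeholder values.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    x: float
    y: float

@dataclass(order=True)
class Priority:
    level: int
    task: str

p = Point(1.0, 2.0)
try:
    p.x = 5.0  # assignment to a frozen instance is rejected
except FrozenInstanceError:
    print("Point is read-only")

# Frozen instances are hashable, so duplicates collapse in a set
unique_points = {Point(1.0, 2.0), Point(1.0, 2.0)}
print(len(unique_points))  # 1

# order=True compares instances field by field, like tuples
tasks = sorted([Priority(2, "deploy"), Priority(1, "test")])
print([t.task for t in tasks])  # ['test', 'deploy']
```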

Default Values with Mutable Types

NamedTuple and dataclass handle mutable defaults differently:

from dataclasses import dataclass, field
from typing import NamedTuple

# Dataclass - use field() for mutable defaults
@dataclass
class Team:
    name: str
    members: list = field(default_factory=list)

# NamedTuple - there is no default_factory, so use an immutable default
class TeamRecord(NamedTuple):
    name: str
    members: tuple = ()

Warning: Never use mutable default values like members: list = [] in dataclasses. The decorator rejects list, dict, and set defaults with a ValueError when the class is defined, because every instance would otherwise share the same object. Always use field(default_factory=list).
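The following sketch shows both halves of that warning: a bare list default is refused at class definition time, while `default_factory` gives each instance its own list. `BrokenTeam` and `error_message` are hypothetical names used only to demonstrate the failure.

```python
from dataclasses import dataclass, field

error_message = None
try:
    @dataclass
    class BrokenTeam:  # hypothetical class with a bare mutable default
        name: str
        members: list = []
except ValueError as exc:
    error_message = str(exc)
    print(f"Rejected at class definition: {error_message}")

@dataclass
class Team:
    name: str
    members: list = field(default_factory=list)

red = Team("Red")
blue = Team("Blue")
red.members.append("Alice")

print(red.members)   # ['Alice']
print(blue.members)  # []
```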

Inheritance

Dataclasses support inheritance naturally:

from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

@dataclass
class Employee(Person):
    employee_id: str
    department: str = "General"

emp = Employee("Alice", 30, "E001", "Engineering")
print(emp)  # Employee(name='Alice', age=30, employee_id='E001', department='Engineering')

Note: NamedTuple inheritance is more limited and less intuitive: a subclass of a NamedTuple class can add methods, but annotations declared in the subclass do not become new tuple fields.
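To make the NamedTuple limitation concrete, here is a minimal sketch of what happens when a subclass declares an extra annotated attribute. The `label` method is a hypothetical helper added for the example.

```python
from typing import NamedTuple

class Person(NamedTuple):
    name: str
    age: int

class Employee(Person):
    # This annotation becomes a plain class attribute, NOT a tuple field
    department: str = "General"

    def label(self) -> str:
        return f"{self.name} ({self.department})"

emp = Employee("Alice", 30)
print(emp._fields)  # ('name', 'age')
print(len(emp))     # 2
print(emp.label())  # Alice (General)
```

The subclass still takes only the two original fields in its constructor, which is why a dataclass is usually the easier choice when a hierarchy needs to add data.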

Performance Comparison

from dataclasses import dataclass
from typing import NamedTuple
import sys

class UserClass:
    def __init__(self, id, name):
        self.id = id
        self.name = name

class UserTuple(NamedTuple):
    id: int
    name: str

@dataclass
class UserData:
    id: int
    name: str

@dataclass(slots=True)
class UserSlots:
    id: int
    name: str

# Memory comparison
instances = [
    UserClass(1, "A"),
    UserTuple(1, "A"),
    UserData(1, "A"),
    UserSlots(1, "A"),
]

for obj in instances:
    print(f"{type(obj).__name__}: {sys.getsizeof(obj)} bytes")

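Example output (exact sizes vary by Python version and platform):

UserClass: 56 bytes
UserTuple: 56 bytes
UserData: 56 bytes
UserSlots: 48 bytes

Note: sys.getsizeof counts only the object itself and does not follow references, so the attribute dictionary that UserClass and UserData instances may carry is not included and their real footprint is larger than shown. NamedTuple and dataclasses with slots=True avoid that dictionary and are the most compact options.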

Conversion Between Types

from dataclasses import dataclass, asdict, astuple
from typing import NamedTuple

@dataclass
class User:
    id: int
    name: str

user = User(1, "Alice")

# To dictionary
user_dict = asdict(user) # {'id': 1, 'name': 'Alice'}

# To tuple
user_tuple = astuple(user) # (1, 'Alice')

# From dictionary
user_from_dict = User(**user_dict)
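Converting to and from a NamedTuple follows the same pattern, using its built-in `_asdict()` method. This is a sketch only; `UserRecord` is a hypothetical NamedTuple counterpart of the `User` dataclass, assumed to have the same fields in the same order.

```python
from dataclasses import dataclass, astuple
from typing import NamedTuple

@dataclass
class User:
    id: int
    name: str

class UserRecord(NamedTuple):  # hypothetical NamedTuple counterpart of User
    id: int
    name: str

user = User(1, "Alice")

# Dataclass -> NamedTuple: field order matches, so positional unpacking works
record = UserRecord(*astuple(user))

# NamedTuple -> dictionary
record_dict = record._asdict()

# NamedTuple -> dataclass
user_again = User(**record_dict)

print(record)              # UserRecord(id=1, name='Alice')
print(record_dict)         # {'id': 1, 'name': 'Alice'}
print(user_again == user)  # True
```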

When to Use Each

Use Regular Class When:

  • Object has significant behavior (methods)
  • Complex initialization logic required
  • Need full control over all aspects

Use NamedTuple When:

  • Immutability is required
  • Need tuple unpacking or indexing
  • Memory efficiency is critical
  • Using as dictionary keys

Use Dataclass When:

  • Primary purpose is storing data
  • Need mutable instances (default) or immutable (frozen=True)
  • Want automatic method generation
  • Need default values and type hints

For example:

from dataclasses import dataclass
from typing import NamedTuple

# NamedTuple: Coordinates that shouldn't change
class Point(NamedTuple):
    x: float
    y: float

# Dataclass: Configurable settings
@dataclass
class Settings:
    theme: str = "dark"
    font_size: int = 12

# Regular class: Complex behavior
class DatabaseConnection:
    def __init__(self, host, port):
        self.host = host
        self.port = port
        self._connection = None

    def connect(self):
        # Connection logic
        pass

Note: Python 3.10+ dataclasses with slots=True approach NamedTuple memory efficiency while retaining mutability and easier inheritance.

Default to dataclass for most data-holding needs. Choose NamedTuple when immutability and tuple compatibility are essential. Reserve regular classes for objects with complex behavior.