Python SQLAlchemy: How to Define Column and Data Types in SQLAlchemy in Python
Choosing the correct data types for your database columns is essential for maintaining data integrity and optimizing query performance. SQLAlchemy offers a wide array of types that map Python objects to SQL storage, and the modern SQLAlchemy 2.0 Declarative style makes these mappings safer and more expressive through Python type hints.
This guide covers the most common column types, explains how to use the modern Mapped and mapped_column syntax, and walks through practical patterns like nullable fields, server-side defaults, and common mistakes to avoid.
Common Core Types
SQLAlchemy provides a set of generic types that work across all major databases, including PostgreSQL, MySQL, and SQLite. These types automatically translate to the appropriate SQL type for your specific database engine:
| Type | Python Type | SQL Equivalent | Typical Use Case |
|---|---|---|---|
Integer | int | INTEGER | Primary keys, counters, quantities |
String(n) | str | VARCHAR(n) | Names, emails, short text with a length limit |
Text | str | TEXT | Descriptions, blog posts, unlimited-length text |
Numeric(p, s) | Decimal | NUMERIC(p, s) | Currency, precise calculations |
Float | float | FLOAT | Scientific measurements, approximate values |
Boolean | bool | BOOLEAN | Flags, toggles, true/false states |
DateTime | datetime | DATETIME / TIMESTAMP | Timestamps, scheduling |
Date | date | DATE | Birthdays, calendar dates |
LargeBinary | bytes | BLOB | Files, images, binary data |
Modern 2.0 Syntax: Mapped and mapped_column
SQLAlchemy 2.0 introduced a declarative style that uses Python type hints (Mapped[]) to define the relationship between Python types and database columns. This approach enables IDEs to provide better autocompletion, type checking, and error detection:
from datetime import datetime
from decimal import Decimal
from sqlalchemy import String, Numeric, DateTime, Text
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase):
pass
class Product(Base):
__tablename__ = "products"
# Primary key: automatically indexed and non-nullable
id: Mapped[int] = mapped_column(primary_key=True)
# String with a maximum length of 100 characters
title: Mapped[str] = mapped_column(String(100))
# Unlimited-length text for descriptions
description: Mapped[str] = mapped_column(Text)
# Precise decimal: up to 10 digits total, 2 after the decimal point
price: Mapped[Decimal] = mapped_column(Numeric(10, 2))
# Timestamp set automatically when the record is created
created_at: Mapped[datetime] = mapped_column(
DateTime, default=datetime.utcnow
)
The Mapped[int] annotation tells both SQLAlchemy and your IDE what Python type to expect, while mapped_column(...) specifies the SQL-level details like the column type, constraints, and defaults.
Handling Nullable and Optional Columns
By default, columns defined with Mapped[str] are non-nullable. To allow NULL values, use Optional from Python's typing module:
from typing import Optional
from datetime import datetime
from sqlalchemy import String, DateTime
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
# Required: cannot be NULL
email: Mapped[str] = mapped_column(String(255))
# Optional: can be NULL in the database
phone: Mapped[Optional[str]] = mapped_column(String(20), default=None)
# Optional timestamp
last_login: Mapped[Optional[datetime]] = mapped_column(DateTime, default=None)
# This works: phone is optional
user = User(email="alice@example.com")
# This also works: phone is provided
user_with_phone = User(email="bob@example.com", phone="555-0123")
Mapped[str] makes the column NOT NULL at the database level. Mapped[Optional[str]] allows NULL. This is a significant improvement over the older style where nullability was controlled by a separate nullable=True parameter that was easy to forget.
Common Mistake: Using Float for Currency
One of the most frequent data type errors is using Float for monetary values. Floating-point arithmetic introduces tiny rounding errors that accumulate over time:
from sqlalchemy import Float, Numeric
from decimal import Decimal
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase):
pass
# Wrong: Float causes rounding errors
class InvoiceBad(Base):
__tablename__ = "invoices_bad"
id: Mapped[int] = mapped_column(primary_key=True)
amount: Mapped[float] = mapped_column(Float)
# Demonstration of the problem
print(0.1 + 0.2) # 0.30000000000000004
Output:
0.30000000000000004
# Correct: Numeric/Decimal preserves exact values
class InvoiceGood(Base):
__tablename__ = "invoices_good"
id: Mapped[int] = mapped_column(primary_key=True)
amount: Mapped[Decimal] = mapped_column(Numeric(10, 2))
# Decimal arithmetic is exact
from decimal import Decimal
print(Decimal("0.1") + Decimal("0.2")) # 0.3
Output:
0.3
Never use Float for currency or financial data. Floating-point rounding errors like 0.30000000000000004 accumulate across thousands of transactions and break financial audits. Always use Numeric with an appropriate precision and scale, such as Numeric(10, 2) for values up to 99,999,999.99.
Server-Side Defaults vs Python Defaults
SQLAlchemy supports two kinds of default values, and the distinction matters in production environments:
default: Runs in your Python application before the data is sent to the database.server_default: Creates a default rule inside the SQL table definition itself, so the database generates the value.
from datetime import datetime
from sqlalchemy import DateTime, String, func
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase):
pass
class AuditLog(Base):
__tablename__ = "audit_logs"
id: Mapped[int] = mapped_column(primary_key=True)
action: Mapped[str] = mapped_column(String(100))
# Python-side default: uses the app server's clock
created_by_python: Mapped[datetime] = mapped_column(
DateTime, default=datetime.utcnow
)
# Server-side default: uses the database server's clock
created_by_db: Mapped[datetime] = mapped_column(
DateTime, server_default=func.now()
)
server_default is preferred for timestamps in collaborative or distributed environments. If multiple application servers insert records, a Python-side default uses each server's local clock, which may differ slightly. A server_default ensures every timestamp comes from the same source: the database.
Adding Constraints and Indexes
Columns often need additional constraints beyond just their data type. You can add uniqueness, indexing, and check constraints directly in mapped_column:
from sqlalchemy import String, Integer
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase):
pass
class Employee(Base):
__tablename__ = "employees"
id: Mapped[int] = mapped_column(primary_key=True)
# Unique constraint: no two employees can share an email
email: Mapped[str] = mapped_column(String(255), unique=True)
# Index for faster lookups by department
department: Mapped[str] = mapped_column(String(100), index=True)
# Non-nullable with a default value
status: Mapped[str] = mapped_column(String(20), default="active")
# Integer with a specific name in the database
employee_number: Mapped[int] = mapped_column(
Integer, unique=True, name="emp_no"
)
Complete Practical Example
Here is a realistic model that combines multiple type choices, defaults, and constraints:
from datetime import datetime
from decimal import Decimal
from typing import Optional
from sqlalchemy import String, Text, Numeric, DateTime, Boolean, func
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase):
pass
class Order(Base):
__tablename__ = "orders"
id: Mapped[int] = mapped_column(primary_key=True)
order_number: Mapped[str] = mapped_column(String(50), unique=True, index=True)
customer_email: Mapped[str] = mapped_column(String(255))
total: Mapped[Decimal] = mapped_column(Numeric(12, 2))
notes: Mapped[Optional[str]] = mapped_column(Text, default=None)
is_paid: Mapped[bool] = mapped_column(Boolean, default=False)
created_at: Mapped[datetime] = mapped_column(
DateTime, server_default=func.now()
)
updated_at: Mapped[Optional[datetime]] = mapped_column(DateTime, default=None)
# Creating an order
order = Order(
order_number="ORD-2024-001",
customer_email="alice@example.com",
total=Decimal("149.99"),
notes="Rush delivery requested"
)
Quick Reference
| Goal | Type Choice | Example |
|---|---|---|
| Auto-incrementing ID | Integer + primary_key=True | mapped_column(primary_key=True) |
| Short text with limit | String(n) | mapped_column(String(255)) |
| Long text, no limit | Text | mapped_column(Text) |
| Money and financials | Numeric(p, s) | mapped_column(Numeric(10, 2)) |
| True/false flag | Boolean | mapped_column(Boolean, default=False) |
| Timestamp (DB clock) | DateTime + server_default | mapped_column(DateTime, server_default=func.now()) |
| Optional/nullable field | Optional[type] | Mapped[Optional[str]] |
| Binary file storage | LargeBinary | mapped_column(LargeBinary) |
Summary
Defining column types correctly in SQLAlchemy prevents data corruption, improves query performance, and catches bugs early through type checking.
- Use the
Mappedandmapped_columnsyntax from SQLAlchemy 2.0 for type-safe models that work well with modern IDEs. - Always use
Numericinstead ofFloatfor currency. - Prefer
server_defaultoverdefaultfor timestamps in multi-server environments. - Use
Optionalto explicitly mark columns that acceptNULL, and adduniqueandindexconstraints directly in your column definitions to enforce data integrity at the database level.