Python PyYAML: How to Use Custom Tags in PyYAML

YAML's flexibility extends beyond basic data types through custom tags, which enable direct mapping between YAML content and Python objects. Instead of parsing generic dictionaries and manually instantiating classes, custom tags let PyYAML automatically construct specialized objects like configuration classes, data models, or domain-specific types. This guide shows how to register constructors and representers so that objects round-trip between YAML and Python.

Understanding Custom Tags

Custom tags tell PyYAML how to interpret specific YAML nodes. A tag like !server signals that the following data should become a ServerConfig object rather than a plain dictionary.
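
To see the tag PyYAML attaches to a node before any constructor runs, yaml.compose() can parse a document into its node graph. The following is a minimal sketch; the two sample documents are illustrative and not part of the examples that follow.

```python
import yaml

# compose() stops after parsing and tag resolution; no Python objects are built.
tagged = yaml.compose("!server\nhost: api.example.com\nport: 443\n",
                      Loader=yaml.SafeLoader)
plain = yaml.compose("host: api.example.com\nport: 443\n",
                     Loader=yaml.SafeLoader)

print(type(tagged).__name__, tagged.tag)  # mapping node carrying the custom tag
print(type(plain).__name__, plain.tag)    # mapping node carrying the default map tag
```

The constructor registered for a tag is looked up by exactly this tag string, which is why the name passed to add_constructor() must match the tag written in the YAML.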

Implementing a Complete Custom Tag

Register both a representer (Python → YAML) and constructor (YAML → Python) for full round-trip support:

import yaml

class ServerConfig:
    """Configuration class for server settings."""

    def __init__(self, host, port, ssl=False):
        self.host = host
        self.port = port
        self.ssl = ssl

    def __repr__(self):
        return f"ServerConfig(host={self.host}, port={self.port}, ssl={self.ssl})"


def server_representer(dumper, data):
    """Convert ServerConfig object to YAML representation."""
    return dumper.represent_mapping('!server', {
        'host': data.host,
        'port': data.port,
        'ssl': data.ssl
    })


def server_constructor(loader, node):
    """Construct ServerConfig object from YAML node."""
    values = loader.construct_mapping(node)
    return ServerConfig(**values)


# Register the custom tag handlers
yaml.add_representer(ServerConfig, server_representer)
yaml.add_constructor('!server', server_constructor)

# Loading from YAML
yaml_content = """
!server
host: api.example.com
port: 443
ssl: true
"""

config = yaml.load(yaml_content, Loader=yaml.FullLoader)
print(config)  # ServerConfig(host=api.example.com, port=443, ssl=True)
print(f"Connecting to {config.host}:{config.port}")

# Dumping back to YAML
output = yaml.dump(config)
print(output)

Output:

ServerConfig(host=api.example.com, port=443, ssl=True)
Connecting to api.example.com:443
!server
host: api.example.com
port: 443
ssl: true

Why Use Custom Tags?

Tags make the intended type explicit in the document and keep construction logic in one place. Your application receives fully formed objects, with whatever methods and validation the class defines, rather than raw dictionaries that need manual processing.
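
As an illustration of the validation point, the sketch below puts a range check in the class itself so that a bad value fails at load time. The Endpoint class, the !endpoint tag, and the EndpointLoader subclass are hypothetical names chosen for this example only.

```python
import yaml

class Endpoint:
    """Hypothetical domain object that validates itself on construction."""

    def __init__(self, host, port):
        if not 0 < port < 65536:
            raise ValueError(f"port out of range: {port}")
        self.host = host
        self.port = port

    def url(self):
        return f"https://{self.host}:{self.port}"


class EndpointLoader(yaml.SafeLoader):
    """Loader subclass so the tag is not registered globally."""


def endpoint_constructor(loader, node):
    return Endpoint(**loader.construct_mapping(node))


EndpointLoader.add_constructor('!endpoint', endpoint_constructor)

ok = yaml.load("!endpoint {host: api.example.com, port: 443}",
               Loader=EndpointLoader)
print(ok.url())

rejected = None
try:
    yaml.load("!endpoint {host: api.example.com, port: 70000}",
              Loader=EndpointLoader)
except ValueError as exc:
    # The error raised in __init__ surfaces while the document is loading.
    rejected = str(exc)
print(f"rejected: {rejected}")
```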

Multiple Custom Tags

Register multiple tags for different domain objects:

import yaml

class DatabaseConfig:
    def __init__(self, engine, name, host="localhost"):
        self.engine = engine
        self.name = name
        self.host = host

class CacheConfig:
    def __init__(self, backend, ttl=3600):
        self.backend = backend
        self.ttl = ttl


# Representer and constructor for DatabaseConfig
def db_representer(dumper, data):
    return dumper.represent_mapping('!database', {
        'engine': data.engine, 'name': data.name, 'host': data.host
    })

def db_constructor(loader, node):
    return DatabaseConfig(**loader.construct_mapping(node))


# Representer and constructor for CacheConfig
def cache_representer(dumper, data):
    return dumper.represent_mapping('!cache', {
        'backend': data.backend, 'ttl': data.ttl
    })

def cache_constructor(loader, node):
    return CacheConfig(**loader.construct_mapping(node))


# Register all tags
yaml.add_representer(DatabaseConfig, db_representer)
yaml.add_constructor('!database', db_constructor)
yaml.add_representer(CacheConfig, cache_representer)
yaml.add_constructor('!cache', cache_constructor)

# Complex configuration file
config_yaml = """
database: !database
  engine: postgresql
  name: app_production
  host: db.example.com

cache: !cache
  backend: redis
  ttl: 7200
"""

config = yaml.load(config_yaml, Loader=yaml.FullLoader)
print(f"Database: {config['database'].engine}")
print(f"Cache TTL: {config['cache'].ttl}")

Output:

Database: postgresql
Cache TTL: 7200
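
When many classes follow the same mapping pattern, the representer/constructor pairs become repetitive. One possible way to cut the repetition is a small registration helper. The sketch below assumes a hypothetical register_tag() function, a QueueConfig class, and dedicated ConfigLoader/ConfigDumper subclasses; none of these are part of PyYAML.

```python
import yaml

class ConfigLoader(yaml.SafeLoader):
    """Loader subclass that will hold the application tags."""

class ConfigDumper(yaml.SafeDumper):
    """Dumper subclass that will hold the matching representers."""


def register_tag(tag, cls, fields):
    """Hypothetical helper: wire up both directions for a mapping-style class."""
    def representer(dumper, obj):
        return dumper.represent_mapping(tag, {f: getattr(obj, f) for f in fields})

    def constructor(loader, node):
        return cls(**loader.construct_mapping(node))

    ConfigDumper.add_representer(cls, representer)
    ConfigLoader.add_constructor(tag, constructor)


class QueueConfig:
    def __init__(self, name, workers=1):
        self.name = name
        self.workers = workers


register_tag('!queue', QueueConfig, ['name', 'workers'])

# Round trip: Python object -> YAML text -> Python object
text = yaml.dump({'jobs': QueueConfig('emails', workers=4)}, Dumper=ConfigDumper)
restored = yaml.load(text, Loader=ConfigLoader)
print(text)
print(restored['jobs'].name, restored['jobs'].workers)
```

The helper only suits classes whose constructor arguments match their attribute names; anything with derived or renamed fields still needs hand-written functions.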

Implicit Resolvers for Pattern-Based Detection

Automatically detect types based on patterns without explicit tags:

import yaml
import re
from datetime import date

# Automatically recognize ISO dates
yaml.add_implicit_resolver(
    '!isodate',
    re.compile(r'^\d{4}-\d{2}-\d{2}$'),
    first=list('0123456789')
)

def isodate_constructor(loader, node):
    value = loader.construct_scalar(node)
    year, month, day = map(int, value.split('-'))
    return date(year, month, day)

yaml.add_constructor('!isodate', isodate_constructor)

# Dates are automatically parsed without explicit tags
data = yaml.load("""
event: Conference
start_date: 2024-06-15
end_date: 2024-06-17
""", Loader=yaml.FullLoader)

print(f"Type: {type(data['start_date'])}")  # <class 'datetime.date'>
print(f"Event starts: {data['start_date']}")

Output:

Type: <class 'datetime.date'>
Event starts: 2024-06-15
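
Pattern Matching

The first argument of add_implicit_resolver lists the characters a matching scalar may start with. PyYAML only tests the regular expression against plain scalars that begin with one of them, which keeps parsing fast. Resolvers are tried in registration order, and PyYAML's built-in timestamp resolver already matches plain ISO dates, so the dates in the example above would load as datetime.date objects even without the custom resolver. Custom resolvers matter most for patterns PyYAML does not already recognize.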
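
For a pattern that PyYAML does not resolve on its own, the same technique might look like the sketch below. The !duration tag, the 30s/5m/2h notation, and the DurationLoader subclass are assumptions made for this example.

```python
import re
from datetime import timedelta

import yaml

class DurationLoader(yaml.SafeLoader):
    """Loader subclass so the resolver does not affect PyYAML's shared loaders."""

# Hypothetical notation: an integer followed by s, m, or h
DURATION_RE = re.compile(r'^(\d+)([smh])$')
UNITS = {'s': 'seconds', 'm': 'minutes', 'h': 'hours'}

def duration_constructor(loader, node):
    amount, unit = DURATION_RE.match(loader.construct_scalar(node)).groups()
    return timedelta(**{UNITS[unit]: int(amount)})

# Only plain scalars starting with a digit are tested against the regex.
DurationLoader.add_implicit_resolver('!duration', DURATION_RE,
                                     first=list('0123456789'))
DurationLoader.add_constructor('!duration', duration_constructor)

data = yaml.load("timeout: 30s\nretry_after: 5m\nattempts: 3\n",
                 Loader=DurationLoader)
print(data['timeout'], data['retry_after'], data['attempts'])
```

The bare integer 3 still loads as an int because the built-in integer resolver matches it and the duration pattern requires a unit suffix.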

Scalar Value Tags

For simple transformations, use scalar constructors:

import yaml
import os

# Environment variable substitution
def env_constructor(loader, node):
    """Replace !env tag with environment variable value."""
    var_name = loader.construct_scalar(node)
    return os.environ.get(var_name, f"<{var_name} not set>")

yaml.add_constructor('!env', env_constructor)

# Configuration with environment references
config_yaml = """
database:
  password: !env DATABASE_PASSWORD
  host: !env DATABASE_HOST
"""

os.environ['DATABASE_PASSWORD'] = 'secret123'
os.environ['DATABASE_HOST'] = 'localhost'

config = yaml.load(config_yaml, Loader=yaml.FullLoader)
print(config['database']['password'])  # secret123

Output:

secret123
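
Substituting a placeholder string for a missing variable can hide configuration mistakes. A stricter variant, sketched below, raises a YAML error that points at the offending node instead. The EnvLoader subclass and the variable name EXAMPLE_UNSET_VARIABLE are hypothetical.

```python
import os

import yaml

class EnvLoader(yaml.SafeLoader):
    """Loader subclass carrying the strict !env tag."""

def strict_env_constructor(loader, node):
    var_name = loader.construct_scalar(node)
    try:
        return os.environ[var_name]
    except KeyError:
        # Passing node.start_mark adds the line and column to the message.
        raise yaml.constructor.ConstructorError(
            None, None,
            f"environment variable {var_name!r} is not set",
            node.start_mark,
        )

EnvLoader.add_constructor('!env', strict_env_constructor)

os.environ['DATABASE_HOST'] = 'localhost'
os.environ.pop('EXAMPLE_UNSET_VARIABLE', None)  # make sure it is missing

config = yaml.load("database:\n  host: !env DATABASE_HOST\n", Loader=EnvLoader)
print(config['database']['host'])

error = None
try:
    yaml.load("database:\n  password: !env EXAMPLE_UNSET_VARIABLE\n",
              Loader=EnvLoader)
except yaml.YAMLError as exc:
    error = exc
print(error)
```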

Safe Custom Loaders

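For production use, subclass yaml.SafeLoader and register your tags on the subclass only. The tags stay isolated from PyYAML's shared loaders, and everything SafeLoader already rejects stays rejected: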

import yaml

# ServerConfig and yaml_content are reused from the first example above.

class SafeConfigLoader(yaml.SafeLoader):
    """Custom loader with application-specific tags."""
    pass

def server_constructor(loader, node):
    values = loader.construct_mapping(node)
    return ServerConfig(**values)

# Add constructor to custom loader only
SafeConfigLoader.add_constructor('!server', server_constructor)

# Use the safe custom loader
config = yaml.load(yaml_content, Loader=SafeConfigLoader)
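
The isolation a loader subclass provides can be checked directly. In this sketch the !upper tag and the AppLoader class are hypothetical; the point is only that the tag works on the subclass while plain SafeLoader still refuses it.

```python
import yaml

class AppLoader(yaml.SafeLoader):
    """Loader subclass that owns the application tag."""

def upper_constructor(loader, node):
    return loader.construct_scalar(node).upper()

# Registered on the subclass only; yaml.SafeLoader itself is left untouched.
AppLoader.add_constructor('!upper', upper_constructor)

doc = "name: !upper staging\n"

loaded = yaml.load(doc, Loader=AppLoader)
print(loaded)

safe_rejected = False
try:
    yaml.load(doc, Loader=yaml.SafeLoader)
except yaml.constructor.ConstructorError:
    # The base SafeLoader has no constructor for !upper.
    safe_rejected = True
print(f"SafeLoader rejected the tag: {safe_rejected}")
```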

Method Reference

| Method | Purpose | Direction |
|---|---|---|
| add_representer() | Convert Python object → YAML | Dump |
| add_constructor() | Convert YAML → Python object | Load |
| add_implicit_resolver() | Auto-detect tags via regex | Load |
| add_multi_representer() | Handle class hierarchies | Dump |
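
add_multi_representer() is the only method in the table without an example elsewhere in this guide. The sketch below shows one way it could be used: a single representer registered for a base class also handles its subclasses. The Shape hierarchy, the ShapeDumper subclass, and the tag naming scheme are assumptions made for illustration.

```python
import yaml

class Shape:
    def __init__(self, name):
        self.name = name

class Circle(Shape):
    def __init__(self, name, radius):
        super().__init__(name)
        self.radius = radius

class ShapeDumper(yaml.SafeDumper):
    """Dumper subclass so the representer is not registered globally."""

def shape_representer(dumper, obj):
    # Derive the tag from the concrete class, e.g. Circle -> !circle
    tag = f"!{type(obj).__name__.lower()}"
    return dumper.represent_mapping(tag, vars(obj))

# One registration on the base class covers every subclass.
ShapeDumper.add_multi_representer(Shape, shape_representer)

text = yaml.dump(Circle('wheel', radius=2), Dumper=ShapeDumper)
print(text)
```

With add_representer() the lookup is by exact type, so Circle would need its own registration; the multi variant walks the class's method resolution order instead.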

Security Warning

Never call yaml.load() without an explicit Loader. Before PyYAML 5.1 the default loader could execute arbitrary code embedded in a document; 5.1 deprecated the implicit default, and 6.0 makes the Loader argument mandatory. Always use:

  • yaml.SafeLoader for untrusted input
  • yaml.FullLoader for trusted application configs
  • Custom safe loaders for production systems
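
A quick way to confirm the safe loader's behavior is to feed it a Python-specific tag. The payload below is a harmless illustrative example; yaml.safe_load() should refuse to construct it.

```python
import yaml

# A Python-specific tag that an unsafe loader would turn into a function call.
payload = "!!python/object/apply:os.getcwd []"

rejected = False
try:
    yaml.safe_load(payload)
except yaml.constructor.ConstructorError as exc:
    # SafeLoader has no constructor for python/* tags, so loading fails.
    rejected = True
    print(f"rejected: {exc.problem}")
```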

By implementing custom tags, you create expressive, type-safe configuration systems that bridge YAML's human-readable format with Python's object-oriented design patterns.