How to Parse a YAML File in Python

YAML (YAML Ain't Markup Language) is a human-readable data serialization format commonly used for configuration files, deployment manifests, CI/CD pipelines, and data exchange. Its clean, indentation-based syntax makes it more readable than JSON or XML for many use cases.

Python's PyYAML library provides a straightforward way to read (parse), write, and manipulate YAML files. In this guide, you will learn how to install PyYAML, parse single and multi-document YAML files, access specific values, and handle YAML data safely.

Installing PyYAML

Install the library using pip:

pip install pyyaml

Then import it in your Python script:

import yaml

Sample YAML Files

We will use the following YAML files throughout this guide.

config.yml: A single-document YAML file:

UserName: TutorialReference
Password: Password@123
Phone: 1234567890
Website: tutorialreference.com
Skills:
- Python
- SQL
- Django
- JavaScript

multi_docs.yml: A multi-document YAML file (documents separated by ---):

---
UserName: TutorialReference
Password: Password@123
Website: tutorialreference.com
...
---
UserName: Google
Password: google@123
Website: google.com
...
---
UserName: Yahoo
Password: yahoo@123
Website: yahoo.com
...

Parsing a Single YAML Document

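safe_load() parses a YAML document and returns the corresponding Python object, which is a dictionary when the top level of the document is a mapping, as it is in config.yml. It is the recommended loading function because it only constructs basic Python objects (strings, numbers, lists, dicts) and prevents arbitrary code execution: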

import yaml

with open('config.yml', 'r') as f:
    data = yaml.safe_load(f)

print(data)
print(type(data))

Output:

{'UserName': 'TutorialReference', 'Password': 'Password@123', 'Phone': 1234567890,
'Website': 'tutorialreference.com', 'Skills': ['Python', 'SQL', 'Django', 'JavaScript']}
<class 'dict'>

Note: YAML mappings become dictionaries, YAML sequences become Python lists, and scalar values are automatically converted to their appropriate Python types (strings, integers, booleans, etc.).
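
The return type follows the top-level node of the document, so the result is not always a dictionary. The following is a minimal sketch of that behavior; the inline YAML strings and variable names are invented for illustration and are not part of the sample files:

```python
import yaml

# Top-level mapping -> dict
as_mapping = yaml.safe_load("name: demo\nport: 8080")

# Top-level sequence -> list
as_sequence = yaml.safe_load("- alpha\n- beta")

# A bare scalar -> a plain Python value
as_scalar = yaml.safe_load("42")

# An empty document -> None
as_empty = yaml.safe_load("")

for value in (as_mapping, as_sequence, as_scalar, as_empty):
    print(type(value).__name__, value)
```

If your code expects a dictionary, checking the type of the result before indexing into it avoids confusing errors on empty or list-shaped files.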

Accessing Specific Values

Once parsed, you work with the data as a standard Python dictionary:

import yaml

with open('config.yml', 'r') as f:
    data = yaml.safe_load(f)

print(f"Username: {data['UserName']}")
print(f"Website: {data['Website']}")
print(f"Skills: {', '.join(data['Skills'])}")

Output:

Username: TutorialReference
Website: tutorialreference.com
Skills: Python, SQL, Django, JavaScript

Use .get() for safe access when a key might not exist:

email = data.get('Email', 'Not specified')
print(f"Email: {email}")

Output:

Email: Not specified
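
For nested data, chaining .get() calls gets verbose. One possible approach is a small helper that walks the keys and falls back to a default. Note that get_nested is a hypothetical helper written for this sketch, not a PyYAML function, and the inline YAML is illustrative:

```python
import yaml


def get_nested(mapping, *keys, default=None):
    """Walk nested dictionaries; return `default` if any key is missing."""
    current = mapping
    for key in keys:
        # Stop as soon as we hit a non-dict value or a missing key.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


config = yaml.safe_load("""
database:
  host: localhost
  credentials:
    user: admin
""")

user = get_nested(config, 'database', 'credentials', 'user')
port = get_nested(config, 'database', 'port', default=5432)
cache_host = get_nested(config, 'cache', 'host')

print(user, port, cache_host)
```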

Parsing Multi-Document YAML Files

Some YAML files contain multiple documents separated by ---. Use yaml.safe_load_all() to parse all documents:

import yaml

with open('multi_docs.yml', 'r') as f:
    documents = list(yaml.safe_load_all(f))

for i, doc in enumerate(documents):
    print(f"Document {i + 1}: {doc['UserName']} : {doc['Website']}")

Output:

Document 1: TutorialReference : tutorialreference.com
Document 2: Google : google.com
Document 3: Yahoo : yahoo.com

safe_load_all() returns a generator that reads from the file lazily, so consume it while the file is still open, and wrap it in list() if you need to access the documents more than once. Alternatively, iterate directly:

with open('multi_docs.yml', 'r') as f:
    for doc in yaml.safe_load_all(f):
        print(doc['UserName'])
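
The generator behavior is easy to see with a short in-memory example. This sketch uses an inline string (a shortened, illustrative version of the multi-document file) so it runs without any files on disk:

```python
import yaml

multi_doc = """\
---
UserName: TutorialReference
---
UserName: Google
"""

docs = yaml.safe_load_all(multi_doc)

# The first pass consumes the generator.
first_pass = [doc['UserName'] for doc in docs]

# The second pass finds nothing left to iterate over.
second_pass = [doc['UserName'] for doc in docs]

print(first_pass)
print(second_pass)
```

This is why converting the result to a list is the simpler choice whenever the documents are needed more than once.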

Parsing a YAML String

You can also parse YAML directly from a string:

import yaml

yaml_string = """
database:
  host: localhost
  port: 5432
  name: mydb
  credentials:
    user: admin
    password: secret123
"""

data = yaml.safe_load(yaml_string)

print(f"Host: {data['database']['host']}")
print(f"Port: {data['database']['port']}")
print(f"User: {data['database']['credentials']['user']}")

Output:

Host: localhost
Port: 5432
User: admin

Understanding the Different Load Functions

PyYAML provides several loading functions. Here is when to use each:

| Function | Safety | Use Case |
| --- | --- | --- |
| yaml.safe_load() | ✅ Safe: only basic Python types | Recommended for all standard use cases |
| yaml.safe_load_all() | ✅ Safe | Multi-document files |
| yaml.full_load() | ⚠️ Moderate: loads most Python types | When you need Python-specific types |
| yaml.load(f, Loader=yaml.SafeLoader) | ✅ Safe (with SafeLoader) | Explicit loader specification |
| yaml.load(f, Loader=yaml.FullLoader) | ⚠️ Moderate | Explicit full loading |
| yaml.unsafe_load() | ❌ Unsafe: can execute arbitrary code | Never use with untrusted input |

Never use yaml.load() without a Loader

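Calling yaml.load() without a Loader argument is unsafe in old releases and an error in current ones. Before PyYAML 5.1 the default loader could execute arbitrary code; versions 5.1 through 5.4 issued a warning and defaulted to FullLoader; since PyYAML 6.0 the Loader argument is required, and omitting it raises a TypeError:

# ❌ Unsafe in old versions, TypeError in PyYAML 6.0+
data = yaml.load(f)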

# ✅ Safe: always specify a Loader
data = yaml.load(f, Loader=yaml.SafeLoader)

# ✅ Even better: use safe_load() directly
data = yaml.safe_load(f)
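
To make the difference between the loaders concrete, here is a small sketch. It uses a deliberately harmless Python-specific tag (one that only builds a tuple); the document content is invented for illustration:

```python
import yaml

# A document carrying a Python-specific tag (harmless: it only builds a tuple).
tagged = "point: !!python/tuple [1, 2]"

try:
    yaml.safe_load(tagged)
    rejected = False
except yaml.YAMLError as exc:
    # SafeLoader has no constructor for Python-specific tags.
    rejected = True
    print(f"safe_load refused the tag: {type(exc).__name__}")

# FullLoader accepts this particular tag and builds the tuple.
full = yaml.load(tagged, Loader=yaml.FullLoader)
print(full)
```

With safe_load(), anything beyond plain YAML types fails loudly instead of being constructed, which is the behavior you want for input you do not control.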

Writing Python Data to a YAML File

To convert a Python dictionary to YAML and write it to a file, use yaml.dump():

import yaml

data = {
    'database': {
        'host': 'localhost',
        'port': 5432,
        'name': 'production_db'
    },
    'features': ['auth', 'logging', 'caching'],
    'debug': False
}

with open('output.yml', 'w') as f:
    yaml.dump(data, f, default_flow_style=False, sort_keys=False)

print("YAML file written successfully.")

# Verify by reading it back
with open('output.yml', 'r') as f:
    print(f.read())

Output:

YAML file written successfully.
database:
  host: localhost
  port: 5432
  name: production_db
features:
- auth
- logging
- caching
debug: false
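
| Parameter | Description |
| --- | --- |
| default_flow_style=False | Use block style (indented) instead of inline {key: value} |
| sort_keys=False | Preserve the original key order instead of sorting keys alphabetically (the default) |
| allow_unicode=True | Write non-ASCII characters as-is instead of escaping them |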
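
The effect of allow_unicode is easiest to see side by side. The sketch below uses yaml.safe_dump(), the counterpart of safe_load() that only emits basic types, and returns the YAML as a string because no file is passed; the sample data is made up for illustration:

```python
import yaml

data = {'city': 'São Paulo', 'tags': ['café', 'naïve']}

# With no stream argument, safe_dump returns the YAML text as a string.
escaped = yaml.safe_dump(data, sort_keys=False)
readable = yaml.safe_dump(data, sort_keys=False, allow_unicode=True)

print(escaped)
print(readable)

# Both forms load back to the same Python data.
roundtrip_escaped = yaml.safe_load(escaped)
roundtrip_readable = yaml.safe_load(readable)
```

Both outputs are valid YAML; allow_unicode=True mainly matters when people need to read or edit the resulting file.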

Handling Common YAML Data Types

PyYAML automatically converts unquoted YAML values to appropriate Python types:

# YAML file
string_value: "hello"
integer_value: 42
float_value: 3.14
boolean_true: true
boolean_false: false
null_value: null
date_value: 2024-01-15
list_value:
  - item1
  - item2
nested:
  key1: value1
  key2: value2

Loading the scalar values from this file shows the resulting Python types:

import yaml

yaml_string = """
string_value: "hello"
integer_value: 42
float_value: 3.14
boolean_true: true
boolean_false: false
null_value: null
date_value: 2024-01-15
"""

data = yaml.safe_load(yaml_string)

for key, value in data.items():
    print(f"{key}: {value!r} ({type(value).__name__})")

Output:

string_value: 'hello' (str)
integer_value: 42 (int)
float_value: 3.14 (float)
boolean_true: True (bool)
boolean_false: False (bool)
null_value: None (NoneType)
date_value: datetime.date(2024, 1, 15) (date)
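
Automatic conversion is not always what you want. A date-like value becomes a date object, and a version string such as 1.10 becomes the float 1.1. As a minimal sketch (the keys release and version are invented for this example), quoting keeps such values as strings:

```python
import yaml

unquoted = yaml.safe_load("""
release: 2024-01-15
version: 1.10
""")

quoted = yaml.safe_load("""
release: "2024-01-15"
version: "1.10"
""")

for key in ('release', 'version'):
    print(f"{key}: {unquoted[key]!r} vs {quoted[key]!r}")
```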

YAML boolean gotchas

PyYAML follows YAML 1.1, in which several unquoted words are booleans: yes, no, on, off, true, and false (in lowercase, capitalized, or uppercase form) are all converted to True or False. This can cause unexpected behavior:

# ❌ These are parsed as booleans, not strings
country: no # Parsed as False
answer: yes # Parsed as True
switch: on # Parsed as True

# ✅ Quote strings that could be misinterpreted
country: "no"
answer: "yes"
switch: "on"
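
The same comparison can be run from Python. This short sketch loads the unquoted and quoted forms from inline strings and prints what PyYAML produces for each:

```python
import yaml

unquoted = yaml.safe_load("country: no\nanswer: yes\nswitch: on")
quoted = yaml.safe_load('country: "no"\nanswer: "yes"\nswitch: "on"')

print(unquoted)  # values are booleans
print(quoted)    # values stay strings
```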

Complete Example: Parsing a Configuration File

import yaml
import sys

def load_config(filepath):
    """Load and validate a YAML configuration file."""
    try:
        with open(filepath, 'r') as f:
            config = yaml.safe_load(f)

        if config is None:
            print(f"Warning: {filepath} is empty.")
            return {}

        return config

    except FileNotFoundError:
        print(f"Error: File '{filepath}' not found.")
        sys.exit(1)
    except yaml.YAMLError as e:
        print(f"Error parsing YAML file: {e}")
        sys.exit(1)


# Usage
config = load_config('config.yml')

print(f"User: {config.get('UserName', 'Unknown')}")
print(f"Site: {config.get('Website', 'N/A')}")

skills = config.get('Skills', [])
print(f"Skills ({len(skills)}): {', '.join(skills)}")

Output:

User: TutorialReference
Site: tutorialreference.com
Skills (4): Python, SQL, Django, JavaScript
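
The loader above checks that the file exists and parses, but not that it contains what the program needs. One way to extend it is a separate validation step. The sketch below is only an illustration: validate_config and REQUIRED_KEYS are hypothetical names, and which keys count as required is an assumption for this example:

```python
import yaml

# Hypothetical list of keys this example treats as mandatory.
REQUIRED_KEYS = ('UserName', 'Website')


def validate_config(config, required=REQUIRED_KEYS):
    """Raise ValueError unless config is a dict containing every required key."""
    if not isinstance(config, dict):
        raise ValueError("top level of the config must be a mapping")
    missing = [key for key in required if key not in config]
    if missing:
        raise ValueError(f"missing required keys: {', '.join(missing)}")
    return config


good = validate_config(yaml.safe_load("UserName: demo\nWebsite: example.com"))
print(good)

try:
    validate_config(yaml.safe_load("UserName: demo"))
except ValueError as exc:
    error_message = str(exc)
    print(f"Invalid config: {error_message}")
```

Raising an exception rather than calling sys.exit() inside the helper leaves the decision about how to react to the caller.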

Conclusion

Parsing YAML files in Python is straightforward with the PyYAML library. Always use yaml.safe_load() for single documents and yaml.safe_load_all() for multi-document files: these functions prevent arbitrary code execution and safely convert YAML data into Python dictionaries, lists, and primitive types. Use yaml.dump() to write Python data back to YAML format. Remember to handle FileNotFoundError and yaml.YAMLError exceptions for robust file processing, and quote YAML string values that could be misinterpreted as booleans or other types.