Python Polars: How to Map a Python Dictionary to a Polars Series
Polars is a high-performance DataFrame library built in Rust that has become a popular alternative to Pandas for data manipulation in Python. A common task when working with data is mapping values in a column based on a predefined dictionary - for example, converting role codes to department names, mapping country abbreviations to full names, or translating category IDs to labels.
In this guide, you will learn multiple ways to map a Python dictionary to a Polars Series, understand the performance differences between each approach, and discover best practices for handling missing keys.
Setting Up
First, install Polars if you haven't already:
pip install polars
Then create a sample DataFrame to work with:
import polars as pl
df = pl.DataFrame({
"name": ["Alice", "Bob", "Charlie", "David"],
"role": ["Manager", "Developer", "Designer", "Developer"]
})
print(df)
Output:
shape: (4, 2)
┌─────────┬───────────┐
│ name ┆ role │
│ --- ┆ --- │
│ str ┆ str │
╞═════════╪═══════════╡
│ Alice ┆ Manager │
│ Bob ┆ Developer │
│ Charlie ┆ Designer │
│ David ┆ Developer │
└─────────┴───────────┘
And define a mapping dictionary:
role_to_department = {
"Manager": "Administration",
"Developer": "Engineering",
"Designer": "Creative"
}
Method 1: Using replace_strict() - The Recommended Approach
The most idiomatic and performant way to map dictionary values in Polars is the replace_strict() expression, available since Polars 1.0 (use replace() instead when unmapped values should pass through unchanged). Both run entirely within Polars' native Rust engine:
import polars as pl
df = pl.DataFrame({
"name": ["Alice", "Bob", "Charlie", "David"],
"role": ["Manager", "Developer", "Designer", "Developer"]
})
role_to_department = {
"Manager": "Administration",
"Developer": "Engineering",
"Designer": "Creative"
}
# Map using replace_strict
df = df.with_columns(
pl.col("role")
.replace_strict(role_to_department)
.alias("department")
)
print(df)
Output:
shape: (4, 3)
┌─────────┬───────────┬────────────────┐
│ name ┆ role ┆ department │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞═════════╪═══════════╪════════════════╡
│ Alice ┆ Manager ┆ Administration │
│ Bob ┆ Developer ┆ Engineering │
│ Charlie ┆ Designer ┆ Creative │
│ David ┆ Developer ┆ Engineering │
└─────────┴───────────┴────────────────┘
Why replace_strict() is preferred:
- Performance: It runs natively in Rust, avoiding Python's interpreter overhead.
- Lazy compatibility: It works seamlessly with Polars' lazy evaluation engine for optimized query execution.
- Explicit error handling: replace_strict() raises an error if a value has no mapping, preventing silent bugs.
Handling Missing Keys With a Default Value
If your column contains values not present in the dictionary, replace_strict() will raise an error by default. Use the default parameter to handle unmapped values:
import polars as pl
df = pl.DataFrame({
"role": ["Manager", "Developer", "Intern"] # "Intern" not in dictionary
})
role_to_department = {
"Manager": "Administration",
"Developer": "Engineering",
}
# Provide a default for unmapped values
df = df.with_columns(
pl.col("role")
.replace_strict(role_to_department, default="Unknown")
.alias("department")
)
print(df)
Output:
shape: (3, 2)
┌───────────┬────────────────┐
│ role ┆ department │
│ --- ┆ --- │
│ str ┆ str │
╞═══════════╪════════════════╡
│ Manager ┆ Administration │
│ Developer ┆ Engineering │
│ Intern ┆ Unknown │
└───────────┴────────────────┘
Method 2: Using replace() With Flexible Matching
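The replace() method (without strict) leaves unmapped values unchanged instead of raising an error. Because unmapped values pass through, the result keeps the column's original data type, so replace() suits same-type mappings such as string to string: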
import polars as pl
df = pl.DataFrame({
"role": ["Manager", "Developer", "Intern"]
})
role_to_department = {
"Manager": "Administration",
"Developer": "Engineering",
}
df = df.with_columns(
pl.col("role")
.replace(role_to_department)
.alias("department")
)
print(df)
Output:
shape: (3, 2)
┌───────────┬────────────────┐
│ role ┆ department │
│ --- ┆ --- │
│ str ┆ str │
╞═══════════╪════════════════╡
│ Manager ┆ Administration │
│ Developer ┆ Engineering │
│ Intern ┆ Intern │
└───────────┴────────────────┘
Notice that "Intern" remains unchanged because it has no mapping. This is useful when you only want to remap a subset of values.
Method 3: Using map_elements() With a Lambda
For complex mapping logic that goes beyond simple dictionary lookups, use map_elements() (formerly apply()):
import polars as pl
df = pl.DataFrame({
"name": ["Alice", "Bob", "Charlie", "David"],
"role": ["Manager", "Developer", "Designer", "Developer"]
})
role_to_department = {
"Manager": "Administration",
"Developer": "Engineering",
"Designer": "Creative"
}
# Map using map_elements with a lambda
df = df.with_columns(
pl.col("role")
.map_elements(lambda role: role_to_department.get(role, "Unknown"), return_dtype=pl.Utf8)
.alias("department")
)
print(df)
Output:
shape: (4, 3)
┌─────────┬───────────┬────────────────┐
│ name ┆ role ┆ department │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞═════════╪═══════════╪════════════════╡
│ Alice ┆ Manager ┆ Administration │
│ Bob ┆ Developer ┆ Engineering │
│ Charlie ┆ Designer ┆ Creative │
│ David ┆ Developer ┆ Engineering │
└─────────┴───────────┴────────────────┘
Performance caveat: map_elements() executes a Python function for every element in the Series, which bypasses Polars' optimized Rust engine. This makes it significantly slower than replace() or replace_strict(), especially on large datasets:
# `mapping` stands for any Python dict, such as role_to_department
# ❌ Slow: Python lambda called per element
pl.col("role").map_elements(lambda x: mapping.get(x, "Unknown"), return_dtype=pl.Utf8)
# ✅ Fast: native Polars operation
pl.col("role").replace_strict(mapping, default="Unknown")
Use map_elements() only when your logic is too complex for replace().
Method 4: Using a Join for Large Mappings
When your mapping dictionary is very large, converting it to a DataFrame and performing a join can be more efficient and memory-friendly:
import polars as pl
df = pl.DataFrame({
"name": ["Alice", "Bob", "Charlie", "David"],
"role": ["Manager", "Developer", "Designer", "Developer"]
})
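role_to_department = {
    "Manager": "Administration",
    "Developer": "Engineering",
    "Designer": "Creative"
}
# Convert the dictionary to a two-column mapping DataFrame
mapping_df = pl.DataFrame({
    "role": list(role_to_department.keys()),
    "department": list(role_to_department.values())
})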
# Join the mapping DataFrame
result = df.join(mapping_df, on="role", how="left")
print(result)
Output:
shape: (4, 3)
┌─────────┬───────────┬────────────────┐
│ name ┆ role ┆ department │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞═════════╪═══════════╪════════════════╡
│ Alice ┆ Manager ┆ Administration │
│ Bob ┆ Developer ┆ Engineering │
│ Charlie ┆ Designer ┆ Creative │
│ David ┆ Developer ┆ Engineering │
└─────────┴───────────┴────────────────┘
Using how="left" ensures that rows with unmapped values still appear in the result (with null in the department column).
Extracting the Mapped Column as a Standalone Series
If you need just the mapped Series (not added to the DataFrame):
import polars as pl
roles = pl.Series("role", ["Manager", "Developer", "Designer", "Developer"])
role_to_department = {
"Manager": "Administration",
"Developer": "Engineering",
"Designer": "Creative"
}
department_series = roles.replace_strict(role_to_department).alias("department")
print(department_series)
Output:
shape: (4,)
Series: 'department' [str]
[
"Administration"
"Engineering"
"Creative"
"Engineering"
]
Comparison of Methods
| Method | Performance | Missing Key Handling | Best For |
|---|---|---|---|
| replace_strict() | Fastest (native Rust) | Error by default, default param available | Most mapping tasks |
| replace() | Fastest (native Rust) | Leaves unmapped values unchanged | Partial remapping |
| map_elements() + lambda | Slow (Python per-element) | Use dict.get(key, default) | Complex custom logic |
| Join with mapping DataFrame | Fast for large mappings | null for unmatched (left join) | Very large lookup tables |
Conclusion
Mapping a Python dictionary to a Polars Series is best accomplished using Polars' native replace_strict() or replace() expressions, which run in Rust and are significantly faster than Python-based alternatives like map_elements().
- Use replace_strict() when you want strict validation or an explicit default for missing keys, replace() when unmapped values should remain unchanged, and a join when working with very large lookup tables.
- Reserve map_elements() with lambda functions for complex transformation logic that cannot be expressed with the built-in methods.