Python Polars: How to Map a Python Dictionary to a Polars Series

Polars is a high-performance DataFrame library built in Rust that has become a popular alternative to Pandas for data manipulation in Python. A common task when working with data is mapping the values in a column through a predefined dictionary: for example, converting role codes to department names, mapping country abbreviations to full names, or translating category IDs to labels.

In this guide, you will learn multiple ways to map a Python dictionary to a Polars Series, understand the performance differences between each approach, and discover best practices for handling missing keys.

Setting Up

First, install Polars if you haven't already:

pip install polars

Then create a sample DataFrame to work with:

import polars as pl

df = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "David"],
    "role": ["Manager", "Developer", "Designer", "Developer"]
})

print(df)

Output:

shape: (4, 2)
┌─────────┬───────────┐
│ name    ┆ role      │
│ ---     ┆ ---       │
│ str     ┆ str       │
╞═════════╪═══════════╡
│ Alice   ┆ Manager   │
│ Bob     ┆ Developer │
│ Charlie ┆ Designer  │
│ David   ┆ Developer │
└─────────┴───────────┘

And define a mapping dictionary:

role_to_department = {
    "Manager": "Administration",
    "Developer": "Engineering",
    "Designer": "Creative"
}

Method 1: Using replace_strict() (Recommended)

The most idiomatic and performant way to map dictionary values in Polars is the replace_strict() expression, available since Polars 1.0 (or replace() when unmapped values should pass through unchanged). Both run entirely within Polars' native Rust engine:

import polars as pl

df = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "David"],
    "role": ["Manager", "Developer", "Designer", "Developer"]
})

role_to_department = {
    "Manager": "Administration",
    "Developer": "Engineering",
    "Designer": "Creative"
}

# Map using replace_strict
df = df.with_columns(
    pl.col("role")
    .replace_strict(role_to_department)
    .alias("department")
)

print(df)

Output:

shape: (4, 3)
┌─────────┬───────────┬────────────────┐
│ name    ┆ role      ┆ department     │
│ ---     ┆ ---       ┆ ---            │
│ str     ┆ str       ┆ str            │
╞═════════╪═══════════╪════════════════╡
│ Alice   ┆ Manager   ┆ Administration │
│ Bob     ┆ Developer ┆ Engineering    │
│ Charlie ┆ Designer  ┆ Creative       │
│ David   ┆ Developer ┆ Engineering    │
└─────────┴───────────┴────────────────┘

Why replace_strict() is preferred
  • Performance: It runs natively in Rust, avoiding Python's interpreter overhead.
  • Lazy compatibility: It works seamlessly with Polars' lazy evaluation engine for optimized query execution.
  • Explicit error handling: replace_strict() raises an error if a value has no mapping, preventing silent bugs.

Handling Missing Keys With a Default Value

If your column contains values not present in the dictionary, replace_strict() will raise an error by default. Use the default parameter to handle unmapped values:

import polars as pl

df = pl.DataFrame({
    "role": ["Manager", "Developer", "Intern"]  # "Intern" not in dictionary
})

role_to_department = {
    "Manager": "Administration",
    "Developer": "Engineering",
}

# Provide a default for unmapped values
df = df.with_columns(
    pl.col("role")
    .replace_strict(role_to_department, default="Unknown")
    .alias("department")
)

print(df)

Output:

shape: (3, 2)
┌───────────┬────────────────┐
│ role      ┆ department     │
│ ---       ┆ ---            │
│ str       ┆ str            │
╞═══════════╪════════════════╡
│ Manager   ┆ Administration │
│ Developer ┆ Engineering    │
│ Intern    ┆ Unknown        │
└───────────┴────────────────┘

Method 2: Using replace() With Flexible Matching

The replace() method (without strict) leaves unmapped values unchanged instead of raising an error:

import polars as pl

df = pl.DataFrame({
    "role": ["Manager", "Developer", "Intern"]
})

role_to_department = {
    "Manager": "Administration",
    "Developer": "Engineering",
}

df = df.with_columns(
    pl.col("role")
    .replace(role_to_department)
    .alias("department")
)

print(df)

Output:

shape: (3, 2)
┌───────────┬────────────────┐
│ role      ┆ department     │
│ ---       ┆ ---            │
│ str       ┆ str            │
╞═══════════╪════════════════╡
│ Manager   ┆ Administration │
│ Developer ┆ Engineering    │
│ Intern    ┆ Intern         │
└───────────┴────────────────┘

Notice that "Intern" remains unchanged because it has no mapping. This is useful when you only want to remap a subset of values.

Method 3: Using map_elements() With a Lambda

For complex mapping logic that goes beyond simple dictionary lookups, use map_elements() (formerly apply()):

import polars as pl

df = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "David"],
    "role": ["Manager", "Developer", "Designer", "Developer"]
})

role_to_department = {
    "Manager": "Administration",
    "Developer": "Engineering",
    "Designer": "Creative"
}

# Map using map_elements with a lambda
df = df.with_columns(
    pl.col("role")
    .map_elements(
        lambda role: role_to_department.get(role, "Unknown"),
        return_dtype=pl.Utf8,
    )
    .alias("department")
)

print(df)

Output:

shape: (4, 3)
┌─────────┬───────────┬────────────────┐
│ name    ┆ role      ┆ department     │
│ ---     ┆ ---       ┆ ---            │
│ str     ┆ str       ┆ str            │
╞═════════╪═══════════╪════════════════╡
│ Alice   ┆ Manager   ┆ Administration │
│ Bob     ┆ Developer ┆ Engineering    │
│ Charlie ┆ Designer  ┆ Creative       │
│ David   ┆ Developer ┆ Engineering    │
└─────────┴───────────┴────────────────┘

Performance considerations with map_elements()

map_elements() executes a Python function for every element in the Series, which bypasses Polars' optimized Rust engine. This makes it significantly slower than replace() or replace_strict(), especially on large datasets:

# ❌ Slow: Python lambda called per element
pl.col("role").map_elements(lambda x: mapping.get(x), return_dtype=pl.Utf8)

# ✅ Fast: native Polars operation
pl.col("role").replace_strict(mapping, default="Unknown")

Use map_elements() only when your logic is too complex for replace().

Method 4: Using a Join for Large Mappings

When your mapping dictionary is very large, converting it to a DataFrame and performing a join can be more efficient and memory-friendly:

import polars as pl

df = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "David"],
    "role": ["Manager", "Developer", "Designer", "Developer"]
})

role_to_department = {
    "Manager": "Administration",
    "Developer": "Engineering",
    "Designer": "Creative"
}

# Convert the dictionary to a two-column mapping DataFrame
mapping_df = pl.DataFrame({
    "role": list(role_to_department.keys()),
    "department": list(role_to_department.values())
})

# Join the mapping DataFrame
result = df.join(mapping_df, on="role", how="left")

print(result)

Output:

shape: (4, 3)
┌─────────┬───────────┬────────────────┐
│ name    ┆ role      ┆ department     │
│ ---     ┆ ---       ┆ ---            │
│ str     ┆ str       ┆ str            │
╞═════════╪═══════════╪════════════════╡
│ Alice   ┆ Manager   ┆ Administration │
│ Bob     ┆ Developer ┆ Engineering    │
│ Charlie ┆ Designer  ┆ Creative       │
│ David   ┆ Developer ┆ Engineering    │
└─────────┴───────────┴────────────────┘

Using how="left" ensures that rows with unmapped values still appear in the result (with null in the department column).

Extracting the Mapped Column as a Standalone Series

If you need just the mapped Series (not added to the DataFrame):

import polars as pl

roles = pl.Series("role", ["Manager", "Developer", "Designer", "Developer"])

role_to_department = {
    "Manager": "Administration",
    "Developer": "Engineering",
    "Designer": "Creative"
}

department_series = roles.replace_strict(role_to_department).alias("department")

print(department_series)

Output:

shape: (4,)
Series: 'department' [str]
[
    "Administration"
    "Engineering"
    "Creative"
    "Engineering"
]

Comparison of Methods

| Method | Performance | Missing Key Handling | Best For |
|---|---|---|---|
| replace_strict() | Fastest (native Rust) | Error by default, default param available | Most mapping tasks |
| replace() | Fastest (native Rust) | Leaves unmapped values unchanged | Partial remapping |
| map_elements() + lambda | Slow (Python per-element) | Use dict.get(key, default) | Complex custom logic |
| Join with mapping DataFrame | Fast for large mappings | null for unmatched (left join) | Very large lookup tables |

Conclusion

Mapping a Python dictionary to a Polars Series is best accomplished using Polars' native replace_strict() or replace() expressions, which run in Rust and are significantly faster than Python-based alternatives like map_elements().

  • Use replace_strict() when every value is expected to have a mapping: it raises an error on unmapped values unless you pass a default.
  • Use replace() when unmapped values should remain unchanged.
  • Use a join when working with very large lookup tables.
  • Reserve map_elements() with lambda functions only for complex transformation logic that cannot be expressed with the built-in methods.