Python Polars: How to Add a Column to a Polars DataFrame Using .with_columns()

Polars is a high-performance DataFrame library for Python that's becoming a popular alternative to pandas, especially for large datasets. One of its most frequently used methods is .with_columns(), which lets you add one or more new columns to a DataFrame, or replace existing columns that share a name with the new ones.

Unlike the in-place column assignment common in pandas (df["col"] = ...), .with_columns() returns a new DataFrame with the added columns rather than modifying the original. This makes operations more predictable and easier to chain together.

This guide covers the most common ways to add columns using .with_columns(), from simple constant values to conditional logic and custom functions.

Installation

If you haven't installed Polars yet:

pip install polars

Basic Syntax

The .with_columns() method accepts one or more expressions that define new columns:

import polars as pl

df = pl.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35]
})

# Add a new column derived from an existing one
new_df = df.with_columns(
    (pl.col("Age") + 5).alias("Age_in_5_Years")
)

print(new_df)

Output:

shape: (3, 3)
┌─────────┬─────┬────────────────┐
│ Name    ┆ Age ┆ Age_in_5_Years │
│ ---     ┆ --- ┆ ---            │
│ str     ┆ i64 ┆ i64            │
╞═════════╪═════╪════════════════╡
│ Alice   ┆ 25  ┆ 30             │
│ Bob     ┆ 30  ┆ 35             │
│ Charlie ┆ 35  ┆ 40             │
└─────────┴─────┴────────────────┘

Key elements:

  • pl.col("Age"): references the existing Age column.
  • + 5: applies an arithmetic operation.
  • .alias("Age_in_5_Years"): names the new column.

Note: The original df is not modified. .with_columns() always returns a new DataFrame, preserving the original data.

Adding a Constant Value Column

Add a column where every row has the same value using pl.lit() (literal):

import polars as pl

df = pl.DataFrame({
    "Product": ["Laptop", "Phone", "Tablet"],
    "Price": [999, 699, 449]
})

new_df = df.with_columns(
    pl.lit("USD").alias("Currency"),
    pl.lit(True).alias("In_Stock")
)

print(new_df)

Output:

shape: (3, 4)
┌─────────┬───────┬──────────┬──────────┐
│ Product ┆ Price ┆ Currency ┆ In_Stock │
│ ---     ┆ ---   ┆ ---      ┆ ---      │
│ str     ┆ i64   ┆ str      ┆ bool     │
╞═════════╪═══════╪══════════╪══════════╡
│ Laptop  ┆ 999   ┆ USD      ┆ true     │
│ Phone   ┆ 699   ┆ USD      ┆ true     │
│ Tablet  ┆ 449   ┆ USD      ┆ true     │
└─────────┴───────┴──────────┴──────────┘
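
Tip: Use pl.lit() for constant values, especially strings. .with_columns() interprets a bare string as a column name rather than a literal, so passing "USD" directly would look for a column named USD and fail if none exists.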

Creating a Column from Multiple Existing Columns

Perform operations that combine data from multiple columns:

import polars as pl

df = pl.DataFrame({
    "Product": ["Laptop", "Phone", "Tablet"],
    "Price": [999, 699, 449],
    "Quantity": [2, 5, 3]
})

new_df = df.with_columns(
    (pl.col("Price") * pl.col("Quantity")).alias("Total_Cost")
)

print(new_df)

Output:

shape: (3, 4)
┌─────────┬───────┬──────────┬────────────┐
│ Product ┆ Price ┆ Quantity ┆ Total_Cost │
│ ---     ┆ ---   ┆ ---      ┆ ---        │
│ str     ┆ i64   ┆ i64      ┆ i64        │
╞═════════╪═══════╪══════════╪════════════╡
│ Laptop  ┆ 999   ┆ 2        ┆ 1998       │
│ Phone   ┆ 699   ┆ 5        ┆ 3495       │
│ Tablet  ┆ 449   ┆ 3        ┆ 1347       │
└─────────┴───────┴──────────┴────────────┘

Conditional Column Creation

Use pl.when().then().otherwise() to create columns based on conditions, similar to SQL's CASE WHEN or NumPy's np.where():

import polars as pl

df = pl.DataFrame({
    "Product": ["Laptop", "Phone", "Tablet"],
    "Price": [999, 699, 449]
})

new_df = df.with_columns(
    pl.when(pl.col("Price") > 500)
    .then(pl.lit("Premium"))
    .otherwise(pl.lit("Budget"))
    .alias("Category")
)

print(new_df)

Output:

shape: (3, 3)
┌─────────┬───────┬──────────┐
│ Product ┆ Price ┆ Category │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ i64   ┆ str      │
╞═════════╪═══════╪══════════╡
│ Laptop  ┆ 999   ┆ Premium  │
│ Phone   ┆ 699   ┆ Premium  │
│ Tablet  ┆ 449   ┆ Budget   │
└─────────┴───────┴──────────┘

Multiple Conditions

Chain multiple .when() clauses for more complex logic:

import polars as pl

df = pl.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Score": [95, 72, 58, 83]
})

new_df = df.with_columns(
    pl.when(pl.col("Score") >= 90).then(pl.lit("A"))
    .when(pl.col("Score") >= 80).then(pl.lit("B"))
    .when(pl.col("Score") >= 70).then(pl.lit("C"))
    .otherwise(pl.lit("F"))
    .alias("Grade")
)

print(new_df)

Output:

shape: (4, 3)
┌─────────┬───────┬───────┐
│ Name    ┆ Score ┆ Grade │
│ ---     ┆ ---   ┆ ---   │
│ str     ┆ i64   ┆ str   │
╞═════════╪═══════╪═══════╡
│ Alice   ┆ 95    ┆ A     │
│ Bob     ┆ 72    ┆ C     │
│ Charlie ┆ 58    ┆ F     │
│ Diana   ┆ 83    ┆ B     │
└─────────┴───────┴───────┘

Adding Multiple Columns at Once

Pass multiple expressions to .with_columns() to add several columns in a single operation:

import polars as pl

df = pl.DataFrame({
    "Product": ["Laptop", "Phone", "Tablet"],
    "Price": [999, 699, 449]
})

new_df = df.with_columns(
    (pl.col("Price") * 0.1).alias("Tax"),
    (pl.col("Price") * 1.1).alias("Price_with_Tax"),
    (pl.col("Price") * 0.9).alias("Discounted_Price")
)

print(new_df)

Output:

shape: (3, 5)
┌─────────┬───────┬──────┬────────────────┬──────────────────┐
│ Product ┆ Price ┆ Tax  ┆ Price_with_Tax ┆ Discounted_Price │
│ ---     ┆ ---   ┆ ---  ┆ ---            ┆ ---              │
│ str     ┆ i64   ┆ f64  ┆ f64            ┆ f64              │
╞═════════╪═══════╪══════╪════════════════╪══════════════════╡
│ Laptop  ┆ 999   ┆ 99.9 ┆ 1098.9         ┆ 899.1            │
│ Phone   ┆ 699   ┆ 69.9 ┆ 768.9          ┆ 629.1            │
│ Tablet  ┆ 449   ┆ 44.9 ┆ 493.9          ┆ 404.1            │
└─────────┴───────┴──────┴────────────────┴──────────────────┘

Tip: Adding multiple columns in a single .with_columns() call is generally more efficient than chaining multiple calls. The expressions within one call are independent of each other, so Polars can evaluate them in parallel.

Using Custom Functions with .map_elements()

For complex transformations that can't be expressed with built-in expressions, use .map_elements() to apply a custom Python function:

import polars as pl

df = pl.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35]
})

def age_category(age):
    # Map a single age value to a label
    if age < 30:
        return "Young"
    elif age < 35:
        return "Middle"
    else:
        return "Senior"

new_df = df.with_columns(
    pl.col("Age").map_elements(age_category, return_dtype=pl.Utf8).alias("Category")
)

print(new_df)

Output:

shape: (3, 3)
┌─────────┬─────┬──────────┐
│ Name    ┆ Age ┆ Category │
│ ---     ┆ --- ┆ ---      │
│ str     ┆ i64 ┆ str      │
╞═════════╪═════╪══════════╡
│ Alice   ┆ 25  ┆ Young    │
│ Bob     ┆ 30  ┆ Middle   │
│ Charlie ┆ 35  ┆ Senior   │
└─────────┴─────┴──────────┘

Performance note: .map_elements() calls a Python function once per value, which is typically much slower than native Polars expressions. Whenever possible, use built-in expressions like pl.when().then().otherwise() instead: they run in compiled Rust and are often orders of magnitude faster on large columns.

# Slower: Python function
pl.col("Age").map_elements(age_category, return_dtype=pl.Utf8)

# Faster: Native Polars expressions
# (the outer parentheses let the chain span multiple lines)
(
    pl.when(pl.col("Age") < 30).then(pl.lit("Young"))
    .when(pl.col("Age") < 35).then(pl.lit("Middle"))
    .otherwise(pl.lit("Senior"))
)

Using String and Date Operations

Polars provides rich expression APIs for strings, dates, and other types:

import polars as pl

df = pl.DataFrame({
    "Name": ["Alice Smith", "Bob Jones", "Charlie Brown"],
    "Email": ["alice@test.com", "bob@test.com", "charlie@test.com"]
})

new_df = df.with_columns(
    pl.col("Name").str.to_uppercase().alias("Name_Upper"),
    pl.col("Name").str.split(" ").list.first().alias("First_Name"),
    pl.col("Email").str.contains("test").alias("Is_Test_Email")
)

print(new_df)

Output:

shape: (3, 5)
┌───────────────┬──────────────────┬───────────────┬────────────┬───────────────┐
│ Name          ┆ Email            ┆ Name_Upper    ┆ First_Name ┆ Is_Test_Email │
│ ---           ┆ ---              ┆ ---           ┆ ---        ┆ ---           │
│ str           ┆ str              ┆ str           ┆ str        ┆ bool          │
╞═══════════════╪══════════════════╪═══════════════╪════════════╪═══════════════╡
│ Alice Smith   ┆ alice@test.com   ┆ ALICE SMITH   ┆ Alice      ┆ true          │
│ Bob Jones     ┆ bob@test.com     ┆ BOB JONES     ┆ Bob        ┆ true          │
│ Charlie Brown ┆ charlie@test.com ┆ CHARLIE BROWN ┆ Charlie    ┆ true          │
└───────────────┴──────────────────┴───────────────┴────────────┴───────────────┘

Quick Reference

Task                  Expression
Column from existing  (pl.col("A") + pl.col("B")).alias("C")
Constant value        pl.lit("value").alias("C")
Conditional           pl.when(cond).then(val).otherwise(val).alias("C")
Custom function       pl.col("A").map_elements(fn, return_dtype=...).alias("C")
String operations     pl.col("A").str.to_uppercase().alias("C")
Multiple columns      Pass multiple expressions separated by commas

Conclusion

The .with_columns() method is the main tool in Polars for adding and transforming DataFrame columns.

  • It supports constant values, arithmetic operations, conditional logic, string transformations, and custom functions, all while maintaining Polars' immutable design.
  • For best performance, prefer native Polars expressions over .map_elements() with Python functions.
  • Combining independent column additions into a single .with_columns() call lets Polars evaluate them together, typically in parallel.