How to Delete a CSV Column in Python
Removing columns from CSV files is a common data cleaning task. Whether you are eliminating personally identifiable information, stripping out redundant fields, or discarding noisy data before analysis, Python provides several approaches depending on your environment, file size, and requirements.
In this guide, you will learn how to delete CSV columns using Pandas, the built-in csv module, and several filtering techniques. Each method is explained with clear examples and guidance on when to use it.
Consider the data.csv file as input for the following examples:
User_ID,Name,Email,Score,Temp_Col,Debug_Info
101,Alice,a@x.com,85,tmp,dbg
102,Bob,b@x.com,92,tmp,dbg
Using Pandas (Recommended)
Pandas provides the most straightforward and efficient approach for column removal:
import pandas as pd
# Read the CSV
df = pd.read_csv("data.csv")
print("Before:")
print(df.head())
# Drop a single column
df = df.drop(columns=["User_ID"])
# Drop multiple columns
df = df.drop(columns=["Temp_Col", "Debug_Info"])
# Save the result
df.to_csv("clean.csv", index=False)
print("\nAfter:")
print(df.head())
Example output:
Before:
User_ID Name Email Score Temp_Col Debug_Info
0 101 Alice a@x.com 85 tmp dbg
1 102 Bob b@x.com 92 tmp dbg
After:
Name Email Score
0 Alice a@x.com 85
1 Bob b@x.com 92
The drop() method returns a new DataFrame by default. You can also modify in place without reassignment:
df.drop(columns=["User_ID"], inplace=True)
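A common pitfall here: with inplace=True the method mutates the DataFrame and returns None, so reassigning its result discards your data. A quick sketch with a small stand-in DataFrame:

```python
import pandas as pd

# Illustrative DataFrame standing in for data.csv
df = pd.DataFrame({"User_ID": [101, 102], "Name": ["Alice", "Bob"]})

# drop() returns a new DataFrame; the original is unchanged
dropped = df.drop(columns=["User_ID"])
print("User_ID" in df.columns)   # True: original untouched

# inplace=True mutates df and returns None -- do NOT reassign the result
result = df.drop(columns=["User_ID"], inplace=True)
print(result is None)            # True
print("User_ID" in df.columns)   # False: df was modified in place
```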
Filtering Columns During Import
For better memory efficiency, exclude unwanted columns at load time rather than loading everything and dropping afterward:
import pandas as pd
# Only load specific columns
df = pd.read_csv("data.csv", usecols=["Name", "Email", "Score"])
# Or use a function to exclude specific columns
df = pd.read_csv(
    "data.csv",
    usecols=lambda col: col not in ["User_ID", "Internal_Code"]
)
print(df.columns.tolist())
Output:
['Name', 'Email', 'Score', 'Temp_Col', 'Debug_Info']
Because usecols filters while parsing, memory for the excluded columns is never allocated at all. This makes a significant difference with wide files that contain many unwanted columns.
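As a rough sketch of the equivalence between the two approaches (writing a hypothetical sample.csv on the fly; the filename is illustrative), both end with the same data, but only the usecols version avoids ever materializing the unwanted columns:

```python
import pandas as pd

# Build a sample file matching the data.csv layout used above
pd.DataFrame({
    "User_ID": range(1000),
    "Name": ["Alice"] * 1000,
    "Email": ["a@x.com"] * 1000,
    "Score": [85] * 1000,
    "Temp_Col": ["tmp"] * 1000,
    "Debug_Info": ["dbg"] * 1000,
}).to_csv("sample.csv", index=False)

# Full load, then drop: all six columns briefly live in memory
full = pd.read_csv("sample.csv").drop(columns=["Temp_Col", "Debug_Info"])

# usecols: Temp_Col and Debug_Info are skipped during parsing
slim = pd.read_csv("sample.csv", usecols=["User_ID", "Name", "Email", "Score"])

print(full.equals(slim))  # True: same final data either way
```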
Dropping Columns by Pattern
When column names follow a naming convention, you can remove them using string matching or regular expressions:
import pandas as pd
df = pd.read_csv("data.csv")
# Drop columns starting with "Unnamed"
df = df.loc[:, ~df.columns.str.startswith("Unnamed")]
# Drop columns containing "temp" (case-insensitive)
df = df.loc[:, ~df.columns.str.contains("temp", case=False)]
# Keep columns that do NOT start with "Debug_"
df = df.filter(regex=r"^(?!Debug_)")
# Drop all string columns, keeping only numeric ones
df = df.select_dtypes(exclude=["object"])
The ~ operator negates the boolean mask, so ~df.columns.str.startswith("Unnamed") selects columns that do not start with "Unnamed".
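To see what that mask looks like, here is a small sketch using hypothetical "Unnamed" columns of the kind Pandas generates for a stray index column:

```python
import pandas as pd

# Empty DataFrame with illustrative column names
df = pd.DataFrame(columns=["Name", "Unnamed: 0", "Score", "Unnamed: 5"])

# Boolean mask: True where the column name starts with "Unnamed"
mask = df.columns.str.startswith("Unnamed")
print(mask.tolist())     # [False, True, False, True]
print((~mask).tolist())  # [True, False, True, False]

# df.loc[:, ~mask] keeps only the columns where the negated mask is True
print(df.loc[:, ~mask].columns.tolist())  # ['Name', 'Score']
```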
Dropping Columns by Position
When column names are unknown, unreliable, or auto-generated, you can drop columns by their numeric index:
import pandas as pd
# Drop the first column
df = pd.read_csv("data.csv")
df = df.iloc[:, 1:]
print(f"{df}\n")
# Drop the last column
df = pd.read_csv("data.csv")
df = df.iloc[:, :-1]
print(f"{df}\n")
# Drop columns at specific positions (0-indexed)
df = pd.read_csv("data.csv")
df = df.drop(df.columns[[0, 2, 5]], axis=1)
print(f"{df}\n")
# Keep only columns at positions 1 through 3
df = pd.read_csv("data.csv")
df = df.iloc[:, 1:4]
print(f"{df}\n")
Output:
Name Email Score Temp_Col Debug_Info
0 Alice a@x.com 85 tmp dbg
1 Bob b@x.com 92 tmp dbg

User_ID Name Email Score Temp_Col
0 101 Alice a@x.com 85 tmp
1 102 Bob b@x.com 92 tmp

Name Score Temp_Col
0 Alice 85 tmp
1 Bob 92 tmp

Name Email Score
0 Alice a@x.com 85
1 Bob b@x.com 92
Handling Missing Columns Gracefully
If you try to drop a column that does not exist, Pandas raises a KeyError by default:
import pandas as pd
df = pd.DataFrame({"Name": ["Alice"], "Score": [85]})
# This raises KeyError because "User_ID" does not exist
df = df.drop(columns=["User_ID"])
Output:
KeyError: "['User_ID'] not found in axis"
To avoid this, use the errors="ignore" parameter:
import pandas as pd
df = pd.DataFrame({"Name": ["Alice"], "Score": [85]})
df = df.drop(columns=["User_ID", "Maybe_Missing"], errors="ignore")
print(df.columns.tolist())
Output:
['Name', 'Score']
Using errors="ignore" silently skips missing columns. This is convenient but can hide real issues, such as a column being renamed or the wrong file being loaded. When data integrity matters, check for column existence explicitly:
columns_to_drop = ["User_ID", "Maybe_Missing"]
existing = [col for col in columns_to_drop if col in df.columns]
df = df.drop(columns=existing)
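If you would rather fail loudly than skip silently, the same check can be wrapped into a small helper that raises a descriptive error. This is a hypothetical helper, not part of the Pandas API:

```python
import pandas as pd

def drop_strict(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Drop columns, raising a descriptive error if any are missing."""
    missing = [col for col in columns if col not in df.columns]
    if missing:
        raise ValueError(f"Cannot drop missing columns: {missing}")
    return df.drop(columns=columns)

df = pd.DataFrame({"Name": ["Alice"], "Score": [85]})
trimmed = drop_strict(df, ["Score"])
print(trimmed.columns.tolist())  # ['Name']
```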
Using the Standard Library csv Module
For environments where Pandas is not available, the built-in csv module handles column removal without any external dependencies:
import csv
columns_to_drop = {"User_ID", "Internal_Notes"}
with open("data.csv", "r", newline="") as infile:
    reader = csv.DictReader(infile)
    fieldnames = [col for col in reader.fieldnames if col not in columns_to_drop]
    with open("output.csv", "w", newline="") as outfile:
        writer = csv.DictWriter(outfile, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            filtered_row = {key: row[key] for key in fieldnames}
            writer.writerow(filtered_row)
For positional column removal when headers are absent or unreliable:
import csv
columns_to_keep = [0, 2, 3] # Keep first, third, and fourth columns
with open("data.csv", "r", newline="") as infile:
    reader = csv.reader(infile)
    with open("output.csv", "w", newline="") as outfile:
        writer = csv.writer(outfile)
        for row in reader:
            filtered_row = [row[i] for i in columns_to_keep]
            writer.writerow(filtered_row)
The csv module processes files row by row, so memory usage remains low even for very large files.
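If you want name-based removal but prefer the plain csv.reader, you can derive the positional indices from the header row yourself. A sketch using an in-memory string in place of a file:

```python
import csv
import io

columns_to_drop = {"User_ID", "Debug_Info"}

# In-memory stand-in for data.csv
raw = "User_ID,Name,Email,Score,Temp_Col,Debug_Info\n101,Alice,a@x.com,85,tmp,dbg\n"
reader = csv.reader(io.StringIO(raw))

# Read the header once and compute which indices to keep
header = next(reader)
keep = [i for i, name in enumerate(header) if name not in columns_to_drop]

out = io.StringIO()
writer = csv.writer(out)
writer.writerow([header[i] for i in keep])
for row in reader:
    writer.writerow([row[i] for i in keep])

print(out.getvalue())
```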
Processing Large Files in Chunks
For files too large to fit in memory, Pandas can process them in chunks:
import pandas as pd
columns_to_keep = ["Name", "Email", "Score"]
chunk_size = 10_000
first_chunk = True
for chunk in pd.read_csv("large_file.csv", usecols=columns_to_keep, chunksize=chunk_size):
    mode = "w" if first_chunk else "a"
    header = first_chunk
    chunk.to_csv("output.csv", mode=mode, header=header, index=False)
    first_chunk = False
print("Processing complete")
The first chunk is written with headers in write mode ("w"), and all subsequent chunks are appended without headers ("a" mode with header=False).
Complete Reusable Function
A reusable function that handles both drop and keep scenarios (note that it passes errors="ignore", so missing drop targets are skipped silently):
import pandas as pd
from typing import Optional
def remove_csv_columns(
    input_path: str,
    output_path: str,
    columns_to_drop: Optional[list[str]] = None,
    columns_to_keep: Optional[list[str]] = None,
) -> int:
    """
    Remove columns from a CSV file.

    Specify either columns_to_drop OR columns_to_keep, not both.
    Returns the number of rows processed.
    """
    if columns_to_drop and columns_to_keep:
        raise ValueError("Specify either columns_to_drop or columns_to_keep, not both")
    if columns_to_keep:
        df = pd.read_csv(input_path, usecols=columns_to_keep)
    else:
        df = pd.read_csv(input_path)
    if columns_to_drop:
        df = df.drop(columns=columns_to_drop, errors="ignore")
    df.to_csv(output_path, index=False)
    return len(df)
# Usage
rows = remove_csv_columns(
    "data.csv",
    "clean.csv",
    columns_to_drop=["User_ID", "SSN", "Internal_Code"]
)
print(f"Processed {rows} rows")
Example output:
Processed 2 rows
Method Comparison
| Method | Speed | Memory Usage | Dependencies |
|---|---|---|---|
| pd.read_csv(usecols=...) | Fast | Low | Pandas |
| df.drop(columns=...) | Fast | Higher (full load first) | Pandas |
| Pandas chunked processing | Moderate | Low | Pandas |
| csv module | Slower | Lowest | None (stdlib) |
Conclusion
- Use usecols when you know which columns to keep upfront, as it is the most memory-efficient Pandas approach.
- Use df.drop(columns=...) when exploring data interactively or when the columns to remove are determined at runtime.
- For pattern-based removal, combine df.columns.str methods with boolean indexing to filter by naming conventions.
- Use the csv module in constrained environments without Pandas or when you need minimal memory overhead.
- For very large files, process in chunks to avoid loading the entire dataset into memory at once.