Python Pandas: How to Read Space-Delimited Files

Not all data files use commas or tabs as separators. Many datasets - especially those generated by scientific instruments, log files, command-line tools, or legacy systems - use spaces to separate values. These space-delimited files can be tricky to parse because the number of spaces between columns may vary, making simple string splitting unreliable.

In this guide, you'll learn how to read space-delimited files into Pandas DataFrames using multiple methods, handle files with inconsistent spacing, and avoid common parsing issues.

What Is a Space-Delimited File?

A space-delimited file organizes data into rows and columns where spaces act as the separator between values. Each line represents one record:

Sample employees.txt:

Name Age City
Alice 25 NewYork
Bob 30 LosAngeles
Charlie 28 Chicago
Diana 35 Houston

Unlike CSV files (comma-separated) or TSV files (tab-separated), space-delimited files use one or more space characters to separate fields.

Method 1: Using `pd.read_csv()` with `sep=' '`

Despite its name, pd.read_csv() can handle any delimiter, not just commas. Set sep=' ' to specify a single space as the separator:

import pandas as pd

df = pd.read_csv('employees.txt', sep=' ')
print(df)

Output:

      Name  Age        City
0    Alice   25     NewYork
1      Bob   30  LosAngeles
2  Charlie   28     Chicago
3    Diana   35     Houston

This works only when every pair of adjacent columns is separated by exactly one space.

Method 2: Using `pd.read_table()` with `sep=' '`

The pd.read_table() function works identically to pd.read_csv() but defaults to tab separation. You can override it with sep=' ':

import pandas as pd

df = pd.read_table('employees.txt', sep=' ')
print(df)

Output:

      Name  Age        City
0    Alice   25     NewYork
1      Bob   30  LosAngeles
2  Charlie   28     Chicago
3    Diana   35     Houston

Both read_csv() and read_table() produce identical results when given the same sep parameter.
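
The equivalence can be checked directly. The following is a minimal sketch that assumes the sample data is supplied inline through io.StringIO rather than read from employees.txt, so it runs without any file on disk:

```python
import io

import pandas as pd

# Inline copy of part of the employees sample (an assumption for this sketch)
text = "Name Age City\nAlice 25 NewYork\nBob 30 LosAngeles\n"

# Parse the same text with both functions, using the same separator
df_csv = pd.read_csv(io.StringIO(text), sep=' ')
df_table = pd.read_table(io.StringIO(text), sep=' ')

# DataFrame.equals compares values, dtypes, and labels
print(df_csv.equals(df_table))
```

If the two calls ever disagree on your own file, the difference most likely comes from another argument rather than from the function chosen.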

Handling Multiple or Irregular Spaces

Real-world space-delimited files often have inconsistent spacing - some columns might be separated by one space, others by two, three, or more. This is common in files generated by print statements, fixed-width formatting, or command-line utilities.

Sample data_irregular.txt:

Name    Age   City
Alice   25    NewYork
Bob     30    LosAngeles
Charlie 28    Chicago

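Using sep=' ' (single space) on this file fails. Every extra space is treated as a separator next to an empty field, so rows end up with different field counts and the parser raises an error: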

import pandas as pd

# ❌ Single space separator fails with irregular spacing
df = pd.read_csv('data_irregular.txt', sep=' ')
print(df)

Output:

pandas.errors.ParserError: Error tokenizing data. C error: Expected 8 fields in line 3, saw 10
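
The field counts in this error message can be reproduced without pandas. The sketch below assumes the four lines of data_irregular.txt are held in a list and compares splitting on a single space with splitting on runs of whitespace:

```python
# The four lines of the irregular sample, with their original spacing
lines = [
    "Name    Age   City",
    "Alice   25    NewYork",
    "Bob     30    LosAngeles",
    "Charlie 28    Chicago",
]

# Splitting on one space yields an empty string for every extra space
single_space_counts = [len(line.split(' ')) for line in lines]

# split() with no argument treats any run of whitespace as one separator
whitespace_counts = [len(line.split()) for line in lines]

print(single_space_counts)  # [8, 8, 10, 6]
print(whitespace_counts)    # [3, 3, 3, 3]
```

The header and the first data row both produce 8 fields, and the next row produces 10, which is the mismatch the parser reports for line 3.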

The Fix: Use `sep='\s+'` (Regex for One or More Whitespace Characters)

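The regex pattern \s+ matches one or more whitespace characters (spaces, tabs, etc.), handling any amount of spacing between columns. Write it as a raw string (r'\s+') so that Python does not interpret \s as a string escape, which triggers an invalid-escape warning on recent Python versions:

import pandas as pd

df = pd.read_csv('data_irregular.txt', sep=r'\s+', engine='python')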
print(df)

Output:

      Name  Age        City
0    Alice   25     NewYork
1      Bob   30  LosAngeles
2  Charlie   28     Chicago

Tip: sep='\s+' is the recommended approach for space-delimited files because it handles both single and multiple spaces, and it works regardless of how many spaces separate the columns.

Why `engine='python'` Is Needed

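Pandas special-cases the pattern \s+: the default C engine handles it natively, so engine='python' is optional for the \s+ examples in this guide. Any other multi-character separator (for example r'\s{2,}', two or more whitespace characters) is interpreted as a regex that the C engine cannot process, and Pandas displays a warning: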

ParserWarning: Falling back to the 'python' engine because the 'c' parser does not support
regex separators; you can avoid this warning by specifying engine='python'.

Set engine='python' explicitly to suppress this warning. The Python engine is slower than the C engine, which can be noticeable on large files, but it fully supports regex separators.
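
As a sketch of a regex separator that does require the Python engine, the pattern r'\s{2,}' (two or more whitespace characters) is used below. The inline data is made up for illustration and assumes columns are always separated by at least two spaces, while values contain at most single spaces:

```python
import io

import pandas as pd

# Hypothetical sample: columns are separated by two or more spaces,
# and values such as "New York" contain only single spaces
text = (
    "Name     Age   City\n"
    "Alice    25    New York\n"
    "Bob      30    Los Angeles\n"
)

# Any regex other than \s+ needs the Python engine
df = pd.read_csv(io.StringIO(text), sep=r'\s{2,}', engine='python')
print(df)
```

This only works when the file really does keep at least two spaces between every pair of columns; a single space between two columns would merge them into one field.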

Reading Files Without a Header Row

If your space-delimited file has no header row, use header=None and optionally assign column names with names:

Sample data_no_header.txt:

Alice 25 NewYork
Bob 30 LosAngeles
Charlie 28 Chicago

import pandas as pd

df = pd.read_csv(
    'data_no_header.txt',
    sep=r'\s+',
    engine='python',
    header=None,
    names=['Name', 'Age', 'City']
)
print(df)

Output:

      Name  Age        City
0    Alice   25     NewYork
1      Bob   30  LosAngeles
2  Charlie   28     Chicago
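
If names is omitted, Pandas labels the columns with integers, and names can be assigned afterwards. A minimal sketch, assuming the headerless sample is supplied inline through io.StringIO:

```python
import io

import pandas as pd

# Inline copy of the headerless sample (an assumption for this sketch)
text = "Alice 25 NewYork\nBob 30 LosAngeles\nCharlie 28 Chicago\n"

# header=None without names: columns are labelled 0, 1, 2
df = pd.read_csv(io.StringIO(text), sep=r'\s+', header=None)
default_labels = list(df.columns)
print(default_labels)

# Column names can be assigned after loading
df.columns = ['Name', 'Age', 'City']
print(df)
```

Without header=None, the first data row (Alice) would be consumed as the header and lost from the data.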

Handling Values That Contain Spaces

A significant challenge with space-delimited files is when data values themselves contain spaces (e.g., "New York" or "Los Angeles"). The parser can't distinguish between a delimiter space and a space within a value.

Sample cities_with_spaces.txt:

Name Age City
Alice 25 New York
Bob 30 Los Angeles

import pandas as pd

# ❌ This splits "New York" into two separate columns
df = pd.read_csv('cities_with_spaces.txt', sep=r'\s+', engine='python')
print(df)

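Output (each data row has one more field than the header, so Pandas uses the first field as the index and every remaining value lands under the wrong column name):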

       Name  Age     City
Alice    25  New     York
Bob      30  Los  Angeles

Warning: Space-delimited files cannot reliably handle values containing spaces. If your data has multi-word values, consider using a different delimiter (comma, tab, pipe) or a fixed-width format.

Fix 1: Use Fixed-Width Format with `read_fwf()`

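If your file uses fixed-width columns (each column occupies a specific range of character positions), use pd.read_fwf(). The cities_with_spaces.txt sample above is not fixed-width, because its columns start at different positions on each line, so read_fwf() cannot recover its columns. The file needs to be aligned like this:

Sample cities_fixed_width.txt:

Name      Age City
Alice     25  New York
Bob       30  Los Angeles

import pandas as pd

df = pd.read_fwf('cities_fixed_width.txt')
print(df)

Output:

    Name  Age         City
0  Alice   25     New York
1    Bob   30  Los Angeles

read_fwf() infers column boundaries from character positions that are blank in every sampled row (the first 100 rows by default) and handles multi-word values correctly in many cases. Inference can still split a column when a blank position inside a value happens to line up across all sampled rows, so check the result.

Fix 2: Specify Column Widths Explicitly

For more control, define the exact character range of each column as (start, end) pairs, where end is exclusive:

import pandas as pd

# Name occupies characters 0-9, Age 10-13, City 14-29
df = pd.read_fwf(
    'cities_fixed_width.txt',
    colspecs=[(0, 10), (10, 14), (14, 30)]
)
print(df)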

Output:

    Name  Age         City
0  Alice   25     New York
1    Bob   30  Los Angeles
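
When the columns are contiguous, read_fwf() also accepts a widths list in place of colspecs. The sketch below assumes a small aligned sample supplied inline through io.StringIO, with columns that are 10, 4, and 16 characters wide:

```python
import io

import pandas as pd

# Hypothetical aligned sample: Name is 10 characters wide, Age is 4,
# and City takes the rest of the line
text = (
    "Name      Age City\n"
    "Alice     25  New York\n"
    "Bob       30  Los Angeles\n"
)

# widths lists consecutive column widths, starting at character 0
df = pd.read_fwf(io.StringIO(text), widths=[10, 4, 16])
print(df)
```

widths is convenient when a format specification lists field lengths; colspecs is the better fit when columns are separated by gaps that should be ignored.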

Skipping Comment Lines

Some space-delimited files include comment lines (starting with # or another character). Use the comment parameter to skip them:

Sample data_with_comments.txt:

# This is a comment
Name Age City
Alice 25 NewYork
# Another comment
Bob 30 Chicago

import pandas as pd

df = pd.read_csv(
    'data_with_comments.txt',
    sep=r'\s+',
    engine='python',
    comment='#'
)
print(df)

Output:

    Name  Age     City
0  Alice   25  NewYork
1    Bob   30  Chicago

Complete Example with Multiple Options

Here's a comprehensive example combining several useful parameters:

import pandas as pd

df = pd.read_csv(
    'data.txt',
    sep=r'\s+',            # Handle any amount of whitespace
    engine='python',       # Optional for r'\s+'; required for other regex separators
    header=0,              # First row is the header
    comment='#',           # Skip comment lines
    na_values=['NA', '-'], # Treat these as missing values
    dtype={'Age': int},    # Specify data types (int fails if Age has missing values)
    skiprows=[],           # Skip specific rows if needed
    encoding='utf-8'       # File encoding
)

print(df)
print(f"\nShape: {df.shape}")
print(f"Columns: {list(df.columns)}")
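
Because data.txt is only a placeholder name, here is a self-contained sketch of the same idea on made-up inline data. It assumes the default C engine (which handles r'\s+' natively) and converts Age to the nullable Int64 type after loading, since a plain int column cannot hold missing values:

```python
import io

import pandas as pd

# Hypothetical stand-in for data.txt, with a comment line and missing values
text = """\
# employee export
Name Age City
Alice 25 NewYork
Bob NA Chicago
Carol 41 -
"""

df = pd.read_csv(
    io.StringIO(text),
    sep=r'\s+',             # Any run of whitespace separates fields
    comment='#',            # Skip the comment line
    na_values=['NA', '-'],  # Treat these markers as missing
)

# Count missing values per column before any conversion
missing_per_column = df.isna().sum()

# Age is loaded as float because of the missing value;
# the nullable Int64 dtype keeps it integer-typed
df['Age'] = df['Age'].astype('Int64')

print(df)
print(missing_per_column)
```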

Summary

| Method | Syntax | Best For |
| --- | --- | --- |
| Single space separator | `sep=' '` | Files with exactly one space between columns |
| Regex whitespace | `sep='\s+'` | Files with variable/multiple spaces (recommended) |
| `read_fwf()` | `pd.read_fwf(file)` | Fixed-width files, including aligned columns whose values contain spaces |

When reading space-delimited files in Pandas:

Use sep=r'\s+' as the default approach - it handles both single and multiple spaces reliably.
Set engine='python' to avoid parser warnings when using regex separators other than \s+.
Use pd.read_fwf() when your data values contain spaces and the columns are aligned at fixed positions, as standard space-delimited parsing will split those values.
Always inspect the first few lines of your file to understand the spacing pattern before choosing a parsing method.
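
One way to do that inspection is to count whitespace-separated fields per line before parsing. The helper below is a hypothetical sketch (field_count_profile is not a Pandas function), shown on the lines of the cities_with_spaces.txt sample:

```python
from collections import Counter


def field_count_profile(lines):
    """Count how many non-blank lines have each number of whitespace-separated fields."""
    return Counter(len(line.split()) for line in lines if line.strip())


# Lines of the sample whose City values contain spaces
sample = [
    "Name Age City",
    "Alice 25 New York",
    "Bob 30 Los Angeles",
]

profile = field_count_profile(sample)
print(profile)
```

More than one distinct field count suggests that some values contain spaces or that fields are missing, so sep=r'\s+' alone is unlikely to parse the file correctly.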
