How to Get the File Name Without the Extension in Python

A common task in file processing is to extract a file's name from its full path and remove its extension (e.g., turning /path/to/document.txt into document). Python provides robust, cross-platform tools for this in its standard library, with the modern pathlib module being the recommended approach, and the classic os.path module serving as a reliable alternative.

This guide will demonstrate the best ways to achieve this, including the modern pathlib approach, the traditional os.path method, and how to handle special cases like files with multiple extensions (e.g., .tar.gz).

Method 1: Using pathlib.Path.stem (Recommended)

Introduced in Python 3.4, the pathlib module provides an object-oriented interface for filesystem paths. A Path object has a .stem attribute that returns the final path component without its last suffix. This is the cleanest approach and the recommended one.

Solution:

from pathlib import Path

file_path = "/path/to/some/document.txt"

# Create a Path object
p = Path(file_path)

# The .stem attribute gives the file name without the final extension
file_stem = p.stem

print(f"The file stem is: '{file_stem}'")
print(f"The full name is: '{p.name}'")
print(f"The extension is: '{p.suffix}'")

Output:

The file stem is: 'document'
The full name is: 'document.txt'
The extension is: '.txt'
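
A few edge cases are worth checking before relying on .stem. The sketch below is illustrative only: the paths are made up, and it uses PurePosixPath and PureWindowsPath (pathlib classes that never touch the filesystem) so that the results do not depend on the operating system running the code.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Illustrative paths only; none of these files need to exist.
hidden = PurePosixPath("/home/user/.bashrc")
no_ext = PurePosixPath("/path/to/README")
dotted = PurePosixPath("/path/to/archive.v1.zip")
windows = PureWindowsPath(r"C:\Users\me\report.docx")

# A leading dot marks a hidden file, not an extension
print(hidden.stem, repr(hidden.suffix))   # .bashrc ''

# A name without any dot is returned unchanged
print(no_ext.stem)                        # README

# Only the last suffix is removed
print(dotted.stem)                        # archive.v1

# Backslash separators are understood by the Windows flavour
print(windows.stem)                       # report
```

On a real system, Path picks the flavour that matches the current platform, so the pure classes are mainly useful when handling paths that come from a different operating system.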

Method 2: Using os.path.splitext() (Classic Method)

The os.path module is the traditional way to manipulate paths in Python. This approach involves two steps: first getting the base filename from the path, and then splitting the extension from the name.

Solution:

import os

file_path = "/path/to/some/document.txt"

# Step 1: Get the base name of the file from the path
base_name = os.path.basename(file_path)
print(f"Base name: {base_name}")

# Step 2: Split the base name into the stem and the extension
# os.path.splitext() returns a tuple: (stem, extension)
file_stem, extension = os.path.splitext(base_name)

print(f"The file stem is: '{file_stem}'")
print(f"The extension is: '{extension}'")

Output:

Base name: document.txt
The file stem is: 'document'
The extension is: '.txt'
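
The two steps can be wrapped in a small helper. The function name stem_from_path is hypothetical and chosen only for this sketch; it also shows two behaviours of os.path that are easy to overlook.

```python
import os


def stem_from_path(path):
    """Hypothetical helper: return the file name of `path` without its last extension."""
    base_name = os.path.basename(path)           # e.g. 'document.txt'
    stem, _extension = os.path.splitext(base_name)
    return stem


regular = stem_from_path("/path/to/some/document.txt")
hidden = stem_from_path("/home/user/.bashrc")
trailing = stem_from_path("/path/to/dir/")

print(repr(regular))    # 'document'
print(repr(hidden))     # '.bashrc' (a leading dot is not an extension)
print(repr(trailing))   # '' (basename of a path ending in a separator is empty)
```

If paths may end with a separator, one option is to pass them through os.path.normpath() before calling the helper.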

Method 3: Using str.split() (Manual & Brittle)

While it's possible to use string manipulation methods like .split(), this approach is not recommended for parsing file paths. It is not cross-platform (it assumes a specific path separator like /) and can fail unexpectedly with file names that contain dots.

Example of the brittle approach:

file_path = "/path/to/some/document.txt"

# This assumes a '/' separator and breaks on Windows paths that use backslashes
file_name = file_path.split('/')[-1]

# This returns the wrong stem if the filename contains extra dots
# (e.g., 'archive.v1.zip' gives 'archive' instead of 'archive.v1')
file_stem = file_name.split('.')[0]

print(f"The file stem is: '{file_stem}'")

Output:

The file stem is: 'document'

Warning: Avoid using str.split() on a raw path string. It is not robust. Use pathlib or os.path to isolate the file name, as they are designed to handle different operating systems and edge cases correctly.
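
To make the failure concrete, the sketch below runs the manual approach on a made-up Windows path whose file name contains an extra dot, and compares it with pathlib. PureWindowsPath is used so the comparison gives the same result on any operating system.

```python
from pathlib import PureWindowsPath

# Illustrative Windows-style path with a dotted file name
windows_path = r"C:\data\archive.v1.zip"

# Manual approach: no '/' is present, so nothing is split off the front
naive_name = windows_path.split("/")[-1]
naive_stem = naive_name.split(".")[0]
print(naive_stem)    # C:\data\archive

# pathlib understands the backslash separator and removes only the last suffix
robust_stem = PureWindowsPath(windows_path).stem
print(robust_stem)   # archive.v1
```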

Handling Multi-Part Extensions (e.g., .tar.gz)

A common challenge is with compressed files that have multi-part extensions, like .tar.gz or .tar.bz2. Both pathlib.Path.stem and os.path.splitext will only remove the final suffix.

Example of the problem:

from pathlib import Path
import os

archive_path = "my_archive.tar.gz"

# pathlib's .stem only removes the last suffix
pathlib_stem = Path(archive_path).stem
print(f"pathlib.Path.stem result: '{pathlib_stem}'")

# os.path.splitext also only removes the last suffix
os_path_stem = os.path.splitext(archive_path)[0]
print(f"os.path.splitext result: '{os_path_stem}'")

Output:

pathlib.Path.stem result: 'my_archive.tar'
os.path.splitext result: 'my_archive.tar'
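
pathlib also exposes a .suffixes attribute that lists every dot-separated suffix. The sketch below, using made-up file names, shows what it returns and why removing all suffixes blindly can be risky when the name contains a version number.

```python
from pathlib import PurePosixPath

archive = PurePosixPath("my_archive.tar.gz")
versioned = PurePosixPath("release-1.2.3.tar.gz")

print(archive.suffixes)     # ['.tar', '.gz']

# Parts of the version number are reported as suffixes too
print(versioned.suffixes)   # ['.2', '.3', '.tar', '.gz']
```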

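To get only the base name (my_archive), you can combine one of the robust methods with a manual string split.

Solution: get the full filename using a reliable method (.name from pathlib) and then split it on the first dot. This assumes the base name itself contains no dots: a name like release-1.2.tar.gz would be cut to release-1, and a hidden file like .bashrc would yield an empty string.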

from pathlib import Path

archive_path = "my_archive.tar.gz"

# Get the full filename safely
full_name = Path(archive_path).name

# Split the full name at the first dot and take the first part
base_name = full_name.split('.', 1)[0]

print(f"The base name is: '{base_name}'")

Output:

The base name is: 'my_archive'
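
When file names may contain dots of their own, an alternative is to strip only extensions from a known list. The following is a minimal sketch under that assumption; the helper name strip_known_extension and the KNOWN_COMPOUND_SUFFIXES tuple are hypothetical and would need to be adapted to the archive types actually in use.

```python
from pathlib import PurePath

# Assumed list of multi-part extensions; extend it as needed.
KNOWN_COMPOUND_SUFFIXES = (".tar.gz", ".tar.bz2", ".tar.xz")


def strip_known_extension(path):
    """Hypothetical helper: drop a known compound suffix, else the last suffix."""
    name = PurePath(path).name
    lowered = name.lower()
    for suffix in KNOWN_COMPOUND_SUFFIXES:
        if lowered.endswith(suffix):
            return name[: -len(suffix)]
    # Fall back to the standard single-suffix behaviour
    return PurePath(name).stem


simple = strip_known_extension("my_archive.tar.gz")
versioned = strip_known_extension("release-1.2.3.tar.gz")
plain = strip_known_extension("/path/to/some/document.txt")

print(simple)      # my_archive
print(versioned)   # release-1.2.3
print(plain)       # document
```

Compared with splitting on the first dot, this keeps version numbers intact, at the cost of maintaining the suffix list.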

Conclusion

| Method | Best For | Example |
| --- | --- | --- |
| pathlib.Path.stem | Modern Python (3.4+). Cleanest, most readable, and recommended approach. | Path(my_path).stem |
| os.path.splitext() | Legacy code or projects needing backward compatibility with Python < 3.4. | os.path.splitext(os.path.basename(my_path))[0] |
| Hybrid Approach | Handling multi-part extensions like .tar.gz. | Path(my_path).name.split('.', 1)[0] |

For new projects, prefer the pathlib module for its clear, object-oriented API. For special cases like multi-part extensions, combine .name with str.split(), keeping in mind that splitting on the first dot only works when the base name itself contains no dots.