How to Search and Replace Text in a File in Python

Searching and replacing text within files is a common task in Python, whether you're updating configuration files, processing log data, cleaning datasets, or performing batch modifications across multiple documents. Python provides several approaches, from simple string replacement to powerful regex-based pattern matching.

In this guide, you'll learn four methods to search and replace text in files, understand when to use each one, and handle edge cases like large files and pattern-based replacements.

Sample File

All examples below use a file called sample.txt with the following content:

Hello World! This is a dummy text file.
It contains some dummy data for testing.
The word dummy appears multiple times.

Method 1: Using open() and replace() (Simplest Approach)

The most straightforward method reads the entire file into memory, performs the replacement, and writes the result back:

search_text = "dummy"
replace_text = "sample"

# Read the file
with open('sample.txt', 'r') as file:
    content = file.read()

# Replace the text
content = content.replace(search_text, replace_text)

# Write the modified content back
with open('sample.txt', 'w') as file:
    file.write(content)

print("Text replaced successfully")

Output:

Text replaced successfully

File content after replacement:

Hello World! This is a sample text file.
It contains some sample data for testing.
The word sample appears multiple times.

How It Works

  1. open('sample.txt', 'r') opens the file in read mode.
  2. file.read() loads the entire file content into a string.
  3. content.replace(search_text, replace_text) replaces all occurrences.
  4. open('sample.txt', 'w') opens the file in write mode (overwrites existing content).
  5. file.write(content) saves the modified text.

Caution: This method loads the entire file into memory. For very large files (hundreds of MB or more), this can consume significant RAM. Use the fileinput method (Method 4) for large files instead.
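
A related option, which is not one of the four methods in this guide, is to stream the file line by line into a temporary file and then swap it into place. The sketch below assumes the search text never spans a line break. The helper name replace_streaming is hypothetical.

```python
import os
import tempfile


def replace_streaming(filepath, search_text, replace_text, encoding="utf-8"):
    """Replace text line by line, writing to a temporary file first.

    Hypothetical helper: a match that spans a line break is not found.
    """
    directory = os.path.dirname(os.path.abspath(filepath))
    # Create the temporary file next to the original so that os.replace()
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        # newline="" keeps the original line endings untouched
        with os.fdopen(fd, "w", encoding=encoding, newline="") as dst, \
                open(filepath, "r", encoding=encoding, newline="") as src:
            for line in src:
                dst.write(line.replace(search_text, replace_text))
        os.replace(tmp_path, filepath)  # swap the finished file into place
    except BaseException:
        os.remove(tmp_path)  # clean up the partial file on any failure
        raise
```

Only one line is held in memory at a time, and the original file is left untouched if an error occurs before the swap. One trade-off to be aware of: the file created by mkstemp() does not inherit the original file's permissions.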

Method 2: Using pathlib (Modern, Clean Syntax)

Python's built-in pathlib module (available since Python 3.4) provides an object-oriented interface for file operations. It makes the code cleaner and more readable:

from pathlib import Path

def replace_in_file(filepath, search_text, replace_text):
    file = Path(filepath)
    content = file.read_text()
    content = content.replace(search_text, replace_text)
    file.write_text(content)
    return "Text replaced successfully"

result = replace_in_file('sample.txt', 'dummy', 'sample')
print(result)

Output:

Text replaced successfully

Tip: pathlib is part of Python's standard library, so no installation is needed. The third-party pathlib2 package is a backport for older Python versions and is not required on Python 3.4 or later. Use read_text() and write_text() for simple file operations without explicit open()/close() management.
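
The replace_in_file() function above returns the same message whether or not anything matched. As a possible variation, the sketch below returns the number of replacements and rewrites the file only when there is something to change. The name replace_and_count is hypothetical.

```python
from pathlib import Path


def replace_and_count(filepath, search_text, replace_text, encoding="utf-8"):
    """Replace text in a file and return how many occurrences were replaced."""
    file = Path(filepath)
    content = file.read_text(encoding=encoding)
    # str.count() counts non-overlapping occurrences, like str.replace()
    count = content.count(search_text)
    if count:
        # Rewrite the file only when there is something to change
        file.write_text(content.replace(search_text, replace_text), encoding=encoding)
    return count
```

With the sample file from this guide, replace_and_count('sample.txt', 'dummy', 'sample') would return 3 on the first call and 0 on a second call.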

Specifying File Encoding

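When working with files that contain special characters, always specify the encoding. Without an explicit encoding argument, Python generally falls back to a platform-dependent default, so the same script can behave differently on different machines: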

from pathlib import Path

file = Path('sample.txt')
content = file.read_text(encoding='utf-8')
content = content.replace('dummy', 'sample')
file.write_text(content, encoding='utf-8')
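
To illustrate why the encoding matters, the sketch below writes a small UTF-8 file into a temporary directory, reads it back with a mismatched encoding, and then reads it correctly. The file name menu.txt and its content are made up for the demonstration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    file = Path(tmp) / "menu.txt"  # hypothetical file name
    file.write_text("café dummy", encoding="utf-8")

    # Reading with the wrong encoding silently garbles non-ASCII characters
    wrong = file.read_text(encoding="latin-1")
    # Reading with the encoding the file was written in round-trips cleanly
    right = file.read_text(encoding="utf-8")

    file.write_text(right.replace("dummy", "sample"), encoding="utf-8")
    result = file.read_text(encoding="utf-8")

# ascii() shows the exact code points regardless of the console encoding
print(ascii(wrong))   # 'caf\xc3\xa9 dummy'
print(ascii(result))  # 'caf\xe9 sample'
```

The mismatched read raises no error. It turns the single character é into two unrelated characters, and that corruption would then be written back to the file.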

Method 3: Using the re Module (Regex Pattern Matching)

The re module enables pattern-based search and replace, which is far more powerful than simple string replacement. Use this method when you need to match variations, patterns, or complex text structures.

Basic Regex Replacement

import re

search_pattern = "dummy"
replace_text = "sample"

with open('sample.txt', 'r') as file:
    content = file.read()

# Replace all occurrences matching the pattern
content = re.sub(search_pattern, replace_text, content)

with open('sample.txt', 'w') as file:
    file.write(content)

print("Text replaced successfully")
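
Because re.sub() interprets its first argument as a pattern, search text containing characters such as . or $ will not match literally. A minimal sketch of the usual remedy, re.escape(), using a made-up string rather than sample.txt:

```python
import re

content = "Price: $5.00 (approx.) and $5x00"
search_text = "$5.00"  # contains regex metacharacters: $ and .

# re.escape() makes every character in search_text match literally
pattern = re.escape(search_text)
result = re.sub(pattern, "$6.00", content)

print(result)  # Price: $6.00 (approx.) and $5x00
```

Without re.escape(), the unescaped $ would anchor to the end of the string and the pattern would match nothing in this example.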

Case-Insensitive Replacement

Replace text regardless of capitalization:

import re

with open('sample.txt', 'r') as file:
    content = file.read()

# Matches "dummy", "Dummy", "DUMMY", etc.
content = re.sub(r'dummy', 'sample', content, flags=re.IGNORECASE)

with open('sample.txt', 'w') as file:
    file.write(content)
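
The replacement above always inserts lowercase "sample", even where the original was "Dummy" or "DUMMY". If the casing should carry over, re.sub() also accepts a function as the replacement. The sketch below is one possible approach. The helper match_case is hypothetical and handles only three cases: all uppercase, capitalized, and everything else.

```python
import re


def match_case(replacement):
    """Return a callback that adapts `replacement` to each match's casing."""
    def _replace(match):
        found = match.group(0)
        if found.isupper():
            return replacement.upper()
        if found[0].isupper():
            return replacement.capitalize()
        return replacement
    return _replace


text = "dummy, Dummy and DUMMY"
result = re.sub(r"dummy", match_case("sample"), text, flags=re.IGNORECASE)

print(result)  # sample, Sample and SAMPLE
```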

Replacing Patterns (Not Just Fixed Text)

Regex shines when you need to match patterns rather than exact strings:

import re

with open('sample.txt', 'r') as file:
    content = file.read()

# Replace all email addresses with [REDACTED]
content = re.sub(r'\b[\w.+-]+@[\w-]+\.[\w.]+\b', '[REDACTED]', content)

# Replace all dates in MM/DD/YYYY format with [DATE]
content = re.sub(r'\d{2}/\d{2}/\d{4}', '[DATE]', content)

# Replace multiple spaces with a single space
content = re.sub(r' +', ' ', content)

with open('sample.txt', 'w') as file:
    file.write(content)

print("Patterns replaced successfully")
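
Patterns can also reuse parts of the matched text in the replacement through capture groups. As a small illustration on a made-up string, the sketch below rewrites MM/DD/YYYY dates as YYYY-MM-DD instead of masking them:

```python
import re

content = "Invoice dated 03/15/2024, due 04/14/2024."

# Capture month, day and year, then reorder them in the replacement:
# \1 is the month, \2 the day, \3 the year
result = re.sub(r"\b(\d{2})/(\d{2})/(\d{4})\b", r"\3-\1-\2", content)

print(result)  # Invoice dated 2024-03-15, due 2024-04-14.
```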

Using r+ Mode (Read and Write in One Open)

You can read and write in a single open() call using r+ mode:

import re

with open('sample.txt', 'r+') as file:
    content = file.read()
    content = re.sub(r'dummy', 'sample', content)
    file.seek(0)  # Move cursor to the beginning
    file.write(content)
    file.truncate()  # Remove any leftover content

print("Text replaced successfully")

Why seek(0) and truncate()?

  • file.seek(0) moves the file cursor back to the beginning before writing.
  • file.truncate() removes any remaining old content. Without it, if the replacement text is shorter than the original, leftover characters from the old content will remain at the end of the file.

Method 4: Using fileinput (Line-by-Line, In-Place Editing)

The fileinput module processes files line by line, making it memory-efficient for large files. It also supports in-place editing with automatic backup creation:

from fileinput import FileInput

search_text = "dummy"
replace_text = "sample"

with FileInput('sample.txt', inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(search_text, replace_text), end='')

print("Text replaced successfully")

How It Works

| Parameter | Description |
| --- | --- |
| inplace=True | Redirects print() output to the file itself instead of the console |
| backup='.bak' | Creates a backup copy (sample.txt.bak) before modifying the original |
| end='' | Prevents print() from adding extra newlines (lines already contain \n) |

Tip: The fileinput method is the best choice for large files because it processes one line at a time instead of loading the entire file into memory. The automatic backup feature also provides a safety net.
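
Line-by-line editing is not limited to str.replace(). The sketch below applies re.sub() to each line through fileinput.input(), the module-level function that returns a FileInput object. The helper name regex_replace_inplace is hypothetical, and the demonstration runs on a temporary file rather than sample.txt. A pattern applied this way cannot match text that spans a line break.

```python
import fileinput
import os
import re
import tempfile


def regex_replace_inplace(filepath, pattern, replacement, flags=0):
    """Apply re.sub() to each line of a file, editing it in place."""
    compiled = re.compile(pattern, flags)
    with fileinput.input(files=[filepath], inplace=True, backup=".bak") as file:
        for line in file:
            # While inplace=True is active, print() writes to the file
            print(compiled.sub(replacement, line), end="")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w") as f:
        f.write("A Dummy file.\nMore dummy data.\n")

    regex_replace_inplace(path, r"dummy", "sample", flags=re.IGNORECASE)

    with open(path) as f:
        updated = f.read()
    backup_exists = os.path.exists(path + ".bak")

print(updated, end="")
# A sample file.
# More sample data.
print(backup_exists)  # True
```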

Replacing Text in Multiple Files

To perform search and replace across multiple files, combine any method with glob:

import glob
from pathlib import Path

search_text = "dummy"
replace_text = "sample"

files = glob.glob('data/*.txt')

for filepath in files:
    file = Path(filepath)
    content = file.read_text(encoding='utf-8')

    if search_text in content:
        content = content.replace(search_text, replace_text)
        file.write_text(content, encoding='utf-8')
        print(f"Updated: {filepath}")
    else:
        print(f"No match: {filepath}")
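
The glob pattern above only looks at files directly inside data/. If subfolders should be included as well, Path.rglob() is one way to walk the whole tree. The sketch below assumes all files are UTF-8 text. The function name replace_in_tree is hypothetical.

```python
from pathlib import Path


def replace_in_tree(root, search_text, replace_text, pattern="*.txt"):
    """Replace text in every matching file under root, including subfolders.

    Returns the list of files that were changed.
    """
    changed = []
    for file in sorted(Path(root).rglob(pattern)):
        if not file.is_file():
            continue  # skip directories whose names happen to match
        content = file.read_text(encoding="utf-8")
        if search_text in content:
            file.write_text(content.replace(search_text, replace_text), encoding="utf-8")
            changed.append(file)
    return changed
```

A call such as replace_in_tree('data', 'dummy', 'sample') would return the paths that were actually modified, which can be logged or reviewed afterwards.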

Common Mistake: Not Using truncate() with r+ Mode

When using r+ mode and the replacement text is shorter than the original, leftover characters remain:

# ❌ Without truncate(): file may have leftover content
with open('sample.txt', 'r+') as file:
    content = file.read()
    content = content.replace("replacement", "new")  # "new" is shorter
    file.seek(0)
    file.write(content)
    # Missing file.truncate(): old characters remain at the end!

Fix: Always call file.truncate() after writing:

# ✅ With truncate(): file is clean
with open('sample.txt', 'r+') as file:
    content = file.read()
    content = content.replace("replacement", "new")
    file.seek(0)
    file.write(content)
    file.truncate()  # Remove any leftover content
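
To make the difference concrete, the sketch below runs both variants against a throwaway file in a temporary directory and compares the results. The rewrite() helper and the file name demo.txt are made up for the demonstration.

```python
import os
import tempfile


def rewrite(path, truncate):
    """Replace text using r+ mode, optionally calling truncate()."""
    with open(path, "r+", encoding="utf-8") as file:
        content = file.read().replace("replacement", "new")
        file.seek(0)
        file.write(content)
        if truncate:
            file.truncate()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    results = {}
    for truncate in (False, True):
        # Start each run from the same original content
        with open(path, "w", encoding="utf-8") as file:
            file.write("a replacement here")
        rewrite(path, truncate)
        with open(path, encoding="utf-8") as file:
            results[truncate] = file.read()

print(repr(results[False]))  # 'a new hereent here'
print(repr(results[True]))   # 'a new here'
```

Without truncate(), the last eight characters of the old content ("ent here") survive past the end of the new, shorter text.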

Comparison of Methods

| Method | Memory Usage | Pattern Matching | Backup Support | Best For |
| --- | --- | --- | --- | --- |
| open() + replace() | Loads entire file | ❌ (exact match only) | ❌ (manual) | Simple replacements in small files |
| pathlib | Loads entire file | ❌ (exact match only) | ❌ (manual) | Clean, modern code |
| re.sub() | Loads entire file | ✅ (regex patterns) | ❌ (manual) | Complex pattern matching |
| fileinput | Line by line | ❌ as shown (re.sub() can be applied per line) | ✅ (automatic) | Large files, batch processing |

Summary

To search and replace text in a file in Python:

  • Use open() + replace() for the simplest approach with small files and exact text matches.
  • Use pathlib for cleaner, more modern syntax without sacrificing functionality.
  • Use re.sub() when you need pattern matching: case-insensitive replacement, regex patterns, or complex text structures.
  • Use fileinput for large files that shouldn't be loaded entirely into memory, or when you want automatic backup creation.

Always handle file encoding explicitly with encoding='utf-8', use truncate() when writing with r+ mode, and consider creating backups before modifying important files.
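
For the methods without built-in backup support, a copy can be made by hand before the file is rewritten. A minimal sketch using shutil.copy2() follows. The function name replace_with_backup and the .bak suffix are illustrative choices, and an existing backup with the same name would be overwritten.

```python
import shutil
from pathlib import Path


def replace_with_backup(filepath, search_text, replace_text, suffix=".bak"):
    """Copy the file to <name><suffix>, then replace text in the original."""
    file = Path(filepath)
    backup = file.with_name(file.name + suffix)
    shutil.copy2(file, backup)  # copy2 also tries to preserve file metadata
    content = file.read_text(encoding="utf-8")
    file.write_text(content.replace(search_text, replace_text), encoding="utf-8")
    return backup
```

The function returns the backup path so the caller can restore or delete it once the result has been checked.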