How to Split a File into a List in Python

Reading a file and splitting its content into a list is one of the most common file-handling tasks in Python. Whether you need each line as a separate list element, each word as an individual item, or the file divided into smaller chunks, Python provides multiple approaches. This guide covers several methods for splitting file content into lists, from simple built-in string methods to memory-efficient generators.

Sample File

All examples use a text file called example.txt with the following content:

This is line 1
This is line 2
This is line 3

Method 1: Using splitlines()

Calling read() loads the entire file content as a single string. The string's splitlines() method then splits it at line boundaries, returning a list where each element is one line:

with open("example.txt", "r") as file:
    content = file.read()
    lines = content.splitlines()

print(lines)

Output:

['This is line 1', 'This is line 2', 'This is line 3']

Note: splitlines() splits on \n, \r\n, and \r (as well as a few less common line boundaries such as form feed), so it behaves consistently across operating systems. The returned strings do not include the line terminators.
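
To make the line-ending behavior concrete, the following sketch calls splitlines() on a hand-written string instead of a file. The sample text is invented for illustration. It also shows the optional keepends argument, which keeps each terminator in place:

```python
# A string mixing Unix (\n), Windows (\r\n), and classic Mac (\r) line endings.
mixed = "first\nsecond\r\nthird\rfourth"

lines = mixed.splitlines()
print(lines)  # ['first', 'second', 'third', 'fourth']

# keepends=True retains the terminator at the end of each line.
raw_lines = mixed.splitlines(keepends=True)
print(raw_lines)  # ['first\n', 'second\r\n', 'third\r', 'fourth']
```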

Method 2: Using readlines()

The readlines() method reads all lines at once and returns a list. However, each line retains its trailing newline character:

with open("example.txt", "r") as file:
    lines = file.readlines()

print(lines)

Output:

['This is line 1\n', 'This is line 2\n', 'This is line 3']

To remove the newline characters, combine readlines() with strip():

with open("example.txt", "r") as file:
    lines = [line.strip() for line in file.readlines()]

print(lines)

Output:

['This is line 1', 'This is line 2', 'This is line 3']
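
Tip: For this sample file, read().splitlines() and [line.strip() for line in file.readlines()] produce the same result. The splitlines() approach is cleaner because it removes line endings on its own. Unlike strip(), it leaves any other leading or trailing whitespace in each line untouched.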

Method 3: Using List Comprehension with strip()

Instead of calling readlines() explicitly, you can iterate directly over the file object inside a list comprehension. This is concise, and it reads one line at a time instead of building an intermediate list of raw lines first, although the resulting list still holds every line:

with open("example.txt", "r") as file:
    lines = [line.strip() for line in file]

print(lines)

Output:

['This is line 1', 'This is line 2', 'This is line 3']

The strip() method removes both leading and trailing whitespace, including \n. Use rstrip('\n') if you only want to remove the trailing newline while preserving leading spaces.
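
The difference between strip() and rstrip('\n') only shows up when lines carry meaningful whitespace. As a small sketch, the example below uses io.StringIO as a stand-in for an open file, with invented indented content, so it runs without touching the disk:

```python
import io

# io.StringIO stands in for an open text file; the content is made up.
sample = io.StringIO("def greet():\n    print('hi')  \n")

stripped = []
newline_only = []
for line in sample:
    stripped.append(line.strip())            # removes indentation and trailing spaces too
    newline_only.append(line.rstrip("\n"))   # removes only the trailing newline

print(stripped)      # ['def greet():', "print('hi')"]
print(newline_only)  # ['def greet():', "    print('hi')  "]
```

When indentation matters, as in source code or YAML, rstrip("\n") is usually the safer choice.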

Method 4: Splitting Each Line into Words with split()

If you need individual words rather than whole lines, use split() on each line. By default, split() separates text on any whitespace:

with open("example.txt", "r") as file:
    for line in file:
        words = line.split()
        print(words)

Output:

['This', 'is', 'line', '1']
['This', 'is', 'line', '2']
['This', 'is', 'line', '3']

Getting All Words as a Single Flat List

with open("example.txt", "r") as file:
    all_words = file.read().split()

print(all_words)

Output:

['This', 'is', 'line', '1', 'This', 'is', 'line', '2', 'This', 'is', 'line', '3']

Note: read().split() reads the entire file and splits on all whitespace (spaces, tabs, newlines), producing a single flat list of every word.
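
The whitespace handling of split() changes once a separator is passed explicitly. The short sketch below, using an invented string, contrasts the default call with split(" "):

```python
line = "alpha  beta\tgamma\n"

# No argument: any run of whitespace is a single separator, and the ends are trimmed.
default_words = line.split()
print(default_words)  # ['alpha', 'beta', 'gamma']

# Explicit " ": only single spaces separate, so empty strings, tabs, and the
# newline all survive in the result.
space_words = line.split(" ")
print(space_words)  # ['alpha', '', 'beta\tgamma\n']
```

For word-level splitting of ordinary text, the argument-free form is almost always what you want.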

Method 5: Using a Generator for Large Files

For very large files that don't fit comfortably in memory, a generator processes one line at a time without loading the entire file:

def read_lines(filepath):
    """Generator that yields stripped lines from a file."""
    with open(filepath, "r") as file:
        for line in file:
            yield line.strip()


# Use the generator
for line in read_lines("example.txt"):
    print(line)

Output:

This is line 1
This is line 2
This is line 3

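If you need the result as a list, wrap the generator in list(). Keep in mind that this reads every line into memory, so the memory advantage of the generator is lost: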

def read_lines(filepath):
    """Generator that yields stripped lines from a file."""
    with open(filepath, "r") as file:
        for line in file:
            yield line.strip()

lines = list(read_lines("example.txt"))
print(lines)

Output:

['This is line 1', 'This is line 2', 'This is line 3']

Info: Generators are ideal for files that are hundreds of megabytes or larger. They process data lazily, one line at a time, so memory usage stays low and depends on the longest line rather than on the file size. For smaller files, list comprehension is simpler and equally effective.
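
A middle ground between one line at a time and the whole file at once is to yield fixed-size batches of lines. The following is a minimal sketch of that idea. The helper name read_in_batches and its batch_size parameter are assumptions made for this example, and the sample file is recreated in a temporary directory so the code runs anywhere:

```python
import itertools
import os
import tempfile


def read_in_batches(filepath, batch_size):
    """Yield lists of up to batch_size stripped lines (hypothetical helper)."""
    with open(filepath, "r") as file:
        while True:
            # islice pulls at most batch_size lines from the file iterator.
            batch = [line.strip() for line in itertools.islice(file, batch_size)]
            if not batch:
                break
            yield batch


# Recreate the sample file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as f:
        f.write("This is line 1\nThis is line 2\nThis is line 3")

    batches = list(read_in_batches(path, 2))

print(batches)
# [['This is line 1', 'This is line 2'], ['This is line 3']]
```

Only one batch is held in memory at a time while iterating. The list() call here collects all batches purely to display them.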

Method 6: Splitting a File into Multiple Files

Sometimes you need to split a large file into smaller files rather than a list. This example divides a file into two halves:

# Read all lines
with open("example.txt", "r") as file:
    lines = file.readlines()

midpoint = len(lines) // 2

# Write the first half
with open("first_half.txt", "w") as file1:
    file1.writelines(lines[:midpoint])

# Write the second half
with open("second_half.txt", "w") as file2:
    file2.writelines(lines[midpoint:])

print(f"Total lines: {len(lines)}")
print(f"First half: {midpoint} lines")
print(f"Second half: {len(lines) - midpoint} lines")

Output:

Total lines: 3
First half: 1 lines
Second half: 2 lines

Splitting into N Roughly Equal Parts

For more flexibility, split into any number of chunks. When the line count does not divide evenly, the first chunks each receive one extra line:

def split_file(filepath, num_parts):
    """Split a file into num_parts smaller files."""
    with open(filepath, "r") as file:
        lines = file.readlines()

    chunk_size = len(lines) // num_parts
    remainder = len(lines) % num_parts

    start = 0
    for i in range(num_parts):
        # Distribute remainder lines across first chunks
        end = start + chunk_size + (1 if i < remainder else 0)

        output_file = f"part_{i + 1}.txt"
        with open(output_file, "w") as out:
            out.writelines(lines[start:end])

        print(f"Wrote {end - start} lines to {output_file}")
        start = end


split_file("example.txt", 2)

Output:

Wrote 2 lines to part_1.txt
Wrote 1 lines to part_2.txt
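
The same chunking arithmetic works on an in-memory list when you want a list of lists instead of separate files. The sketch below is an illustrative variant, not part of the function above. The name split_list and the num_parts check are assumptions added for the example:

```python
def split_list(items, num_parts):
    """Split items into num_parts consecutive chunks of near-equal size."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")

    chunk_size, remainder = divmod(len(items), num_parts)
    chunks = []
    start = 0
    for i in range(num_parts):
        # The first `remainder` chunks each take one extra item.
        end = start + chunk_size + (1 if i < remainder else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


lines = ["This is line 1", "This is line 2", "This is line 3"]

two_parts = split_list(lines, 2)
print(two_parts)
# [['This is line 1', 'This is line 2'], ['This is line 3']]

five_parts = split_list(lines, 5)
print(five_parts)
# [['This is line 1'], ['This is line 2'], ['This is line 3'], [], []]
```

The second call shows an edge case worth knowing: asking for more parts than there are lines produces empty chunks, which in the file-writing version would mean empty output files.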

Common Mistake: Forgetting to Strip Newlines

A frequent issue is using readlines() without stripping, then comparing or processing strings that contain hidden \n characters:

with open("example.txt", "r") as file:
    lines = file.readlines()

# WRONG: comparison fails because of trailing \n
if lines[0] == "This is line 1":
    print("Match!")
else:
    print(f"No match: {repr(lines[0])}")

Output:

No match: 'This is line 1\n'

Note: The string contains a trailing \n that prevents the comparison from succeeding.

The correct approach

with open("example.txt", "r") as file:
    lines = [line.strip() for line in file]

# CORRECT: stripped lines compare as expected
if lines[0] == "This is line 1":
    print("Match!")

Output:

Match!

Warning: Strip newline characters when reading file lines into a list unless you specifically need them. A hidden \n makes string comparisons and dictionary lookups fail without raising an error, so the problem often goes unnoticed.
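
The dictionary case is easy to reproduce. In the sketch below, the status_codes mapping and the key data are invented for illustration, and io.StringIO stands in for a file of lookup keys:

```python
import io

# Hypothetical lookup table used only for this example.
status_codes = {"ok": 200, "not found": 404}

# io.StringIO stands in for a file that lists one key per line.
raw_keys = io.StringIO("ok\nnot found\n").readlines()
clean_keys = [key.strip() for key in raw_keys]

raw_results = [status_codes.get(key) for key in raw_keys]
clean_results = [status_codes.get(key) for key in clean_keys]

print(raw_results)    # [None, None]
print(clean_results)  # [200, 404]
```

No exception is raised for the unstripped keys. The lookups simply miss, which is why this kind of bug can go unnoticed.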

Method Comparison

| Method | Returns | Strips Newlines | Memory Efficient | Best For |
| --- | --- | --- | --- | --- |
| read().splitlines() | List of lines | Yes | No (loads entire file) | Small to medium files |
| readlines() | List of lines | No (includes \n) | No (loads entire file) | When you need raw lines |
| [line.strip() for line in file] | List of lines | Yes | Partially (iterates lazily, stores result) | General use (recommended) |
| read().split() | List of words | Yes | No (loads entire file) | Word-level tokenization |
| Generator with yield | Iterator of lines | Configurable | Yes | Very large files |

For most use cases, list comprehension with strip() offers the best balance of simplicity, readability, and clean output. For very large files, use a generator to avoid loading everything into memory at once.