How to Delete Pages from a PDF in Python

Removing pages from a PDF is a common task when cleaning up reports, extracting specific sections, or dropping pages that contain confidential content. The PyMuPDF library (imported as fitz in the examples below) can delete and reselect pages directly in an open document and handles large, complex files well.

In this guide, you will learn how to delete single pages, multiple pages, ranges, blank pages, and pages matching specific content criteria. Each method is explained with clear examples so you can choose the right approach for your situation.

Installation

pip install pymupdf

Deleting a Single Page

PDF pages in PyMuPDF are 0-indexed, so the first page of the document is index 0, the second is index 1, and so on:

import fitz

doc = fitz.open("document.pdf")
print(f"Pages before: {len(doc)}")

# Delete the first page (index 0)
doc.delete_page(0)

# Delete the last page using negative indexing
doc.delete_page(-1)

print(f"Pages after: {len(doc)}")

doc.save("document_modified.pdf")
doc.close()

Example output:

Pages before: 10
Pages after: 8

Deleting Multiple Specific Pages

To remove several non-consecutive pages with delete_page(), delete them in reverse order. Each deletion shifts every later page index down by one, so working from the highest index first keeps the remaining indices pointing at the pages you intended:

import fitz

doc = fitz.open("report.pdf")

# Remove pages 2, 5, and 8 in human numbering (indices 1, 4, 7)
pages_to_delete = [1, 4, 7]

for page_num in sorted(pages_to_delete, reverse=True):
    doc.delete_page(page_num)

doc.save("report_cleaned.pdf")
doc.close()

A Common Mistake: Deleting in Forward Order

Deleting pages from lowest index to highest causes incorrect results because each deletion shifts the remaining indices:

import fitz

doc = fitz.open("report.pdf")

# Wrong: deleting in forward order
pages_to_delete = [1, 4, 7]

for page_num in pages_to_delete:
    doc.delete_page(page_num)  # After the first deletion, the old index 4 is now index 3!

After deleting index 1, what was page 5 (index 4) is now at index 3, so the loop's next call, delete_page(4), removes the original page 6 instead. Always sort in reverse order:

for page_num in sorted(pages_to_delete, reverse=True):
    doc.delete_page(page_num)

Warning

When deleting multiple pages, always process in reverse order (highest index first). Deleting a page shifts all subsequent page indices down by one, which causes the loop to target the wrong pages if processed in forward order.
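The index shift is easy to see without a PDF at all. The sketch below uses a plain Python list as a stand-in for the document (the page labels and the delete_indices helper are hypothetical, for illustration only) and deletes indices 1, 4, and 7 in both orders:

```python
def delete_indices(pages, indices):
    """Delete each index, in the given order, from a copy of pages.

    This mimics calling delete_page() repeatedly on a document.
    """
    remaining = list(pages)
    for index in indices:
        del remaining[index]
    return remaining


# Stand-in for a 10-page document, labeled with human page numbers
pages = list(range(1, 11))
to_delete = [1, 4, 7]

forward = delete_indices(pages, to_delete)
backward = delete_indices(pages, sorted(to_delete, reverse=True))

print(forward)   # [1, 3, 4, 5, 7, 8, 9]  -> pages 2, 6, and 10 were removed
print(backward)  # [1, 3, 4, 6, 7, 9, 10] -> pages 2, 5, and 8 were removed
```

Only the reverse-order run removes the intended pages 2, 5, and 8.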

Deleting a Range of Pages

To remove a continuous block of pages, use delete_pages() with start and end indices:

import fitz

doc = fitz.open("book.pdf")
print(f"Pages before: {len(doc)}")

# Delete pages 5 through 10 in human numbering (indices 4 through 9)
doc.delete_pages(from_page=4, to_page=9)

print(f"Pages after: {len(doc)}")

doc.save("book_shortened.pdf")
doc.close()

Example output:

Pages before: 50
Pages after: 44
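Because both ends of the range are included, the number of pages removed is to_page - from_page + 1. When the range comes from a person rather than from code, it usually arrives in 1-based numbering. The helper below is a hypothetical convenience (human_range_to_indices is not part of PyMuPDF) that converts such a range and checks it against the page count:

```python
def human_range_to_indices(first_page, last_page, page_count):
    """Convert an inclusive 1-based page range to inclusive 0-based indices."""
    if not 1 <= first_page <= last_page <= page_count:
        raise ValueError(
            f"invalid range {first_page}-{last_page} for {page_count} pages"
        )
    return first_page - 1, last_page - 1


from_page, to_page = human_range_to_indices(5, 10, page_count=50)
removed = to_page - from_page + 1  # both ends are included

print(from_page, to_page)  # 4 9
print(50 - removed)        # 44
```

The resulting pair could then be passed on as the from_page and to_page arguments shown above.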

Keeping Only Specific Pages

When you want to extract a small number of pages from a large document, it is easier to specify which pages to keep rather than which to delete. The select() method does exactly this:

import fitz

doc = fitz.open("presentation.pdf")

# Keep only the first page, third page, and last page (0-indexed)
pages_to_keep = [0, 2, len(doc) - 1]

doc.select(pages_to_keep)

doc.save("highlights.pdf")
doc.close()

This is also useful for extracting every Nth page:

import fitz

doc = fitz.open("document.pdf")

# Keep every other page (odd pages in human numbering)
odd_pages = list(range(0, len(doc), 2))
doc.select(odd_pages)

doc.save("odd_pages_only.pdf")
doc.close()

Tip

Use doc.select() when you want to keep a small subset of a large document. Use delete_page() when you are removing just a few pages from a document you are mostly keeping.
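The two approaches are complements of each other: any list of pages to delete can be turned into a list of pages to keep. A minimal sketch, assuming a known page count (the pages_to_keep helper is hypothetical and uses no PDF library):

```python
def pages_to_keep(page_count, pages_to_delete):
    """Return the 0-based indices that remain after removing pages_to_delete."""
    unwanted = set(pages_to_delete)
    return [i for i in range(page_count) if i not in unwanted]


keep = pages_to_keep(10, [1, 4, 7])
print(keep)  # [0, 2, 3, 5, 6, 8, 9]
```

Passing a list like this to doc.select() should give the same result as deleting indices 1, 4, and 7, without having to think about deletion order.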

Removing Blank Pages

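Documents often contain blank separator pages. You can detect and remove them by checking each page for text and embedded images. Keep in mind that a sheet scanned while blank still contains a full-page image, so this check only catches pages that are empty in the PDF itself: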

import fitz

def is_blank_page(page, text_threshold=100):
    """Check if a page is essentially blank."""
    text = page.get_text().strip()

    if len(text) > text_threshold:
        return False

    images = page.get_images()
    if images:
        return False

    return True

doc = fitz.open("scanned_document.pdf")

blank_pages = []
for i, page in enumerate(doc):
    if is_blank_page(page):
        blank_pages.append(i)

print(f"Found {len(blank_pages)} blank page(s): {blank_pages}")

# Delete blank pages in reverse order
for page_num in reversed(blank_pages):
    doc.delete_page(page_num)

doc.save("document_no_blanks.pdf")
doc.close()

Example output:

Found 3 blank page(s): [2, 5, 8]
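A page that was scanned as a blank sheet still carries a full-page image, so the get_images() check above reports it as non-blank. One possible workaround is to render the page and measure how much of it is actually dark. The sketch below is an illustration under assumptions, not part of the method above: is_mostly_white and its thresholds are hypothetical, and it works on raw 8-bit grayscale bytes, such as the samples of a pixmap rendered with page.get_pixmap(colorspace=fitz.csGRAY). It is demonstrated on synthetic data so it runs without a PDF:

```python
def is_mostly_white(samples: bytes, white_level: int = 245,
                    max_dark_fraction: float = 0.005) -> bool:
    """Return True if almost every 8-bit grayscale pixel is at or above white_level."""
    if not samples:
        return True  # nothing rendered, treat as blank
    dark = sum(1 for value in samples if value < white_level)
    return dark / len(samples) <= max_dark_fraction


# Synthetic 100x100 "pages": one pure white, one with 10% dark pixels
white_page = bytes([255]) * 10_000
inked_page = bytes([255]) * 9_000 + bytes([30]) * 1_000

print(is_mostly_white(white_page))  # True
print(is_mostly_white(inked_page))  # False
```

Real scans contain noise, specks, and shadows at the edges, so both thresholds would need tuning against sample pages before relying on a check like this.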

Conditional Page Deletion Based on Content

You can remove pages that contain specific text, such as draft watermarks or confidential markers:

import fitz

doc = fitz.open("report.pdf")

pages_to_delete = []

for i, page in enumerate(doc):
    text = page.get_text().lower()

    if "confidential" in text or "draft" in text:
        pages_to_delete.append(i)
        print(f"Marking page {i + 1} for deletion")

for page_num in reversed(pages_to_delete):
    doc.delete_page(page_num)

print(f"\nRemoved {len(pages_to_delete)} page(s)")
doc.save("report_final.pdf")
doc.close()

Example output:

Marking page 3 for deletion
Marking page 7 for deletion

Removed 2 page(s)
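Plain substring checks such as "draft" in text also match longer words like "drafted" or "overdraft", which can mark pages for deletion by accident. If that matters for your documents, a whole-word match is one option. The following is a minimal sketch using the standard re module; the contains_marker helper is hypothetical and would replace the condition in the loop above:

```python
import re


def contains_marker(text, markers=("confidential", "draft")):
    """Return True if any marker appears in text as a whole word, ignoring case."""
    pattern = r"\b(?:" + "|".join(re.escape(m) for m in markers) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


sample = "The overdraft fee was drafted in March"

print(contains_marker("DRAFT - do not distribute"))  # True
print(contains_marker(sample))                       # False
print("draft" in sample.lower())                     # True (substring match)
```

Whether substring or whole-word matching is the right choice depends on how the markers appear in your files, so it is worth printing the marked pages and reviewing them before saving.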

Using PyPDF2 as an Alternative

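If you prefer PyPDF2, the approach is slightly different. Instead of deleting pages from an existing document, you create a new document and add only the pages you want to keep. Note that PyPDF2 has been deprecated in favor of its successor, pypdf, which provides the same PdfReader and PdfWriter classes used here:

pip install pypdf2

from PyPDF2 import PdfReader, PdfWriter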

reader = PdfReader("document.pdf")
writer = PdfWriter()

# Pages to skip (0-indexed)
pages_to_delete = {0, 4, 7}

for i, page in enumerate(reader.pages):
    if i not in pages_to_delete:
        writer.add_page(page)

with open("document_modified.pdf", "wb") as output_file:
    writer.write(output_file)

print(f"Original: {len(reader.pages)} pages")
print(f"Modified: {len(writer.pages)} pages")

Example output:

Original: 10 pages
Modified: 7 pages

Complete Reusable Utility Function

A reusable function that covers the deletion scenarios above:

import fitz

def modify_pdf_pages(
    input_path: str,
    output_path: str,
    delete_pages: list[int] | None = None,
    keep_pages: list[int] | None = None,
    delete_range: tuple[int, int] | None = None
) -> int:
    """
    Modify a PDF by deleting or keeping specific pages.

    Only one of keep_pages, delete_range, and delete_pages is applied,
    checked in that order.

    Args:
        input_path: Source PDF file path.
        output_path: Destination PDF file path.
        delete_pages: List of page indices to delete (0-indexed).
        keep_pages: List of page indices to keep (0-indexed).
        delete_range: Tuple of (start, end) page indices to delete, inclusive.

    Returns:
        Number of pages in the output document.
    """
    doc = fitz.open(input_path)
    original_count = len(doc)

    if keep_pages is not None:
        # Negative indices count from the end of the document
        resolved = [p if p >= 0 else original_count + p for p in keep_pages]
        doc.select(resolved)

    elif delete_range is not None:
        start, end = delete_range
        doc.delete_pages(from_page=start, to_page=end)

    elif delete_pages is not None:
        # Resolve negative indices against the original page count before
        # sorting, so every index refers to the unmodified document.
        # Using a set also drops duplicates.
        resolved = {p if p >= 0 else original_count + p for p in delete_pages}
        for page_num in sorted(resolved, reverse=True):
            # Skip indices that fall outside the document
            if 0 <= page_num < original_count:
                doc.delete_page(page_num)

    doc.save(output_path)
    final_count = len(doc)
    doc.close()

    print(f"Pages: {original_count} -> {final_count}")
    return final_count

# Usage examples
modify_pdf_pages("report.pdf", "report_v2.pdf", delete_pages=[0, 5, 10])
modify_pdf_pages("book.pdf", "excerpt.pdf", keep_pages=[0, 1, -1])
modify_pdf_pages("document.pdf", "trimmed.pdf", delete_range=(10, 20))

Example output:

Pages: 25 -> 22
Pages: 100 -> 3
Pages: 50 -> 39

Method Comparison

| Method | Use Case | PyMuPDF Syntax |
| --- | --- | --- |
| Single page | Remove one specific page | doc.delete_page(n) |
| Multiple pages | Remove several non-consecutive pages | Loop with delete_page() in reverse |
| Page range | Remove a continuous block | doc.delete_pages(start, end) |
| Keep specific pages | Extract a subset from a large document | doc.select([indices]) |
| Content-based | Remove pages matching criteria | Loop, check content, delete in reverse |

Conclusion

PyMuPDF provides a fast and reliable toolkit for deleting pages from PDF files.

  • Use delete_page() for removing individual pages, delete_pages() for continuous ranges, and doc.select() when it is easier to specify which pages to keep.
  • Always delete multiple pages in reverse index order to prevent index shifting bugs.
  • For content-based removal, combine get_text() with conditional logic to identify and remove pages matching specific criteria.
  • If you prefer not to use PyMuPDF, PyPDF2 offers an alternative approach by building a new document from the pages you want to keep.