How to Delete Pages from a PDF in Python

Removing pages from a PDF is a common task when cleaning up reports, extracting specific sections, or removing confidential content. The PyMuPDF library (imported as fitz; recent releases also accept import pymupdf) deletes and selects pages directly on an open document, which keeps the code short even for large files.

In this guide, you will learn how to delete single pages, multiple pages, ranges, blank pages, and pages matching specific content criteria. Each method is explained with clear examples so you can choose the right approach for your situation.

Installation

pip install pymupdf

Deleting a Single Page

PDF pages in PyMuPDF are 0-indexed, so the first page of the document is index 0, the second is index 1, and so on:

import fitz

doc = fitz.open("document.pdf")
print(f"Pages before: {len(doc)}")

# Delete the first page (index 0)
doc.delete_page(0)

# Delete the last page using negative indexing
doc.delete_page(-1)

print(f"Pages after: {len(doc)}")

doc.save("document_modified.pdf")
doc.close()

Example output:

Pages before: 10
Pages after: 8
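
Because readers usually think in 1-based page numbers, a small conversion helper can prevent off-by-one mistakes. The sketch below is only illustrative: page_number_to_index is a hypothetical helper, not part of PyMuPDF, and it works on plain integers so it can be checked without opening a file.

```python
def page_number_to_index(page_number: int, page_count: int) -> int:
    """Convert a 1-based page number to a 0-based index.

    Hypothetical helper (not a PyMuPDF function). Negative numbers count
    from the end, so -1 means the last page.
    """
    if page_number == 0:
        raise ValueError("page numbers start at 1")
    index = page_number - 1 if page_number > 0 else page_count + page_number
    if not 0 <= index < page_count:
        raise ValueError(
            f"page {page_number} is outside a {page_count}-page document"
        )
    return index


print(page_number_to_index(1, 10))   # first page
print(page_number_to_index(-1, 10))  # last page
```

The returned index can then be passed to doc.delete_page().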

Deleting Multiple Specific Pages

To remove several non-consecutive pages with delete_page(), delete them in reverse order. Deleting a page shifts the index of every later page down by one, so working from the highest index first keeps the remaining target indices valid:

import fitz

doc = fitz.open("report.pdf")

# Remove pages 2, 5, and 8 in human numbering (indices 1, 4, 7)
pages_to_delete = [1, 4, 7]

for page_num in sorted(pages_to_delete, reverse=True):
    doc.delete_page(page_num)

doc.save("report_cleaned.pdf")
doc.close()

A Common Mistake: Deleting in Forward Order

Deleting pages from lowest index to highest causes incorrect results because each deletion shifts the remaining indices:

import fitz

doc = fitz.open("report.pdf")

# Wrong: deleting in forward order
pages_to_delete = [1, 4, 7]

for page_num in pages_to_delete:
    doc.delete_page(page_num)  # After the first deletion, the page at index 4 has moved to index 3!

After deleting index 1, what was page 5 (index 4) is now at index 3. The loop then deletes index 4, which now holds the original page 6, and the error grows with every further deletion. Always sort in reverse order:

for page_num in sorted(pages_to_delete, reverse=True):
    doc.delete_page(page_num)

Warning

When deleting multiple pages with delete_page(), always process them in reverse order (highest index first). Otherwise each deletion shifts the later indices down by one and the loop targets the wrong pages.
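
The index shift is easy to see without a PDF at all. The following sketch models the document as a plain Python list of page labels (an assumption made purely for illustration; delete_in_order is a hypothetical helper) and applies the same deletions in both orders.

```python
def delete_in_order(pages: list[str], indices: list[int]) -> list[str]:
    """Delete the given indices one at a time, in the order supplied."""
    remaining = list(pages)
    for index in indices:
        del remaining[index]  # each deletion shifts every later item down
    return remaining


# Labels use human page numbers, so "p2" sits at index 1.
pages = [f"p{n}" for n in range(1, 11)]
targets = [1, 4, 7]  # intended: remove p2, p5, and p8

forward = delete_in_order(pages, targets)
backward = delete_in_order(pages, sorted(targets, reverse=True))

print("forward:", forward)   # p2, p6, and p10 are gone (wrong)
print("reverse:", backward)  # p2, p5, and p8 are gone (intended)
```

Forward order removes the first target correctly and then drifts one page further off with each deletion, while reverse order removes exactly the intended pages.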

Deleting a Range of Pages

To remove a continuous block of pages, use delete_pages() with start and end indices. Both ends are inclusive:

import fitz

doc = fitz.open("book.pdf")
print(f"Pages before: {len(doc)}")

# Delete pages 5 through 10 in human numbering (indices 4 through 9)
doc.delete_pages(from_page=4, to_page=9)

print(f"Pages after: {len(doc)}")

doc.save("book_shortened.pdf")
doc.close()

Example output:

Pages before: 50
Pages after: 44
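
Since both ends of the range are inclusive, the number of pages removed is the difference plus one. As a rough sketch, the hypothetical helper below (range_to_indices is not a PyMuPDF function) maps a 1-based page range to the 0-based values used above and shows the arithmetic behind the example output.

```python
def range_to_indices(
    first_page: int, last_page: int, page_count: int
) -> tuple[int, int]:
    """Map an inclusive 1-based page range to inclusive 0-based indices."""
    if not 1 <= first_page <= last_page <= page_count:
        raise ValueError(
            f"invalid range {first_page}-{last_page} "
            f"for a {page_count}-page document"
        )
    return first_page - 1, last_page - 1


page_count = 50
from_page, to_page = range_to_indices(5, 10, page_count)
removed = to_page - from_page + 1  # inclusive range, hence the +1

print(from_page, to_page)    # 4 9
print(page_count - removed)  # 44
```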

Keeping Only Specific Pages

When you want to extract a small number of pages from a large document, it is easier to specify which pages to keep rather than which to delete. The select() method does exactly this:

import fitz

doc = fitz.open("presentation.pdf")

# Keep only the first page, third page, and last page (0-indexed)
pages_to_keep = [0, 2, len(doc) - 1]

doc.select(pages_to_keep)

doc.save("highlights.pdf")
doc.close()

This is also useful for extracting every Nth page:

import fitz

doc = fitz.open("document.pdf")

# Keep every other page (odd pages in human numbering)
odd_pages = list(range(0, len(doc), 2))
doc.select(odd_pages)

doc.save("odd_pages_only.pdf")
doc.close()

Tip

Use doc.select() when you want to keep a small subset of a large document. Use delete_page() when you are removing just a few pages from a document you are mostly keeping.
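
Deleting and selecting are two views of the same operation: the pages to keep are simply every index that is not on the delete list. The sketch below computes that complement with plain Python; pages_to_keep is a hypothetical helper name, and the resulting list is the kind of value that could be handed to doc.select().

```python
def pages_to_keep(page_count: int, pages_to_delete: list[int]) -> list[int]:
    """Return every 0-based index that is not marked for deletion."""
    unwanted = set(pages_to_delete)
    return [i for i in range(page_count) if i not in unwanted]


keep = pages_to_keep(10, [1, 4, 7])
print(keep)  # [0, 2, 3, 5, 6, 8, 9]
```

Because the keep list is built in one pass from the original indices, no index shifting is involved.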

Removing Blank Pages

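Documents often contain blank separator pages. You can detect and remove them by checking each page for text and embedded images. Note that in a purely scanned PDF every page, blank or not, is usually stored as an image, so the image check below treats those pages as non-blank. The approach works best on digitally generated PDFs, where a blank page carries neither text nor images. Vector drawings are not checked: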

import fitz

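def is_blank_page(page, text_threshold=100):
    """Return True if the page has little text and no embedded images."""
    # Pages with at most `text_threshold` characters of text count as blank,
    # which tolerates stray page numbers or OCR noise.
    text = page.get_text().strip()
    if len(text) > text_threshold:
        return False

    # Any embedded image means the page is not blank.
    if page.get_images():
        return False

    return True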

doc = fitz.open("scanned_document.pdf")

blank_pages = []
for i, page in enumerate(doc):
    if is_blank_page(page):
        blank_pages.append(i)

print(f"Found {len(blank_pages)} blank page(s): {blank_pages}")

# Delete blank pages in reverse order
for page_num in reversed(blank_pages):
    doc.delete_page(page_num)

doc.save("document_no_blanks.pdf")
doc.close()

Example output:

Found 3 blank page(s): [2, 5, 8]
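
The text check above measures the stripped string length, which still counts spaces and line breaks inside the text. If that matters for your files, one possible refinement is to count only visible characters. The sketch below operates on a plain string, such as the value returned by page.get_text(); visible_char_count and looks_blank are hypothetical names, and the threshold is an assumption to tune per document.

```python
def visible_char_count(text: str) -> int:
    """Count characters that are not whitespace."""
    return sum(1 for ch in text if not ch.isspace())


def looks_blank(text: str, max_visible_chars: int = 0) -> bool:
    """Treat text as blank when it has at most `max_visible_chars` visible characters."""
    return visible_char_count(text) <= max_visible_chars


print(looks_blank(" \n\t \n"))                    # True
print(looks_blank("  3  \n"))                     # False
print(looks_blank("  3  \n", max_visible_chars=2))  # True
```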

Conditional Page Deletion Based on Content

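You can remove pages that contain specific text, such as a "draft" label or a "confidential" marker. The check only sees text that get_text() can extract, so markers embedded as images are not detected: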

import fitz

doc = fitz.open("report.pdf")

pages_to_delete = []

for i, page in enumerate(doc):
    text = page.get_text().lower()

    if "confidential" in text or "draft" in text:
        pages_to_delete.append(i)
        print(f"Marking page {i + 1} for deletion")

for page_num in reversed(pages_to_delete):
    doc.delete_page(page_num)

print(f"\nRemoved {len(pages_to_delete)} page(s)")
doc.save("report_final.pdf")
doc.close()

Example output:

Marking page 3 for deletion
Marking page 7 for deletion

Removed 2 page(s)
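
A plain substring test such as "draft" in text also matches longer words like "overdraft". If that is too broad, a whole-word, case-insensitive match is one option. The sketch below uses only the standard re module on plain strings; contains_marker is a hypothetical helper, not part of PyMuPDF.

```python
import re


def contains_marker(text: str, markers: list[str]) -> bool:
    """Return True if any marker appears in text as a whole word."""
    if not markers:
        return False
    # Escape each marker so punctuation in it is matched literally.
    alternatives = "|".join(re.escape(marker) for marker in markers)
    pattern = rf"\b(?:{alternatives})\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


markers = ["confidential", "draft"]
print(contains_marker("DRAFT - do not distribute", markers))     # True
print(contains_marker("The overdraft fee was waived", markers))  # False
```

Inside the loop above, the condition could then become contains_marker(page.get_text(), markers).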

Using PyPDF2 as an Alternative

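If you prefer PyPDF2, the approach is slightly different. Instead of deleting pages from an existing document, you create a new document and add only the pages you want to keep. Note that PyPDF2 is no longer developed under that name; the project continues as pypdf, which provides the same PdfReader and PdfWriter classes used here: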

pip install pypdf2

from PyPDF2 import PdfReader, PdfWriter

reader = PdfReader("document.pdf")
writer = PdfWriter()

# Pages to skip (0-indexed)
pages_to_delete = {0, 4, 7}

for i, page in enumerate(reader.pages):
    if i not in pages_to_delete:
        writer.add_page(page)

with open("document_modified.pdf", "wb") as output_file:
    writer.write(output_file)

print(f"Original: {len(reader.pages)} pages")
print(f"Modified: {len(writer.pages)} pages")

Example output:

Original: 10 pages
Modified: 7 pages

Complete Reusable Utility Function

A reusable function that covers the deletion scenarios above:

import fitz

def modify_pdf_pages(
    input_path: str,
    output_path: str,
    delete_pages: list[int] | None = None,
    keep_pages: list[int] | None = None,
    delete_range: tuple[int, int] | None = None
) -> int:
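    """
    Modify a PDF by deleting or keeping specific pages.

    Only one mode is applied per call, checked in this order:
    keep_pages, then delete_range, then delete_pages.

    Args:
        input_path: Source PDF file path.
        output_path: Destination PDF file path.
        delete_pages: List of page indices to delete (0-indexed;
            negative values count from the end).
        keep_pages: List of page indices to keep (0-indexed;
            negative values count from the end).
        delete_range: Tuple of (start, end) page indices to delete,
            both inclusive.

    Returns:
        Number of pages in the output document.
    """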
    doc = fitz.open(input_path)
    original_count = len(doc)

    if keep_pages is not None:
        resolved = [p if p >= 0 else len(doc) + p for p in keep_pages]
        doc.select(resolved)

    elif delete_range is not None:
        start, end = delete_range
        doc.delete_pages(from_page=start, to_page=end)

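    elif delete_pages is not None:
        # Resolve negative indices against the original page count first,
        # then drop duplicates so no page is deleted twice.
        resolved = {p if p >= 0 else original_count + p for p in delete_pages}
        for page_num in sorted(resolved, reverse=True):
            if 0 <= page_num < original_count:
                doc.delete_page(page_num)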

    doc.save(output_path)
    final_count = len(doc)
    doc.close()

    print(f"Pages: {original_count} -> {final_count}")
    return final_count

# Usage examples
modify_pdf_pages("report.pdf", "report_v2.pdf", delete_pages=[0, 5, 10])
modify_pdf_pages("book.pdf", "excerpt.pdf", keep_pages=[0, 1, -1])
modify_pdf_pages("document.pdf", "trimmed.pdf", delete_range=(10, 20))

Example output:

Pages: 25 -> 22
Pages: 100 -> 3
Pages: 50 -> 39
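
When a deletion list mixes negative and positive indices, the order of operations matters: negative values should be resolved against the original page count before sorting, otherwise a list such as [-1, -2] removes the last page and then the third-to-last one. The sketch below shows that normalization step on plain integers; resolve_indices is a hypothetical helper name.

```python
def resolve_indices(indices: list[int], page_count: int) -> list[int]:
    """Normalize indices for safe deletion.

    Negative values are resolved against page_count, duplicates and
    out-of-range values are dropped, and the result is sorted from
    highest to lowest so it can be deleted one by one.
    """
    resolved = {i if i >= 0 else page_count + i for i in indices}
    return sorted((i for i in resolved if 0 <= i < page_count), reverse=True)


print(resolve_indices([-1, -2, 0, 0, 99], 10))  # [9, 8, 0]
```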

Method Comparison

Method	Use Case	PyMuPDF Syntax
Single page	Remove one specific page	`doc.delete_page(n)`
Multiple pages	Remove several non-consecutive pages	Loop with `delete_page()` in reverse
Page range	Remove a continuous block	`doc.delete_pages(start, end)`
Keep specific pages	Extract a subset from a large document	`doc.select([indices])`
Content-based	Remove pages matching criteria	Loop, check content, delete in reverse

Conclusion

PyMuPDF provides a fast and reliable toolkit for deleting pages from PDF files.

Use delete_page() for removing individual pages, delete_pages() for continuous ranges, and doc.select() when it is easier to specify which pages to keep.
When deleting multiple pages one at a time, always work in reverse index order to prevent index-shifting bugs.
For content-based removal, combine get_text() with conditional logic to identify and remove pages matching specific criteria.
If you prefer not to use PyMuPDF, PyPDF2 offers an alternative approach by building a new document from the pages you want to keep.
