How to Count the Number of Pages in a PDF File in Python

Counting the number of pages in a PDF file is a common task when building document management systems, validating uploads, automating report generation, or processing large batches of PDF documents. Several third-party Python libraries can read PDFs, and PyPDF2 is one of the most popular and straightforward options.

In this guide, you'll learn how to count PDF pages using PyPDF2 and other libraries, handle common errors, and process multiple PDF files efficiently.

Installing PyPDF2

Install the library using pip:

pip install PyPDF2

Counting Pages with `len(reader.pages)`

The recommended way to count pages in modern versions of PyPDF2 is to use the pages property with len():

import PyPDF2

with open('document.pdf', 'rb') as file:
    reader = PyPDF2.PdfReader(file)
    total_pages = len(reader.pages)

print(f"Total pages: {total_pages}")

Output (assuming document.pdf has 10 pages):

Total pages: 10

Step-by-step breakdown:

1. Open the file in read binary mode ('rb'). PDF files are binary, not text.
2. Create a PdfReader object to parse the PDF structure.
3. Access reader.pages, which returns a list-like object of all pages.
4. Use len() to count the total number of pages.

Using the with Statement

Always open PDF files with the with statement to ensure the file is properly closed after processing, even if an error occurs:

import PyPDF2

# ✅ Recommended: file is automatically closed
with open('document.pdf', 'rb') as file:
    reader = PyPDF2.PdfReader(file)
    pages = len(reader.pages)

# ❌ Avoid: must remember to close manually
file = open('document.pdf', 'rb')
reader = PyPDF2.PdfReader(file)
pages = len(reader.pages)
file.close()
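To see the "even if an error occurs" guarantee in action, the following sketch raises an exception inside a with block and then checks whether the file was closed. It uses a throwaway temporary file and a simulated RuntimeError instead of a real PDF and a real parsing failure; both are stand-ins chosen for illustration.

```python
import os
import tempfile

# Create a small throwaway file so the sketch runs without a real PDF.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(b"%PDF-1.4 placeholder bytes, not a complete PDF")
    path = tmp.name

try:
    with open(path, "rb") as file:
        # Stand-in for a parser error raised while reading the PDF.
        raise RuntimeError("simulated parsing failure")
except RuntimeError:
    pass

# The with block closed the file on the way out, despite the exception.
file_closed_after_error = file.closed
print(f"Closed after error: {file_closed_after_error}")

os.remove(path)
```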

Handling Errors Gracefully

PDF files can be corrupted, password-protected, or missing. Always wrap your code in error handling:

import PyPDF2

def count_pdf_pages(filepath):
    """Count the number of pages in a PDF file."""
    try:
        with open(filepath, 'rb') as file:
            reader = PyPDF2.PdfReader(file)

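            if reader.is_encrypted:
                # decrypt() signals a wrong password with a falsy return value
                # instead of an exception, so check the result as well.
                try:
                    unlocked = reader.decrypt('')  # Try empty password
                except Exception:
                    unlocked = False
                if not unlocked:
                    return None, "PDF is encrypted and requires a password"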

            return len(reader.pages), None

    except FileNotFoundError:
        return None, f"File not found: {filepath}"
    except PyPDF2.errors.PdfReadError:
        return None, "File is not a valid PDF or is corrupted"
    except Exception as e:
        return None, f"Unexpected error: {e}"


# Usage
pages, error = count_pdf_pages('document.pdf')

if error:
    print(f"Error: {error}")
else:
    print(f"Total pages: {pages}")

Output (success):

Total pages: 10

Output (file not found):

Error: File not found: document.pdf
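When validating uploads, a cheap pre-check can reject files that are obviously not PDFs before handing them to a parser. The sketch below looks for the `%PDF-` signature near the start of the file. The helper name looks_like_pdf and the 1024-byte probe window are assumptions made for this example; a file that passes can still be corrupted, so this complements the error handling above and does not replace it.

```python
import os
import tempfile


def looks_like_pdf(filepath, probe_bytes=1024):
    """Return True if the PDF signature appears near the start of the file."""
    with open(filepath, "rb") as file:
        head = file.read(probe_bytes)
    return b"%PDF-" in head


def _write_temp(data):
    """Write bytes to a throwaway file and return its path (demo helper)."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        return tmp.name


# Demo with two throwaway files: one with a PDF signature, one without.
pdf_like_path = _write_temp(b"%PDF-1.7\n% rest of the file would follow")
text_path = _write_temp(b"just some plain text")

pdf_like = looks_like_pdf(pdf_like_path)
not_pdf = looks_like_pdf(text_path)
print(pdf_like, not_pdf)

os.remove(pdf_like_path)
os.remove(text_path)
```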

Counting Pages in Password-Protected PDFs

If the PDF is encrypted, you need to decrypt it first:

import PyPDF2

with open('protected.pdf', 'rb') as file:
    reader = PyPDF2.PdfReader(file)

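    if reader.is_encrypted:
        # Provide the password; decrypt() returns a falsy value if it is wrong
        if not reader.decrypt('your_password'):
            raise ValueError("Incorrect password for protected.pdf")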

    total_pages = len(reader.pages)
    print(f"Total pages: {total_pages}")

Output:

Total pages: 15
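Because decrypt() reports a wrong password through a falsy return value, its result can drive a simple loop when a batch of files shares a small set of known passwords. The sketch below is one way to do that. unlock_with_candidates is a hypothetical helper, and FakeReader is a stand-in that mimics only the decrypt() return convention so the example runs without a protected file; in real use you would pass a PyPDF2.PdfReader instead.

```python
def unlock_with_candidates(reader, candidates):
    """Return the first password that decrypts `reader`, or None if none work."""
    for password in candidates:
        # A truthy result means the password was accepted.
        if reader.decrypt(password):
            return password
    return None


class FakeReader:
    """Stand-in for PyPDF2.PdfReader that mimics decrypt()'s return values."""

    def __init__(self, real_password):
        self._real_password = real_password

    def decrypt(self, password):
        # 0 means "not decrypted"; a non-zero value means success.
        return 1 if password == self._real_password else 0


found = unlock_with_candidates(FakeReader("s3cret"), ["", "guess", "s3cret"])
missing = unlock_with_candidates(FakeReader("s3cret"), ["", "guess"])
print(found, missing)
```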

Processing Multiple PDF Files

To count pages across multiple PDF files in a directory:

import PyPDF2
import os

def count_pages_in_directory(directory):
    """Count pages in all PDF files within a directory."""
    results = []

    for filename in sorted(os.listdir(directory)):
        if filename.lower().endswith('.pdf'):
            filepath = os.path.join(directory, filename)
            try:
                with open(filepath, 'rb') as file:
                    reader = PyPDF2.PdfReader(file)
                    pages = len(reader.pages)
                    results.append((filename, pages))
            except Exception as e:
                results.append((filename, f"Error: {e}"))

    return results


# Usage
pdf_dir = '/path/to/pdf/folder'
results = count_pages_in_directory(pdf_dir)

total = 0
for filename, pages in results:
    if isinstance(pages, int):
        print(f"  {filename:30s} {pages:>5} pages")
        total += pages
    else:
        print(f"  {filename:30s} {pages}")

print(f"\n  {'TOTAL':30s} {total:>5} pages")

Output:

  annual_report.pdf                 42 pages
  contract.pdf                      12 pages
  presentation.pdf                  25 pages
  summary.pdf                        3 pages

  TOTAL                              82 pages
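The function above only looks at one directory level. If the PDFs are spread across subfolders, a recursive variant built on pathlib is one option. In this sketch, count_pages_recursively is a hypothetical name, and the page counter is passed in as a callable so that any function returning a (pages, error) tuple, such as count_pdf_pages from the error-handling section, can be plugged in. The demo uses empty placeholder files and a stub counter purely so it runs without real PDFs.

```python
import tempfile
from pathlib import Path


def count_pages_recursively(root, count_pages):
    """Apply `count_pages(path) -> (pages, error)` to every PDF under `root`."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        # Match the extension case-insensitively, as in the flat version.
        if path.is_file() and path.suffix.lower() == ".pdf":
            key = path.relative_to(root).as_posix()
            results[key] = count_pages(str(path))
    return results


def stub_counter(filepath):
    """Stand-in for a real page counter; always reports 7 pages."""
    return 7, None


# Demo on a temporary tree containing placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "sub").mkdir()
    for rel in ("a.pdf", "sub/b.PDF", "notes.txt"):
        Path(tmp, rel).write_bytes(b"")
    tree_results = count_pages_recursively(tmp, stub_counter)

for name, (pages, error) in tree_results.items():
    print(name, pages, error)
```

With real files, the call would look like count_pages_recursively('/path/to/pdf/folder', count_pdf_pages).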

Using Alternative Libraries

Using `pikepdf`

pikepdf is a modern, actively maintained PDF library:

pip install pikepdf

import pikepdf

with pikepdf.open('document.pdf') as pdf:
    total_pages = len(pdf.pages)
    print(f"Total pages: {total_pages}")

Output:

Total pages: 10

Using `pdfplumber`

pdfplumber is great when you also need to extract text or tables:

pip install pdfplumber

import pdfplumber

with pdfplumber.open('document.pdf') as pdf:
    total_pages = len(pdf.pages)
    print(f"Total pages: {total_pages}")

Output:

Total pages: 10

Using `fitz` (PyMuPDF)

PyMuPDF is one of the fastest PDF libraries available:

pip install PyMuPDF

import fitz  # PyMuPDF

with fitz.open('document.pdf') as doc:  # closes the document automatically
    total_pages = doc.page_count
    print(f"Total pages: {total_pages}")

Output:

Total pages: 10
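If a script has to run in environments where different PDF libraries may be installed, one possible approach is to detect what is available and dispatch to it. The sketch below assumes the import names used in this guide (fitz, pikepdf, pdfplumber, PyPDF2). The BACKENDS preference order and the helper names pick_backend and count_pages are hypothetical choices for illustration.

```python
import importlib.util

# Import names used in this guide, in a hypothetical order of preference.
BACKENDS = ("fitz", "pikepdf", "pdfplumber", "PyPDF2")


def pick_backend(candidates=BACKENDS):
    """Return the first importable module name from `candidates`, or None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def count_pages(path):
    """Count pages with whichever supported library is installed."""
    backend = pick_backend()
    # Imports are deferred so that only the chosen library has to be installed.
    if backend == "fitz":
        import fitz
        with fitz.open(path) as doc:
            return doc.page_count
    if backend == "pikepdf":
        import pikepdf
        with pikepdf.open(path) as pdf:
            return len(pdf.pages)
    if backend == "pdfplumber":
        import pdfplumber
        with pdfplumber.open(path) as pdf:
            return len(pdf.pages)
    if backend == "PyPDF2":
        import PyPDF2
        with open(path, "rb") as file:
            return len(PyPDF2.PdfReader(file).pages)
    raise RuntimeError("No supported PDF library is installed")


print(f"Selected backend: {pick_backend()}")
```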

Deprecated Method: `getNumPages()`

In older versions of PyPDF2, getNumPages() was used to count pages. This method has been deprecated since version 1.28.0, and calling it raises an error as of PyPDF2 3.0.0:

# ❌ Deprecated: avoid in new code
total = reader.getNumPages()

# ✅ Use this instead
total = len(reader.pages)

If you see getNumPages() in existing code, replace it with len(reader.pages) to avoid deprecation warnings in older releases and errors in PyPDF2 3.0.0 and later.
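In a larger codebase, it can help to locate every remaining getNumPages() call before upgrading. The following is a minimal sketch of such a scan using only the standard library. find_deprecated_calls is a hypothetical helper, and the regular expression is a simple textual match, so it will also flag occurrences inside comments or strings.

```python
import re
import tempfile
from pathlib import Path

# Matches ".getNumPages(" with optional whitespace before the parenthesis.
DEPRECATED_CALL = re.compile(r"\.getNumPages\s*\(")


def find_deprecated_calls(root):
    """Return (relative_path, line_number, line) for each getNumPages() call."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if DEPRECATED_CALL.search(line):
                rel = path.relative_to(root).as_posix()
                hits.append((rel, lineno, line.strip()))
    return hits


# Demo on a temporary project with one legacy and one modern file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "legacy.py").write_text(
        "total = reader.getNumPages()\n", encoding="utf-8"
    )
    Path(tmp, "modern.py").write_text(
        "total = len(reader.pages)\n", encoding="utf-8"
    )
    hits = find_deprecated_calls(tmp)

for rel, lineno, line in hits:
    print(f"{rel}:{lineno}: {line}")
```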

Quick Comparison of Libraries

| Library | Installation | Speed | Handles Encrypted | Extra Features |
| --- | --- | --- | --- | --- |
| PyPDF2 | `pip install PyPDF2` | Good | ✅ | Merge, split, extract text |
| pikepdf | `pip install pikepdf` | Fast | ✅ | Low-level PDF manipulation |
| pdfplumber | `pip install pdfplumber` | Moderate | ✅ (pass `password` to `pdfplumber.open()`) | Text/table extraction |
| PyMuPDF (fitz) | `pip install PyMuPDF` | ⚡ Fastest | ✅ | Rendering, annotations |
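The speed ratings in the table are qualitative, and actual timings depend on your files and environment. A small harness such as the one below can compare candidates on your own documents. time_page_counter is a hypothetical helper that accepts any callable taking a path and returning a page count; the stub counter in the demo stands in for a real library call so the sketch runs without a PDF.

```python
import time


def time_page_counter(count_pages, path, repeats=5):
    """Return (page_count, best_seconds) for `count_pages(path)` over several runs."""
    best = float("inf")
    pages = None
    for _ in range(repeats):
        start = time.perf_counter()
        pages = count_pages(path)
        elapsed = time.perf_counter() - start
        # Keep the fastest run to reduce noise from other processes.
        best = min(best, elapsed)
    return pages, best


def stub_counter(path):
    """Stand-in for a real page counter; always reports 10 pages."""
    return 10


pages, seconds = time_page_counter(stub_counter, "document.pdf")
print(f"{pages} pages, best of 5 runs: {seconds:.6f} s")
```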

Conclusion

Counting pages in a PDF file with Python is straightforward using PyPDF2 or alternative libraries:

Use len(reader.pages) with PyPDF2 for a simple, reliable page count. This is the modern, recommended approach.
Handle errors with try-except to gracefully manage missing, corrupted, or encrypted files.
Use reader.decrypt() for password-protected PDFs before counting pages.
For batch processing, iterate through a directory and collect results for all PDF files.
Consider PyMuPDF (fitz) for the fastest performance, or pdfplumber if you also need text extraction.

For most use cases, PyPDF2 with len(reader.pages) is the simplest and most effective solution.
