Python PyMongo: How to Insert Documents in MongoDB Using PyMongo

MongoDB's document-based structure makes it ideal for storing flexible, JSON-like data in Python applications. Whether you are building user registration systems, logging events, or migrating datasets, understanding proper document insertion is fundamental to working with MongoDB effectively.

In this guide, you will learn how to insert single documents and bulk datasets using PyMongo's modern insertion methods, handle common errors like duplicate keys, control insertion order, and build production-ready insertion functions with proper error handling.

Prerequisites and Setup

Install PyMongo if you have not already:

pip install pymongo

Establish a connection to your MongoDB instance before inserting documents:

from pymongo import MongoClient

# Connect to MongoDB (local instance)
client = MongoClient("mongodb://localhost:27017/")

# Access a database and collection
db = client["my_application"]
users_collection = db["users"]

If the database or collection does not exist yet, MongoDB creates them automatically when you insert the first document.

Inserting a Single Document

Use insert_one() to add a single document to a collection. The document is defined as a standard Python dictionary:

new_user = {
    "username": "alice_smith",
    "email": "alice@example.com",
    "age": 28,
    "active": True
}

result = users_collection.insert_one(new_user)

print(f"Document inserted with ID: {result.inserted_id}")
print(f"Acknowledged: {result.acknowledged}")

Example output:

Document inserted with ID: 665a1b2c3d4e5f6a7b8c9d0e
Acknowledged: True

The insert_one() method returns an InsertOneResult object. The inserted_id attribute contains the _id of the new document, and acknowledged is True when the write was performed with an acknowledged write concern (the default), meaning the server confirmed the operation.

Automatic ID Generation

If you do not provide an _id field, a unique ObjectId is generated for it automatically (PyMongo creates it on the client before sending the document). This 12-byte identifier ensures uniqueness across distributed systems without requiring coordination between servers.
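
The 12 bytes are laid out as a 4-byte creation timestamp (seconds since the Unix epoch), a 5-byte random value, and a 3-byte counter. As a minimal sketch using only the standard library, the timestamp can be read back from the 24-character hex form. The helper name objectid_timestamp is hypothetical and exists only for this illustration:

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Return the creation time embedded in a 24-character ObjectId hex string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    raw = bytes.fromhex(oid_hex)
    # The first 4 bytes are a big-endian count of seconds since the Unix epoch.
    seconds = int.from_bytes(raw[:4], "big")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The ID from the example output above, used here purely as sample input.
print(objectid_timestamp("665a1b2c3d4e5f6a7b8c9d0e"))
```

With PyMongo installed, the generation_time attribute of an ObjectId instance exposes the same value, so a helper like this is only needed when you have the ID as plain text.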

Bulk Insertion with insert_many()

When adding multiple documents, insert_many() is significantly more efficient than calling insert_one() in a loop. It sends the documents to the server in batches, which for small lists means a single network request instead of one request per document:

products = [
    {"name": "Laptop", "price": 999.99, "stock": 50},
    {"name": "Mouse", "price": 29.99, "stock": 200},
    {"name": "Keyboard", "price": 79.99, "stock": 150},
    {"name": "Monitor", "price": 349.99, "stock": 75}
]

result = db["products"].insert_many(products)

print(f"Inserted {len(result.inserted_ids)} documents")
print(f"IDs: {result.inserted_ids}")

Example output:

Inserted 4 documents
IDs: [ObjectId('665a...'), ObjectId('665a...'), ObjectId('665a...'), ObjectId('665a...')]
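
One detail worth knowing: PyMongo's insert methods add the generated _id to each dictionary in place, so passing the same list to insert_many() a second time raises duplicate key errors. A minimal sketch of one way around this is to insert copies without the _id key. The helper name without_ids is hypothetical, and the placeholder IDs below stand in for real ObjectId values:

```python
def without_ids(documents):
    """Return shallow copies of the documents with any "_id" key removed."""
    return [{k: v for k, v in doc.items() if k != "_id"} for doc in documents]


# Simulated state of a list after insert_many(): each dict now carries an _id.
inserted = [
    {"_id": "placeholder-1", "name": "Laptop", "price": 999.99},
    {"_id": "placeholder-2", "name": "Mouse", "price": 29.99},
]

# These copies can be inserted again and will receive fresh IDs.
fresh = without_ids(inserted)
print(fresh)
```

The originals are left untouched, which matters if other code still relies on their _id values.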

Why Not Loop with insert_one()?

A common mistake is inserting multiple documents one at a time in a loop:

# Slow: one network round-trip per document
for product in products:
    db["products"].insert_one(product)

# Fast: documents are sent together in a batch
db["products"].insert_many(products)

Each insert_one() call requires a separate network round-trip to the database server. For 1,000 documents, that means 1,000 round-trips, whereas insert_many() typically needs just one (very large batches are split into a few requests).
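
For very large datasets, for example rows streamed from a file, it can help to feed insert_many() fixed-size chunks so the full dataset never has to be held in one list. The following is a sketch under assumptions, not a PyMongo feature: the helper names batched and insert_in_batches and the default batch size of 1000 are choices made for this illustration, and collection is expected to be a PyMongo collection (or anything with a compatible insert_many() method).

```python
from itertools import islice


def batched(iterable, batch_size):
    """Yield lists of up to batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(collection, documents, batch_size=1000):
    """Insert documents chunk by chunk and return the total number inserted."""
    total = 0
    for batch in batched(documents, batch_size):
        result = collection.insert_many(batch)
        total += len(result.inserted_ids)
    return total
```

A call such as insert_in_batches(db["products"], product_generator, batch_size=500) would then issue one insert_many() per 500 documents, where product_generator is whatever iterable produces your documents.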

Controlling Insertion Order

The ordered parameter determines how MongoDB handles failures during bulk insertion:

# ordered=True (default): stops at the first error
result = db["products"].insert_many(products, ordered=True)

# ordered=False: continues inserting remaining documents after errors
result = db["products"].insert_many(products, ordered=False)

When to Use Unordered Inserts

Set ordered=False when importing data that might contain duplicates or when individual document failures should not stop the entire batch. This is particularly useful for data migrations and log ingestion where partial success is acceptable. With ordered=True, a failure on the third document leaves the first two inserted but prevents documents four through the end from being inserted.
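
To make the difference concrete, here is a toy in-memory model of the two modes. It does not talk to MongoDB and is not how the server is implemented (in particular, a real unordered insert may process documents in any order); the function simulate_insert_many is hypothetical and only mimics which documents end up inserted when a duplicate _id appears partway through a batch.

```python
def simulate_insert_many(existing_ids, documents, ordered=True):
    """Toy model of insert_many(): return (inserted_ids, failed_indexes)."""
    seen = set(existing_ids)
    inserted, failed = [], []
    for index, doc in enumerate(documents):
        if doc["_id"] in seen:
            failed.append(index)
            if ordered:
                break      # ordered mode stops at the first error
            continue       # unordered mode skips the failure and keeps going
        seen.add(doc["_id"])
        inserted.append(doc["_id"])
    return inserted, failed


# The third document (index 2) repeats an _id used earlier in the batch.
docs = [{"_id": 1}, {"_id": 2}, {"_id": 2}, {"_id": 3}]

print(simulate_insert_many(set(), docs, ordered=True))   # ([1, 2], [2])
print(simulate_insert_many(set(), docs, ordered=False))  # ([1, 2, 3], [2])
```

In ordered mode the document with _id 3 is never attempted, while in unordered mode it is inserted despite the earlier failure.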

Handling Duplicate Key Errors

Attempting to insert a document with an _id that already exists (or a value that violates any other unique index) raises a DuplicateKeyError. Handle this gracefully to prevent your application from crashing:

from pymongo.errors import DuplicateKeyError

# Single document duplicate handling
try:
    users_collection.insert_one({"_id": 1, "name": "Alice"})
    users_collection.insert_one({"_id": 1, "name": "Bob"})  # Same _id
except DuplicateKeyError:
    print("Error: Document with this _id already exists")

Output:

Error: Document with this _id already exists

For bulk insertions, catch BulkWriteError and inspect its details attribute to see how many documents succeeded and which ones failed:

from pymongo.errors import BulkWriteError

try:
    db["products"].insert_many(products, ordered=False)
except BulkWriteError as e:
    print(f"Successfully inserted: {e.details['nInserted']}")
    print(f"Errors: {len(e.details['writeErrors'])}")
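
Each entry in writeErrors carries the position of the failed document in the input list (index) and a server error code, where 11000 is the duplicate key code. As a minimal sketch, a small pure-Python helper can turn that dictionary into a summary. The helper name summarize_bulk_errors is hypothetical, and sample_details is a hand-written stand-in for e.details, not captured server output:

```python
DUPLICATE_KEY_CODE = 11000  # MongoDB's error code for duplicate key violations


def summarize_bulk_errors(details):
    """Summarize the details dictionary of a BulkWriteError."""
    write_errors = details.get("writeErrors", [])
    duplicates = [e["index"] for e in write_errors if e.get("code") == DUPLICATE_KEY_CODE]
    others = [e["index"] for e in write_errors if e.get("code") != DUPLICATE_KEY_CODE]
    return {
        "inserted": details.get("nInserted", 0),
        "duplicate_indexes": duplicates,  # positions in the original input list
        "other_error_indexes": others,
    }


# Illustrative stand-in for e.details (assumed values, trimmed to the keys used).
sample_details = {
    "nInserted": 2,
    "writeErrors": [
        {"index": 1, "code": 11000, "errmsg": "duplicate key (illustrative)"},
        {"index": 3, "code": 11000, "errmsg": "duplicate key (illustrative)"},
    ],
}

summary = summarize_bulk_errors(sample_details)
print(summary)
```

Inside an except BulkWriteError as e block you would call summarize_bulk_errors(e.details) and, for example, log or retry only the documents at the reported indexes.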

Inserting Documents with Custom IDs

While auto-generated ObjectId values are recommended for most use cases, you can specify custom _id values of almost any type (arrays are not allowed), as long as each value is unique within the collection:

# Using a custom string ID
user_with_custom_id = {
    "_id": "user_alice_001",
    "name": "Alice",
    "department": "Engineering"
}

result = users_collection.insert_one(user_with_custom_id)
print(f"Inserted with custom ID: {result.inserted_id}")

# Using an integer ID (common when migrating from SQL databases)
legacy_record = {
    "_id": 1001,
    "legacy_field": "imported_data"
}

result = users_collection.insert_one(legacy_record)
print(f"Inserted with integer ID: {result.inserted_id}")

Output:

Inserted with custom ID: user_alice_001
Inserted with integer ID: 1001
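
When migrating from a relational database, the existing primary key is a natural candidate for _id. The sketch below assumes each row arrives as a dictionary with its primary key under "id"; the helper row_to_document and the sample rows are hypothetical and only illustrate the mapping:

```python
def row_to_document(row, id_column="id"):
    """Convert a SQL-style row (a dict) into a document keyed by its primary key."""
    if id_column not in row:
        raise KeyError(f"row has no {id_column!r} column")
    # Copy every other column, then reuse the primary key as the _id.
    document = {key: value for key, value in row.items() if key != id_column}
    document["_id"] = row[id_column]
    return document


# Sample rows standing in for the result of a SQL query.
legacy_rows = [
    {"id": 1001, "name": "Alice"},
    {"id": 1002, "name": "Bob"},
]

documents = [row_to_document(row) for row in legacy_rows]
print(documents)
```

The resulting list can be passed to insert_many(). Because the IDs come from the source table, re-running the migration raises duplicate key errors instead of silently creating a second copy of each record.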

Deprecated Method

The legacy .insert() method from older PyMongo versions was deprecated and then removed in PyMongo 4.0. Always use insert_one() or insert_many() for forward compatibility and proper result handling.

Complete Example with Error Handling

Here is a production-ready example that wraps insertion logic in reusable functions with comprehensive error handling:

from pymongo import MongoClient
from pymongo.errors import DuplicateKeyError, BulkWriteError, ConnectionFailure

def insert_user(collection, user_data):
    """Safely insert a single user document."""
    try:
        result = collection.insert_one(user_data)
        return {"success": True, "id": str(result.inserted_id)}
    except DuplicateKeyError:
        return {"success": False, "error": "User already exists"}
    except ConnectionFailure:
        return {"success": False, "error": "Database connection failed"}

def bulk_insert_users(collection, users_list):
    """Insert multiple users with partial failure handling."""
    try:
        result = collection.insert_many(users_list, ordered=False)
        return {"success": True, "inserted_count": len(result.inserted_ids)}
    except BulkWriteError as e:
        return {
            "success": False,
            "inserted_count": e.details["nInserted"],
            "errors": len(e.details["writeErrors"])
        }

# Usage
client = MongoClient("mongodb://localhost:27017/")
users = client["app"]["users"]

# Single insert
result = insert_user(users, {"username": "bob", "email": "bob@example.com"})
print(result)

# Bulk insert
new_users = [
    {"username": "carol", "email": "carol@example.com"},
    {"username": "dave", "email": "dave@example.com"}
]
result = bulk_insert_users(users, new_users)
print(result)

Example output:

{'success': True, 'id': '665a1b2c3d4e5f6a7b8c9d0e'}
{'success': True, 'inserted_count': 2}

Method Comparison

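Method                     | Use Case                            | Returns                                              | Network Calls
---------------------------+-------------------------------------+------------------------------------------------------+--------------
insert_one()               | Single document                     | InsertOneResult                                      | 1
insert_many()              | Multiple documents                  | InsertManyResult                                     | Usually 1
insert_many(ordered=False) | Bulk with partial failure tolerance | InsertManyResult (raises BulkWriteError on failures) | Usually 1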

Conclusion

  • Use insert_one() for adding individual documents.
  • Use insert_many() for batch operations to minimize network round-trips.
  • Set ordered=False when partial failures are acceptable and you want the insertion to continue past errors.
  • Always wrap insertion calls in try/except blocks to handle DuplicateKeyError and BulkWriteError gracefully.

Let MongoDB auto-generate ObjectId values unless you have a specific reason to use custom IDs, such as migrating from a relational database with existing identifiers.