What is the difference between Multithreading and Multiprocessing in Python?
Understanding when to use multithreading versus multiprocessing is crucial for writing efficient concurrent Python code. The Global Interpreter Lock (GIL) fundamentally shapes this decision: use threads for I/O-bound tasks and processes for CPU-bound work.
Understanding the GIL
The Global Interpreter Lock is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecode simultaneously. This design choice simplifies memory management but limits true parallelism in CPU-bound tasks.
- The GIL means only one thread executes Python code at a time
- But threads CAN run concurrently during I/O operations
The GIL only affects CPython (the standard Python implementation). Alternative implementations like Jython or IronPython don't have this limitation.
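Since the GIL caveat is implementation-specific, it can be worth checking which interpreter your code is actually running on — a small sketch:

```python
import platform
import sys

# Identify the Python implementation; the GIL discussion above
# applies to CPython specifically.
impl = platform.python_implementation()
print(f"Implementation: {impl}")
print(f"Version: {sys.version_info.major}.{sys.version_info.minor}")
```
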
Quick Comparison
| Feature | Multithreading | Multiprocessing |
|---|---|---|
| GIL Impact | Limited by GIL | Bypasses GIL completely |
| Memory | Shared between threads | Separate per process |
| Best For | I/O-bound (network, files) | CPU-bound (calculations) |
| Overhead | Low (lightweight) | High (process creation) |
| Communication | Direct variable access | Queues, pipes, shared memory |
| Debugging | More complex (race conditions) | Easier isolation |
Multithreading for I/O-Bound Tasks
When your code spends its time waiting on network responses, file operations, or database queries, threads excel because the GIL is released during these waiting periods.
Basic Threading Example
```python
import threading
import time

def download(url):
    thread_name = threading.current_thread().name
    print(f"[{thread_name}] Starting download: {url}")
    time.sleep(2)  # Simulates network wait
    print(f"[{thread_name}] Completed: {url}")

# Create threads
t1 = threading.Thread(target=download, args=("file1.zip",), name="Thread-1")
t2 = threading.Thread(target=download, args=("file2.zip",), name="Thread-2")

# Start both threads
t1.start()
t2.start()

# Wait for completion
t1.join()
t2.join()

print("All downloads complete")
```

Output:

```
[Thread-1] Starting download: file1.zip
[Thread-2] Starting download: file2.zip
[Thread-1] Completed: file1.zip
[Thread-2] Completed: file2.zip
All downloads complete
```
Using ThreadPoolExecutor
For managing multiple threads cleanly, use the concurrent.futures module:
```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def fetch_data(url):
    time.sleep(1)  # Simulate network delay
    return f"Data from {url}"

urls = [f"https://api.example.com/data/{i}" for i in range(5)]

with ThreadPoolExecutor(max_workers=3) as executor:
    # Submit all tasks
    future_to_url = {executor.submit(fetch_data, url): url for url in urls}
    # Process results as they complete
    for future in as_completed(future_to_url):
        url = future_to_url[future]
        result = future.result()
        print(f"{url}: {result}")
```

Output (completion order may vary between runs):

```
https://api.example.com/data/0: Data from https://api.example.com/data/0
https://api.example.com/data/1: Data from https://api.example.com/data/1
https://api.example.com/data/2: Data from https://api.example.com/data/2
https://api.example.com/data/3: Data from https://api.example.com/data/3
https://api.example.com/data/4: Data from https://api.example.com/data/4
```
ThreadPoolExecutor handles thread lifecycle automatically and limits concurrent threads to prevent resource exhaustion.
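If you want results back in submission order rather than completion order, executor.map is a convenient alternative to submit plus as_completed — a minimal sketch reusing the fetch_data helper from the example above:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch_data(url):
    time.sleep(0.1)  # Simulate network delay
    return f"Data from {url}"

urls = [f"https://api.example.com/data/{i}" for i in range(5)]

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in the order tasks were submitted,
    # regardless of which thread happens to finish first.
    results = list(executor.map(fetch_data, urls))

for url, result in zip(urls, results):
    print(f"{url}: {result}")
```

Use as_completed when you want to react to the fastest responses first; use map when downstream code depends on ordering.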
Multiprocessing for CPU-Bound Tasks
When your code performs heavy computations, multiprocessing bypasses the GIL by running separate Python interpreters, each with its own memory space.
Basic Multiprocessing Example
```python
import multiprocessing
import time

def heavy_calculation(n):
    """CPU-intensive task."""
    result = sum(i * i for i in range(n))
    print(f"Process {multiprocessing.current_process().name}: Result = {result}")
    return result

if __name__ == "__main__":  # Required guard for Windows
    start = time.time()
    p1 = multiprocessing.Process(target=heavy_calculation, args=(10**7,))
    p2 = multiprocessing.Process(target=heavy_calculation, args=(10**7,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(f"Total time: {time.time() - start:.2f}s")
```

Output:

```
Process Process-1: Result = 333333283333335000000
Process Process-2: Result = 333333283333335000000
Total time: 2.01s
```
Always wrap multiprocessing code in an if __name__ == "__main__": guard. On Windows (and anywhere the spawn start method is used), each child process re-imports the main module; without the guard, that re-import would recursively spawn new processes.
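How processes are created depends on the platform's start method, which is why the guard matters. A small sketch for inspecting it (the work function is a stand-in for illustration):

```python
import multiprocessing

def work(x):
    # Runs in a separate interpreter with its own memory space.
    return x * x

if __name__ == "__main__":
    # The default start method varies: "fork" on Linux,
    # "spawn" on Windows and on macOS since Python 3.8.
    print(multiprocessing.get_start_method())

    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(work, [1, 2, 3]))  # [1, 4, 9]
```

Under spawn, anything outside the guard runs again in every child, so module-level process creation leads to infinite recursion.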
Using ProcessPoolExecutor
The high-level interface simplifies process management:
```python
from concurrent.futures import ProcessPoolExecutor
import time

def cpu_intensive_task(n):
    """Simulate heavy computation."""
    return sum(i ** 2 for i in range(n))

if __name__ == "__main__":
    numbers = [10**6, 10**7, 10**6, 10**7]
    start = time.time()
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(cpu_intensive_task, numbers))
    print(f"Results: {results}")
    print(f"Time: {time.time() - start:.2f}s")
```

Output:

```
Results: [333332833333500000, 333333283333335000000, 333332833333500000, 333333283333335000000]
Time: 2.71s
```
Sharing Data Between Processes
Since processes have separate memory, sharing data requires special mechanisms:
```python
from multiprocessing import Process, Queue, Value, Array

def worker(queue, counter, shared_array):
    # Get data from queue
    item = queue.get()
    # Modify shared counter
    with counter.get_lock():
        counter.value += 1
    # Modify shared array
    for i in range(len(shared_array)):
        shared_array[i] *= 2

if __name__ == "__main__":
    # Create shared data structures
    queue = Queue()
    counter = Value('i', 0)  # 'i' = integer
    shared_array = Array('d', [1.0, 2.0, 3.0])  # 'd' = double
    queue.put("task data")
    p = Process(target=worker, args=(queue, counter, shared_array))
    p.start()
    p.join()
    print(f"Counter: {counter.value}")
    print(f"Array: {list(shared_array)}")
```

Output:

```
Counter: 1
Array: [2.0, 4.0, 6.0]
```
Performance Comparison
```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_task(n):
    """CPU-bound: Calculate sum of squares."""
    return sum(i * i for i in range(n))

def io_task(seconds):
    """I/O-bound: Simulate waiting."""
    time.sleep(seconds)
    return seconds

def benchmark_cpu():
    """Compare threading vs multiprocessing for CPU work."""
    n = 10**7
    tasks = 4

    # Sequential
    start = time.time()
    for _ in range(tasks):
        cpu_task(n)
    sequential_time = time.time() - start

    # Threaded
    start = time.time()
    with ThreadPoolExecutor(max_workers=tasks) as executor:
        list(executor.map(cpu_task, [n] * tasks))
    threaded_time = time.time() - start

    # Multiprocessing
    start = time.time()
    with ProcessPoolExecutor(max_workers=tasks) as executor:
        list(executor.map(cpu_task, [n] * tasks))
    process_time = time.time() - start

    print("CPU-Bound Results:")
    print(f"  Sequential: {sequential_time:.2f}s")
    print(f"  Threaded: {threaded_time:.2f}s")
    print(f"  Multiprocessing: {process_time:.2f}s")

if __name__ == "__main__":
    benchmark_cpu()
```

Typical output:

```
CPU-Bound Results:
  Sequential: 4.12s
  Threaded: 4.08s          (GIL limits parallelism)
  Multiprocessing: 1.15s   (True parallelism)
```
Decision Guide
| Task Type | Examples | Use |
|---|---|---|
| Web scraping | Fetching 1000 URLs | Threads |
| File downloads | Downloading multiple files | Threads |
| Database queries | Multiple concurrent queries | Threads |
| Image processing | Resizing 1000 images | Processes |
| Data analysis | Number crunching on large datasets | Processes |
| Machine learning | Training models | Processes |
| Mixed workloads | API calls + data processing | Both (threads for I/O, processes for CPU) |
Common Patterns
Thread-Safe Counter
```python
import threading

class ThreadSafeCounter:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def increment(self):
        with self.lock:
            self.value += 1

counter = ThreadSafeCounter()

def worker():
    for _ in range(1000):
        counter.increment()

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Final count: {counter.value}")  # 10000
```
Producer-Consumer with Queue
```python
import threading
import queue
import time

def producer(q, items):
    for item in items:
        q.put(item)
        print(f"Produced: {item}")
        time.sleep(0.1)
    q.put(None)  # Sentinel to stop consumer

def consumer(q):
    while True:
        item = q.get()
        if item is None:
            break
        print(f"Consumed: {item}")
        q.task_done()

q = queue.Queue()
items = ['task1', 'task2', 'task3', 'task4']

producer_thread = threading.Thread(target=producer, args=(q, items))
consumer_thread = threading.Thread(target=consumer, args=(q,))

producer_thread.start()
consumer_thread.start()

producer_thread.join()
consumer_thread.join()
```
Summary
The GIL dictates your concurrency strategy in Python.
- Threads shine when your code waits on external resources; they share memory efficiently and have low overhead.
- Processes unlock true parallelism for computation-heavy work by running separate Python interpreters.
Choose based on where your code spends its time: waiting means threads, computing means processes.