How to Create a Basic Multiprocessing Program in Python
Python's Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, so standard threads cannot run Python code in parallel across CPU cores. This makes them inefficient for CPU-intensive tasks like data processing or complex calculations. The multiprocessing module overcomes this by creating separate processes, each with its own memory space and Python interpreter, allowing you to fully utilize multi-core processors.
This guide explains how to build a basic multiprocessing program, handle inter-process communication, and avoid common pitfalls like recursive process spawning.
Understanding Multiprocessing
Before coding, it is crucial to distinguish between Multiprocessing and Multithreading:
- Multithreading: Runs multiple threads within a single process. They share memory. Best for I/O-bound tasks (waiting for network/disk).
- Multiprocessing: Runs multiple independent processes. They have separate memory spaces. Best for CPU-bound tasks (heavy calculation).
Step 1: Creating and Launching Processes
To create a parallel program, you define a target function and wrap it in a multiprocessing.Process object.
The Basic Workflow:
- Import `multiprocessing`.
- Define the worker function.
- Instantiate `Process` objects with `target` arguments.
- Start the processes using `.start()`.
- Wait for them to finish using `.join()`.
```python
import multiprocessing
import time

def worker_function(number):
    """A simple task that simulates computation."""
    print(f"Worker {number}: Starting task...")
    result = number * number
    time.sleep(1)  # Simulate work
    print(f"Worker {number}: Result is {result}")

if __name__ == "__main__":
    # ✅ Correct: Creating two separate processes
    # 'target' is the function to run
    # 'args' is a tuple of arguments for that function
    process1 = multiprocessing.Process(target=worker_function, args=(2,))
    process2 = multiprocessing.Process(target=worker_function, args=(3,))

    # Start the processes (parallel execution begins here)
    process1.start()
    process2.start()

    # Wait for processes to complete before continuing the main script
    process1.join()
    process2.join()

    print("All processes finished.")
```
Output (the workers run in parallel, so the exact line order may vary):

```
Worker 2: Starting task...
Worker 3: Starting task...
Worker 2: Result is 4
Worker 3: Result is 9
All processes finished.
```
The args parameter must be a tuple. If passing a single argument, remember the trailing comma: args=(2,).
Step 2: Inter-Process Communication (Queues)
Because processes have separate memory spaces, they cannot share global variables like threads do. If Process A modifies a global list, Process B will not see the change. To exchange data, use a Queue.
```python
import multiprocessing

def producer(queue):
    """Adds items to the queue."""
    print("Producer: Adding items...")
    queue.put("Hello")
    queue.put("World")

def consumer(queue):
    """Retrieves items from the queue."""
    print("Consumer: Waiting for items...")
    # .get() blocks until an item is available
    item1 = queue.get()
    item2 = queue.get()
    print(f"Consumer got: {item1} {item2}")

if __name__ == "__main__":
    # Create a thread- and process-safe Queue
    shared_queue = multiprocessing.Queue()

    p1 = multiprocessing.Process(target=producer, args=(shared_queue,))
    p2 = multiprocessing.Process(target=consumer, args=(shared_queue,))

    p1.start()
    p2.start()
    p1.join()
    p2.join()
```
Output (the order of the first two lines may vary):

```
Producer: Adding items...
Consumer: Waiting for items...
Consumer got: Hello World
```
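The separate-memory behavior that makes a Queue necessary is easy to demonstrate: a child process that appends to a global list changes only its own copy, and the parent never sees the modification. A minimal sketch:

```python
import multiprocessing

shared_list = []  # Global in the parent process

def appender():
    # This modifies the CHILD's copy of the global,
    # not the parent's
    shared_list.append("from child")
    print(f"Child sees: {shared_list}")

if __name__ == "__main__":
    p = multiprocessing.Process(target=appender)
    p.start()
    p.join()
    # The parent's list is still empty: the append happened
    # in the child's separate memory space
    print(f"Parent sees: {shared_list}")
```

The parent prints an empty list, which is exactly why Queue (or Pipe) is needed to move data back.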
Step 3: Optimizing Process Count
Spawning too many processes can slow down your system due to context-switching overhead. Ideally, the number of active processes should match the number of available CPU cores.
```python
import multiprocessing

def get_optimal_count():
    # ✅ Correct: Dynamically determine core count
    try:
        count = multiprocessing.cpu_count()
        print(f"Available CPU cores: {count}")
        return count
    except NotImplementedError:
        return 1  # Fallback
```
For heavy CPU tasks, usually processes = cpu_count(). For I/O tasks, you might benefit from cpu_count() * 2 or more.
Common Error: Missing the Entry Point Guard
On Windows and macOS, the default "spawn" start method launches each new process by importing the main script in a fresh interpreter. If the process-spawning code is not protected by an entry-point guard, every child re-executes it and spawns children of its own, producing a RuntimeError or an endless cascade of processes until the program crashes.
Infinite Recursion Error
```python
import multiprocessing

def worker():
    print("Working")

# ⛔️ Incorrect: This code runs immediately upon import/spawn
# On Windows, this causes a RuntimeError or infinite loop.
p = multiprocessing.Process(target=worker)
p.start()
p.join()
```
Solution: if __name__ == "__main__":
```python
import multiprocessing

def worker():
    print("Working")

# ✅ Correct: Ensures code only runs when the script is executed directly
if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()
```
Conclusion
To create a basic multiprocessing program in Python:
- Use `multiprocessing.Process` to spawn independent tasks.
- Use `Queue` or `Pipe` to pass data between processes, as they do not share global variables.
- Always use `if __name__ == "__main__":` to prevent recursive spawning errors.
- Use `.join()` to ensure the main program waits for background tasks to complete.