How to Download an Image from a URL in Python
Downloading images programmatically from the web is a common task in web scraping, data collection, automation, and machine learning pipelines. Python makes this straightforward with several built-in and third-party libraries.
This guide covers multiple methods to download an image from a URL in Python, including urllib.request, requests, and aiohttp for asynchronous downloads. Each approach includes complete, runnable examples with explanations so you can choose the best fit for your project.
Prerequisites
Before you start, install the libraries used throughout this guide:
```
pip install requests pillow aiohttp
```
- requests: a popular HTTP library for making web requests.
- Pillow: a fork of PIL used here to verify downloaded images (optional).
- aiohttp: an asynchronous HTTP client for non-blocking downloads (used in the advanced section).
urllib.request is part of Python's standard library and requires no installation.
Method 1: Using urllib.request.urlretrieve()
The simplest way to download an image is with urllib.request.urlretrieve(). It takes a URL and a local filename, then saves the resource directly to disk.
```python
import urllib.request

url = "https://via.placeholder.com/300x200.png"
output_file = "downloaded_image.png"

# Download the image and save it locally
urllib.request.urlretrieve(url, output_file)

print(f"Image saved as {output_file}")
```

Output:

```
Image saved as downloaded_image.png
```
How It Works
- urlretrieve(url, filename) sends an HTTP GET request to the URL.
- The response body (the image data) is written directly to the specified file.
- The function returns a tuple of (filename, headers), which you can inspect if needed.
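To see that return value in action, here is a minimal sketch that downloads from a local file:// URL so it runs without a network connection (the filenames are placeholders for illustration):

```python
import tempfile
import urllib.request
from pathlib import Path

# A local file stands in for a remote image so this sketch runs offline
src = Path(tempfile.mkdtemp()) / "fake_image.png"
src.write_bytes(b"\x89PNG\r\n\x1a\n")  # just the 8-byte PNG signature

dest = src.with_name("copy_of_image.png")
filename, headers = urllib.request.urlretrieve(src.as_uri(), dest)

print(filename)                       # the path the image was saved to
print(headers.get("Content-Length"))  # size of the resource in bytes
```

The same tuple unpacking works with http:// and https:// URLs.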
urlretrieve() is considered a legacy interface by the Python documentation. It works well for simple use cases, but for more control over headers, timeouts, and error handling, the requests library (Method 2) is recommended.
Important: Know the File Extension
You need to specify the correct file extension when saving. If you save a PNG image as .jpg, some image viewers may still open it, but it can cause issues in processing pipelines.
```python
# ❌ Potentially problematic: saving a PNG with .jpg extension
urllib.request.urlretrieve(url, "image.jpg")

# ✅ Better: match the extension to the actual format
urllib.request.urlretrieve(url, "image.png")
```
Method 2: Using the requests Library
The requests library provides a cleaner, more Pythonic interface with better error handling, timeout support, and header customization.
```python
import requests

url = "https://via.placeholder.com/300x200.png"
output_file = "downloaded_image.png"

response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    with open(output_file, "wb") as f:
        f.write(response.content)
    print(f"Image saved as {output_file}")
else:
    print(f"Failed to download. Status code: {response.status_code}")
```

Output:

```
Image saved as downloaded_image.png
```
How It Works
- requests.get(url) sends a GET request and stores the full response.
- response.content contains the raw binary data of the image.
- The data is written to a file opened in binary write mode ("wb").
Downloading Large Images with Streaming
For large files, loading the entire image into memory at once can be wasteful. Use stream=True to download in chunks:
```python
import requests

url = "https://via.placeholder.com/3000x2000.png"
output_file = "large_image.png"

response = requests.get(url, stream=True)

if response.status_code == 200:
    with open(output_file, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    print(f"Large image saved as {output_file}")
else:
    print(f"Download failed: {response.status_code}")
```

- stream=True prevents the entire response from being loaded into memory immediately.
- iter_content(chunk_size=8192) reads the data in 8 KB chunks, keeping memory usage low.
Streaming is especially important when downloading high-resolution images or batches of files where total memory consumption matters.
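As a sketch, the chunked pattern can be wrapped in a small reusable helper (the function name download_to_file is my own, not part of requests):

```python
import requests

def download_to_file(url, output_file, chunk_size=8192, timeout=10):
    """Stream a URL to disk in fixed-size chunks; return bytes written."""
    written = 0
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        with open(output_file, "wb") as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                written += len(chunk)
    return written
```

Using the response as a context manager releases the connection back to the pool even if writing fails partway through.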
Method 3: Using urllib.request.urlopen() with Manual File Writing
For more control than urlretrieve() offers, without installing third-party packages, you can use urlopen() and write the data manually:
```python
import urllib.request

url = "https://via.placeholder.com/300x200.png"
output_file = "downloaded_image.png"

with urllib.request.urlopen(url) as response:
    image_data = response.read()

with open(output_file, "wb") as f:
    f.write(image_data)

print(f"Image saved as {output_file}")
```

Output:

```
Image saved as downloaded_image.png
```
This approach gives you access to the response object, so you can inspect headers (e.g., Content-Type) before saving.
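For example, the sketch below reads the Content-Type header before deciding what to do with the body. It uses a self-contained data: URL built from raw bytes so it runs without a network connection, but the same pattern applies to http:// URLs:

```python
import base64
import urllib.request

# Build a data: URL from raw bytes so the example needs no network access;
# a real script would use an http(s):// image URL here instead.
payload = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8  # PNG signature plus filler
url = "data:image/png;base64," + base64.b64encode(payload).decode("ascii")

with urllib.request.urlopen(url) as response:
    content_type = response.headers.get("Content-Type")
    data = response.read()

print(content_type)  # image/png
print(data[:8] == b"\x89PNG\r\n\x1a\n")  # True
```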
Detecting the Correct File Extension Automatically
If you don't know the image format in advance, you can extract it from the URL or from the response headers:
From the URL
```python
from urllib.parse import urlparse
from pathlib import Path

url = "https://via.placeholder.com/300x200.png"

parsed = urlparse(url)
extension = Path(parsed.path).suffix  # '.png'
print(f"Detected extension: {extension}")
```

Output:

```
Detected extension: .png
```
From the Response Headers
```python
import requests

url = "https://via.placeholder.com/300x200.png"

response = requests.head(url)
content_type = response.headers.get("Content-Type", "")
print(f"Content-Type: {content_type}")

# Map common MIME types to extensions
mime_to_ext = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/gif": ".gif",
    "image/webp": ".webp",
    "image/svg+xml": ".svg",
}

extension = mime_to_ext.get(content_type, ".bin")
print(f"File extension: {extension}")
```

Output:

```
Content-Type: image/png
File extension: .png
```
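The two techniques can be combined into one helper that prefers the URL suffix and falls back to the MIME type (guess_extension is my own name, not a standard function):

```python
from pathlib import Path
from urllib.parse import urlparse

MIME_TO_EXT = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/gif": ".gif",
    "image/webp": ".webp",
    "image/svg+xml": ".svg",
}

def guess_extension(url, content_type=""):
    """Prefer the extension in the URL path; fall back to the MIME type."""
    suffix = Path(urlparse(url).path).suffix.lower()
    if suffix:
        return suffix
    # Content-Type may carry parameters, e.g. "image/png; charset=binary"
    mime = content_type.split(";")[0].strip().lower()
    return MIME_TO_EXT.get(mime, ".bin")

print(guess_extension("https://example.com/a/photo.PNG"))          # .png
print(guess_extension("https://example.com/image", "image/jpeg"))  # .jpg
```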
Adding Error Handling
Production code should handle common failure scenarios like network errors, invalid URLs, timeouts, and HTTP errors:
```python
import requests

def download_image(url, output_file, timeout=10):
    try:
        response = requests.get(url, timeout=timeout, stream=True)
        response.raise_for_status()  # Raises HTTPError for 4xx/5xx responses
        with open(output_file, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        print(f"Image saved as {output_file}")
    except requests.exceptions.MissingSchema:
        print(f"Invalid URL: {url}")
    except requests.exceptions.ConnectionError:
        print(f"Failed to connect to {url}")
    except requests.exceptions.Timeout:
        print(f"Request timed out after {timeout} seconds")
    except requests.exceptions.HTTPError as e:
        print(f"HTTP error: {e}")

# Usage
download_image("https://via.placeholder.com/300x200.png", "image.png")
download_image("https://invalid-url-example.xyz/image.png", "fail.png")
```

Output:

```
Image saved as image.png
Failed to connect to https://invalid-url-example.xyz/image.png
```
Without response.raise_for_status() or a status code check, your code may silently save an HTML error page as an image file when the server returns a 404 or 500 response:
```python
# ❌ Saves error page HTML as "image.png" without any warning
response = requests.get("https://example.com/nonexistent.png")
with open("image.png", "wb") as f:
    f.write(response.content)
```
Always verify the response before writing to disk.
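A Content-Type check makes a good extra guard, since error pages are normally served as text/html. A minimal sketch (is_image_response is my own helper name):

```python
def is_image_response(headers):
    """Return True if the response headers declare an image payload."""
    content_type = headers.get("Content-Type", "")
    return content_type.split(";")[0].strip().lower().startswith("image/")

# Works with requests' case-insensitive headers or a plain dict
print(is_image_response({"Content-Type": "image/png"}))                 # True
print(is_image_response({"Content-Type": "text/html; charset=utf-8"}))  # False
```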
Verifying the Downloaded Image
To confirm the download was successful, you can open the image with Pillow:
```python
from PIL import Image

img = Image.open("downloaded_image.png")
print(f"Format: {img.format}")
print(f"Size: {img.size}")
print(f"Mode: {img.mode}")

img.show()  # Opens the image in your default viewer
```

Output:

```
Format: PNG
Size: (300, 200)
Mode: RGB
```
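For an automated check that doesn't open a viewer, Pillow's verify() raises an exception on corrupt or truncated data. The sketch below builds a small test image in memory instead of downloading one:

```python
import io
from PIL import Image

# Create a 300x200 red PNG in memory to stand in for a downloaded file
buffer = io.BytesIO()
Image.new("RGB", (300, 200), color=(255, 0, 0)).save(buffer, format="PNG")

buffer.seek(0)
Image.open(buffer).verify()  # raises if the data is truncated or corrupt

# verify() leaves the image object unusable, so reopen before further use
buffer.seek(0)
img = Image.open(buffer)
print(img.format, img.size)  # PNG (300, 200)
```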
Bonus: Downloading Multiple Images
When you need to download several images, loop through a list of URLs and save each one:
```python
import requests
from pathlib import Path

urls = [
    "https://via.placeholder.com/100x100.png",
    "https://via.placeholder.com/200x200.png",
    "https://via.placeholder.com/300x300.png",
]

output_dir = Path("images")
output_dir.mkdir(exist_ok=True)

for i, url in enumerate(urls):
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        file_path = output_dir / f"image_{i + 1}.png"
        file_path.write_bytes(response.content)
        print(f"Saved: {file_path}")
    except requests.exceptions.RequestException as e:
        print(f"Failed to download {url}: {e}")
```

Output:

```
Saved: images/image_1.png
Saved: images/image_2.png
Saved: images/image_3.png
```
For downloading many images at scale, consider using concurrent.futures.ThreadPoolExecutor or an async library like aiohttp to download multiple files in parallel, significantly reducing total download time.
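Here is a minimal thread-pool sketch using only the standard library; the helper names download_one and download_all are my own:

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def download_one(url, dest):
    """Fetch a single URL and write it to dest; return dest."""
    with urllib.request.urlopen(url, timeout=10) as response:
        Path(dest).write_bytes(response.read())
    return dest

def download_all(urls, output_dir, max_workers=8):
    """Download many URLs in parallel; return saved paths in input order."""
    output_dir = Path(output_dir)
    output_dir.mkdir(exist_ok=True)
    tasks = [(url, output_dir / f"image_{i + 1}.png") for i, url in enumerate(urls)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda t: download_one(*t), tasks))
```

Threads suit this workload because each one spends most of its time waiting on the network rather than using the CPU.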
Quick Comparison of Methods
| Method | External Dependency | Streaming Support | Error Handling | Best For |
|---|---|---|---|---|
| urllib.request.urlretrieve() | None | ❌ | Basic | Quick, simple downloads |
| urllib.request.urlopen() | None | Manual | Basic | Standard library only |
| requests.get() | requests | ✅ (stream=True) | Excellent | Most use cases |
| aiohttp | aiohttp | ✅ | Excellent | Async / bulk downloads |
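Bonus: Asynchronous Downloads with aiohttp

The prerequisites install aiohttp for non-blocking downloads. A minimal sketch of concurrent async downloads (the function names fetch_image and download_all_async are my own) might look like this:

```python
import asyncio
from pathlib import Path

import aiohttp

async def fetch_image(session, url, dest):
    """Download one URL to dest using a shared session; return dest."""
    async with session.get(url) as response:
        response.raise_for_status()
        Path(dest).write_bytes(await response.read())
    return dest

async def download_all_async(urls, output_dir):
    """Download all URLs concurrently; return saved paths in input order."""
    output_dir = Path(output_dir)
    output_dir.mkdir(exist_ok=True)
    async with aiohttp.ClientSession() as session:
        tasks = [
            fetch_image(session, url, output_dir / f"image_{i + 1}.png")
            for i, url in enumerate(urls)
        ]
        return await asyncio.gather(*tasks)

# Example (placeholder URLs):
# asyncio.run(download_all_async(["https://via.placeholder.com/100x100.png"], "images"))
```

A single ClientSession is shared across all requests so aiohttp can reuse connections, and asyncio.gather runs the downloads concurrently while preserving input order in the result.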
Conclusion
Python offers several reliable ways to download images from URLs:
- urllib.request.urlretrieve() is the quickest one-liner for simple downloads using only the standard library.
- requests.get() is the recommended approach for most projects thanks to its clean API, streaming support, and robust error handling.
- urllib.request.urlopen() gives you manual control without third-party dependencies.
For production code, always add error handling, status code checks, and timeouts to ensure your downloads are reliable. When dealing with large images or bulk downloads, use streaming and concurrency to keep memory usage low and performance high.