Python NumPy: How to Convert an Image to a NumPy Array and Save It to CSV in Python
Converting images to NumPy arrays is a fundamental operation in image processing, computer vision, and machine learning. Once an image is represented as a NumPy array, you can manipulate its pixel values, apply transformations, or save the data to CSV format for storage, analysis, or sharing.
In this guide, you'll learn how to read an image into a NumPy array using different libraries, understand image array shapes, save image data to a CSV file, and load it back.
Understanding Image Array Dimensions
Before diving into the conversion, it's important to understand how images are represented as arrays:
- Grayscale images produce a 2D array with shape (height, width), where each value (0–255) represents pixel brightness.
- Color (RGB) images produce a 3D array with shape (height, width, 3), where the three channels represent Red, Green, and Blue intensities.
Grayscale: (height, width) → e.g., (251, 335)
RGB: (height, width, 3) → e.g., (251, 335, 3)
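To make the two shapes concrete, here is a minimal sketch that builds synthetic arrays rather than loading a file. The names gray and rgb and the all-zero pixel data are illustrative assumptions, not part of the original example.

```python
import numpy as np

# Hypothetical synthetic images (not loaded from disk): 251 rows by 335 columns.
height, width = 251, 335

gray = np.zeros((height, width), dtype=np.uint8)    # one brightness value per pixel
rgb = np.zeros((height, width, 3), dtype=np.uint8)  # three channel values per pixel

# Index as [row, column]; for RGB the last axis holds (R, G, B).
rgb[0, 0] = [255, 0, 0]  # set the top-left pixel to pure red

print(gray.shape, gray.ndim)  # (251, 335) 2
print(rgb.shape, rgb.ndim)    # (251, 335, 3) 3
print(rgb[0, 0])              # [255   0   0]
```

Note that the first axis is the row (height), so a pixel at horizontal position x and vertical position y is addressed as array[y, x].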
Reading an Image into a NumPy Array
Using PIL (Pillow) and NumPy
from PIL import Image
import numpy as np
# Read the image
img = Image.open('photo.jpg')
# Convert to NumPy array
img_array = np.asarray(img)
print(f"Shape: {img_array.shape}")
print(f"Dtype: {img_array.dtype}")
print(f"Min: {img_array.min()}, Max: {img_array.max()}")
Output:
Shape: (251, 335, 3)
Dtype: uint8
Min: 0, Max: 255
Using Matplotlib
import matplotlib.image as mpimg
img_array = mpimg.imread('photo.jpg')
print(f"Shape: {img_array.shape}")
print(f"Dtype: {img_array.dtype}")
Output:
Shape: (251, 335, 3)
Dtype: uint8
Using OpenCV
import cv2
# Note: OpenCV reads images in BGR format, not RGB
img_array = cv2.imread('photo.jpg')
print(f"Shape: {img_array.shape}")
print(f"Dtype: {img_array.dtype}")
# Convert BGR to RGB if needed
img_rgb = cv2.cvtColor(img_array, cv2.COLOR_BGR2RGB)
Output:
Shape: (251, 335, 3)
Dtype: uint8
OpenCV reads color images in BGR format (Blue, Green, Red) instead of the standard RGB. Use cv2.cvtColor() to convert if you need RGB order.
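For a plain 3-channel array, BGR and RGB differ only in the order of the last axis, so the swap can also be sketched with NumPy slicing. The tiny two-pixel array below is a made-up example, and this shortcut is an assumption that only holds for 3-channel images (it is not suitable for images with an alpha channel).

```python
import numpy as np

# Hypothetical 1x2 image in BGR order: a pure-blue pixel and a pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reverse the channel axis: (B, G, R) -> (R, G, B).
# .copy() turns the reversed view into an independent, contiguous array.
rgb = bgr[..., ::-1].copy()

print(rgb[0, 0])  # blue pixel expressed as (R, G, B): [  0   0 255]
print(rgb[0, 1])  # red pixel expressed as (R, G, B):  [255   0   0]
```

Applying the same slice twice returns the original channel order, which makes it easy to check the conversion.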
Saving an Image Array to CSV
CSV files can only store 2D data (rows and columns). Since color images are 3D arrays, you need to reshape the array to 2D before saving.
Method 1: Using NumPy (savetxt / loadtxt)
import numpy as np
from PIL import Image
# Read the image
img = Image.open('photo.jpg')
img_array = np.asarray(img)
print(f"Original shape: {img_array.shape}")
# Reshape 3D array to 2D for CSV storage
if img_array.ndim == 3:
    # Flatten the last two dimensions: (height, width * channels)
    img_2d = img_array.reshape(img_array.shape[0], -1)
    print(f"Reshaped to 2D: {img_2d.shape}")
else:
    img_2d = img_array  # Grayscale is already 2D
# Save to CSV
np.savetxt('image_data.csv', img_2d, delimiter=',', fmt='%d')
print("Saved to image_data.csv")
Output:
Original shape: (251, 335, 3)
Reshaped to 2D: (251, 1005)
Saved to image_data.csv
How the reshape works:
A (251, 335, 3) image becomes (251, 1005) because each row of 335 pixels × 3 channels = 1005 values. The pixel data is flattened as: [R1, G1, B1, R2, G2, B2, ..., R335, G335, B335] for each row.
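The flattening order is easier to see on a very small array. The sketch below uses a hypothetical 2×2 RGB image with sequential values so that each number can be traced through the reshape and back.

```python
import numpy as np

# Hypothetical 2x2 RGB image; values are sequential so they are easy to follow.
tiny = np.array(
    [[[1, 2, 3], [4, 5, 6]],
     [[7, 8, 9], [10, 11, 12]]],
    dtype=np.uint8,
)

# (2, 2, 3) -> (2, 6): each image row becomes one CSV row of R, G, B triples.
flat = tiny.reshape(tiny.shape[0], -1)
print(flat.shape)  # (2, 6)
print(flat[0])     # [1 2 3 4 5 6]

# Reshaping with the original shape reverses the operation exactly.
restored = flat.reshape(tiny.shape)
print(np.array_equal(tiny, restored))  # True
```

Because reshape only reinterprets the existing element order, no pixel values are changed in either direction.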
Method 2: Using Pandas
import numpy as np
import pandas as pd
from PIL import Image
# Read the image
img = Image.open('photo.jpg')
img_array = np.asarray(img)
# Reshape to 2D
if img_array.ndim == 3:
    img_2d = img_array.reshape(img_array.shape[0], -1)
else:
    img_2d = img_array
# Convert to DataFrame and save
df = pd.DataFrame(img_2d)
df.to_csv('image_data.csv', header=False, index=False)
print(f"Saved {img_2d.shape} array to CSV")
Output:
Saved (251, 1005) array to CSV
Loading Image Data from CSV
Using NumPy
import numpy as np
# Load the 2D data from CSV
loaded_2d = np.loadtxt('image_data.csv', delimiter=',')
# Define original image dimensions
original_height = 251
original_width = 335
num_channels = 3
# Reshape back to 3D
loaded_image = loaded_2d.reshape(original_height, original_width, num_channels)
# np.loadtxt returns float64 by default, so restore the original uint8 dtype
loaded_image = loaded_image.astype(np.uint8)
print(f"Loaded shape: {loaded_image.shape}")
print(f"Dtype: {loaded_image.dtype}")
Output:
Loaded shape: (251, 335, 3)
Dtype: uint8
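The astype(np.uint8) step is needed because np.loadtxt parses values as float64 unless told otherwise. As a hedged alternative, the dtype parameter of np.loadtxt can request integers directly. The sketch below reads from an in-memory string (csv_text is a made-up stand-in for image_data.csv) so it runs without any file on disk.

```python
import io
import numpy as np

# Hypothetical CSV content standing in for a saved image file.
csv_text = "0,128,255\n10,20,30\n"

# Default behaviour: values are parsed as float64.
as_float = np.loadtxt(io.StringIO(csv_text), delimiter=',')

# Requesting uint8 up front avoids the intermediate float array.
as_uint8 = np.loadtxt(io.StringIO(csv_text), delimiter=',', dtype=np.uint8)

print(as_float.dtype)  # float64
print(as_uint8.dtype)  # uint8
print(np.array_equal(as_float, as_uint8))  # True
```

Both routes give the same pixel values here; the dtype argument mainly saves memory and one conversion step for large images.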
Using Pandas
import numpy as np
import pandas as pd
# Load from CSV
loaded_df = pd.read_csv('image_data.csv', header=None)
loaded_2d = loaded_df.to_numpy()
# Reshape back to original dimensions (assumes the CSV holds a 3-channel image)
original_channels = 3
original_width = loaded_2d.shape[1] // original_channels
loaded_image = loaded_2d.reshape(loaded_2d.shape[0], original_width, original_channels)
loaded_image = loaded_image.astype(np.uint8)
print(f"Loaded shape: {loaded_image.shape}")
Output:
Loaded shape: (251, 335, 3)
Complete Workflow: Save and Verify
Here's a complete, reusable workflow that saves an image to CSV and verifies the data integrity upon loading:
import numpy as np
from PIL import Image
def image_to_csv(image_path, csv_path):
    """Convert an image to a NumPy array and save to CSV."""
    img = Image.open(image_path)
    img_array = np.asarray(img)
    original_shape = img_array.shape
    print(f"Original shape: {original_shape}")
    # Reshape to 2D if needed
    if img_array.ndim == 3:
        img_2d = img_array.reshape(img_array.shape[0], -1)
    else:
        img_2d = img_array
    # Save to CSV with integer format
    np.savetxt(csv_path, img_2d, delimiter=',', fmt='%d')
    print(f"Saved to {csv_path} (shape: {img_2d.shape})")
    return original_shape

def csv_to_image(csv_path, original_shape):
    """Load image data from CSV and reconstruct the array."""
    loaded_2d = np.loadtxt(csv_path, delimiter=',')
    # Reshape back to original dimensions
    loaded_image = loaded_2d.reshape(original_shape).astype(np.uint8)
    print(f"Loaded shape: {loaded_image.shape}")
    return loaded_image
# Save
shape = image_to_csv('photo.jpg', 'image_data.csv')
# Load
loaded = csv_to_image('image_data.csv', shape)
# Verify
original = np.asarray(Image.open('photo.jpg'))
if np.array_equal(original, loaded):
    print("\nData integrity verified: loaded data matches original.")
else:
    print("\nData mismatch detected.")
Output:
Original shape: (251, 335, 3)
Saved to image_data.csv (shape: (251, 1005))
Loaded shape: (251, 335, 3)
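The workflow above relies on the caller keeping track of the original shape, since the CSV itself only records rows and columns. One possible variation, sketched here under stated assumptions, is to write the shape into a comment line using the header parameter of np.savetxt, which np.loadtxt skips by default. The helper names save_with_shape and load_with_shape, the "shape=" header format, and the random test image are all hypothetical and not part of the original workflow.

```python
import os
import tempfile
import numpy as np

def save_with_shape(array, csv_path):
    """Hypothetical helper: save as CSV with the shape stored in a '#' header line."""
    header = "shape=" + ",".join(str(n) for n in array.shape)
    # np.savetxt prefixes the header with '# ', so the first line reads '# shape=...'
    np.savetxt(csv_path, array.reshape(array.shape[0], -1),
               delimiter=',', fmt='%d', header=header)

def load_with_shape(csv_path):
    """Hypothetical helper: read the shape from the header, then the pixel data."""
    with open(csv_path) as f:
        first_line = f.readline().strip()
    shape = tuple(int(n) for n in first_line.split("=", 1)[1].split(","))
    # Lines starting with '#' are treated as comments and skipped by np.loadtxt.
    data = np.loadtxt(csv_path, delimiter=',', dtype=np.uint8)
    return data.reshape(shape)

# Round-trip a small random image (made-up data, seeded for repeatability).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(2, 3, 3), dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "image_data.csv")
    save_with_shape(image, path)
    restored = load_with_shape(path)

print(restored.shape, np.array_equal(image, restored))
```

A file written this way is no longer a pure grid of numbers, so other tools would need to be told to ignore the comment line (for example, the comment parameter of pandas.read_csv).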