Python NumPy: How to Convert an Image to a NumPy Array and Save It to CSV

Converting images to NumPy arrays is a fundamental operation in image processing, computer vision, and machine learning. Once an image is represented as a NumPy array, you can manipulate its pixel values, apply transformations, or save the data to CSV format for storage, analysis, or sharing.

In this guide, you'll learn how to read an image into a NumPy array using different libraries, understand image array shapes, and save/load image data to and from CSV files.

Understanding Image Array Dimensions

Before diving into the conversion, it's important to understand how images are represented as arrays:

  • Grayscale images produce a 2D array with shape (height, width), where each value (0–255 for 8-bit images) represents pixel brightness.
  • Color (RGB) images produce a 3D array with shape (height, width, 3), where the three channels represent Red, Green, and Blue intensities.

Grayscale: (height, width)    → e.g., (251, 335)
RGB:       (height, width, 3) → e.g., (251, 335, 3)
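
The shapes above can be checked without any image file. The following is a minimal sketch that builds two blank images in memory with Pillow; the 335×251 size is borrowed from the example shapes in this guide. Note that Pillow expects the size as (width, height), while the resulting array is indexed as (height, width).

```python
import numpy as np
from PIL import Image

# Blank in-memory images; Pillow takes the size as (width, height).
gray = Image.new("L", (335, 251))    # "L" = 8-bit grayscale
rgb = Image.new("RGB", (335, 251))   # "RGB" = 3 x 8-bit channels

gray_array = np.asarray(gray)
rgb_array = np.asarray(rgb)

print(gray_array.shape)  # (251, 335)
print(rgb_array.shape)   # (251, 335, 3)
print(rgb_array.dtype)   # uint8
```

Other Pillow modes give other shapes; an RGBA image, for example, has four channels instead of three.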

Reading an Image into a NumPy Array

Using PIL (Pillow) and NumPy

from PIL import Image
import numpy as np

# Read the image
img = Image.open('photo.jpg')

# Convert to NumPy array
img_array = np.asarray(img)

print(f"Shape: {img_array.shape}")
print(f"Dtype: {img_array.dtype}")
print(f"Min: {img_array.min()}, Max: {img_array.max()}")

Output:

Shape: (251, 335, 3)
Dtype: uint8
Min: 0, Max: 255

Using Matplotlib

import matplotlib.image as mpimg

img_array = mpimg.imread('photo.jpg')

print(f"Shape: {img_array.shape}")
print(f"Dtype: {img_array.dtype}")

Output:

Shape: (251, 335, 3)
Dtype: uint8
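
The uint8 result above applies to JPEG input. For PNG files, Matplotlib's imread returns floating-point values in the range 0 to 1 instead, so a conversion may be needed before saving integer pixel values. The sketch below uses a small hand-made float array as a stand-in for such a result; the array contents are illustrative, not read from a real file.

```python
import numpy as np

# Stand-in for a PNG loaded via mpimg.imread(): float32 values in [0, 1].
float_pixels = np.array([[0.0, 0.5, 1.0]], dtype=np.float32)

# Scale to the 0-255 range, round to the nearest integer, then cast.
uint8_pixels = np.round(float_pixels * 255).astype(np.uint8)

print(uint8_pixels.dtype)     # uint8
print(uint8_pixels.tolist())  # [[0, 128, 255]]
```

Checking img_array.dtype right after loading is a cheap way to tell which case applies.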

Using OpenCV

import cv2

# Note: OpenCV reads images in BGR format, not RGB
img_array = cv2.imread('photo.jpg')

print(f"Shape: {img_array.shape}")
print(f"Dtype: {img_array.dtype}")

# Convert BGR to RGB if needed
img_rgb = cv2.cvtColor(img_array, cv2.COLOR_BGR2RGB)

Output:

Shape: (251, 335, 3)
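Dtype: uint8

Note: OpenCV reads color images in BGR order (Blue, Green, Red) rather than the more common RGB. Use cv2.cvtColor() to convert if you need RGB order. Also note that cv2.imread() returns None instead of raising an exception when the file cannot be read, so check the result before using it.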
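
To see what the channel order means at the array level, here is a small sketch that uses only NumPy. The two-pixel array is hypothetical and stands in for the result of cv2.imread(); for a 3-channel image, reversing the last axis yields the same channel order as the BGR-to-RGB conversion.

```python
import numpy as np

# Hypothetical 1x2 BGR image: one pure-blue pixel, one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reverse the channel axis: (B, G, R) -> (R, G, B).
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255]  (blue, now in RGB order)
print(rgb[0, 1].tolist())  # [255, 0, 0]  (red, now in RGB order)
```

The slice is a view with a negative stride rather than a copy; wrap it in np.ascontiguousarray() if a downstream library requires contiguous memory.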

Saving an Image Array to CSV

CSV files can only store 2D data (rows and columns). Since color images are 3D arrays, you need to reshape the array to 2D before saving.

Method 1: Using NumPy (savetxt / loadtxt)

import numpy as np
from PIL import Image

# Read the image
img = Image.open('photo.jpg')
img_array = np.asarray(img)
print(f"Original shape: {img_array.shape}")

# Reshape 3D array to 2D for CSV storage
if img_array.ndim == 3:
    # Flatten the last two dimensions: (height, width * channels)
    img_2d = img_array.reshape(img_array.shape[0], -1)
    print(f"Reshaped to 2D: {img_2d.shape}")
else:
    img_2d = img_array  # Grayscale is already 2D

# Save to CSV
np.savetxt('image_data.csv', img_2d, delimiter=',', fmt='%d')
print("Saved to image_data.csv")

Output:

Original shape: (251, 335, 3)
Reshaped to 2D: (251, 1005)
Saved to image_data.csv

How the reshape works:

A (251, 335, 3) image becomes (251, 1005) because each row holds 335 pixels × 3 channels = 1005 values. The pixel data is flattened as: [R1, G1, B1, R2, G2, B2, ..., R335, G335, B335] for each row.
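
The same interleaving is easier to see on a tiny array. The sketch below uses a made-up 2×2 "image" whose values are chosen only so the ordering is visible, and confirms that reshaping back recovers the original.

```python
import numpy as np

# Tiny 2x2 RGB "image"; each inner triple is one pixel's [R, G, B].
img = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
], dtype=np.uint8)

# Merge the width and channel axes: (2, 2, 3) -> (2, 6).
img_2d = img.reshape(img.shape[0], -1)
print(img_2d.shape)     # (2, 6)
print(img_2d.tolist())  # [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]

# Reshaping with the original shape reverses the operation exactly.
restored = img_2d.reshape(img.shape)
print(np.array_equal(restored, img))  # True
```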

Method 2: Using Pandas

import numpy as np
import pandas as pd
from PIL import Image

# Read the image
img = Image.open('photo.jpg')
img_array = np.asarray(img)

# Reshape to 2D
if img_array.ndim == 3:
    img_2d = img_array.reshape(img_array.shape[0], -1)
else:
    img_2d = img_array

# Convert to DataFrame and save
df = pd.DataFrame(img_2d)
df.to_csv('image_data.csv', header=False, index=False)
print(f"Saved {img_2d.shape} array to CSV")

Output:

Saved (251, 1005) array to CSV

Loading Image Data from CSV

Using NumPy

import numpy as np

# Load the 2D data from CSV
loaded_2d = np.loadtxt('image_data.csv', delimiter=',')

# Define original image dimensions
original_height = 251
original_width = 335
num_channels = 3

# Reshape back to 3D
loaded_image = loaded_2d.reshape(original_height, original_width, num_channels)
loaded_image = loaded_image.astype(np.uint8)

print(f"Loaded shape: {loaded_image.shape}")
print(f"Dtype: {loaded_image.dtype}")

Output:

Loaded shape: (251, 335, 3)
Dtype: uint8
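
The astype(np.uint8) step is needed because np.loadtxt() returns float64 by default. As an alternative sketch, the dtype can be requested at load time. The CSV text here is a made-up two-row sample held in memory rather than a real image file.

```python
import io
import numpy as np

# Made-up CSV content standing in for image_data.csv.
csv_text = "0,128,255\n12,34,56\n"

# Default behavior: values are parsed as float64.
as_float = np.loadtxt(io.StringIO(csv_text), delimiter=",")

# Requesting uint8 directly avoids the separate astype() step.
as_uint8 = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=np.uint8)

print(as_float.dtype)     # float64
print(as_uint8.dtype)     # uint8
print(as_uint8.tolist())  # [[0, 128, 255], [12, 34, 56]]
```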

Using Pandas

import numpy as np
import pandas as pd

# Load from CSV
loaded_df = pd.read_csv('image_data.csv', header=None)
loaded_2d = loaded_df.values

# Reshape back to original dimensions
original_channels = 3
original_width = loaded_2d.shape[1] // original_channels

loaded_image = loaded_2d.reshape(loaded_2d.shape[0], original_width, original_channels)
loaded_image = loaded_image.astype(np.uint8)

print(f"Loaded shape: {loaded_image.shape}")

Output:

Loaded shape: (251, 335, 3)
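
Inferring the width from the column count only works when the number of channels is known and divides the column count evenly. One way to guard against a wrong assumption is sketched below; restore_image_shape is a hypothetical helper written for this illustration, not part of NumPy or pandas.

```python
import numpy as np


def restore_image_shape(flat_2d, channels=3):
    """Hypothetical helper: rebuild (height, width, channels) from a 2D array."""
    height, columns = flat_2d.shape
    if channels == 1:
        return flat_2d  # Grayscale data is already (height, width)
    if columns % channels != 0:
        raise ValueError(
            f"{columns} columns cannot be split into {channels} channels"
        )
    return flat_2d.reshape(height, columns // channels, channels)


# Placeholder data with the same 2D shape as the CSV in this guide.
flat = np.zeros((251, 1005), dtype=np.uint8)
print(restore_image_shape(flat).shape)  # (251, 335, 3)
```

The check cannot tell a grayscale image whose width happens to be a multiple of 3 from a color image, so the channel count still has to come from somewhere outside the CSV.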

Complete Workflow: Save and Verify

Here's a complete, reusable workflow that saves an image to CSV and verifies the data integrity upon loading:

import numpy as np
from PIL import Image

def image_to_csv(image_path, csv_path):
    """Convert an image to a NumPy array and save to CSV."""
    img = Image.open(image_path)
    img_array = np.asarray(img)

    original_shape = img_array.shape
    print(f"Original shape: {original_shape}")

    # Reshape to 2D if needed
    if img_array.ndim == 3:
        img_2d = img_array.reshape(img_array.shape[0], -1)
    else:
        img_2d = img_array

    # Save to CSV with integer format
    np.savetxt(csv_path, img_2d, delimiter=',', fmt='%d')
    print(f"Saved to {csv_path} (shape: {img_2d.shape})")

    return original_shape


def csv_to_image(csv_path, original_shape):
    """Load image data from CSV and reconstruct the array."""
    loaded_2d = np.loadtxt(csv_path, delimiter=',')

    # Reshape back to original dimensions
    loaded_image = loaded_2d.reshape(original_shape).astype(np.uint8)
    print(f"Loaded shape: {loaded_image.shape}")

    return loaded_image


# Save
shape = image_to_csv('photo.jpg', 'image_data.csv')

# Load
loaded = csv_to_image('image_data.csv', shape)

# Verify
original = np.asarray(Image.open('photo.jpg'))
if np.array_equal(original, loaded):
    print("\n✅ Data integrity verified: loaded data matches original.")
else:
    print("\n❌ Data mismatch detected.")

Output:

Original shape: (251, 335, 3)
Saved to image_data.csv (shape: (251, 1005))
Loaded shape: (251, 335, 3)

✅ Data integrity verified: loaded data matches original.
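
The workflow above relies on the caller keeping the original shape around, since the CSV itself does not record it. One possible variation, sketched here under the assumption that a comment line at the top of the file is acceptable to its consumers, stores the shape in a header written by np.savetxt(). The array is a small made-up example and an in-memory buffer stands in for the file.

```python
import io
import numpy as np

# Small made-up "image" with shape (height=2, width=4, channels=3).
img = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)

# Write the shape as a header; savetxt prefixes it with "# ".
buffer = io.StringIO()
shape_text = " ".join(str(n) for n in img.shape)
np.savetxt(buffer, img.reshape(img.shape[0], -1),
           delimiter=",", fmt="%d", header=shape_text)

# Read the shape back from the first line ("# 2 4 3").
buffer.seek(0)
first_line = buffer.readline()
shape = tuple(int(n) for n in first_line.lstrip("# ").split())

# loadtxt skips lines starting with "#" by default.
buffer.seek(0)
restored = np.loadtxt(buffer, delimiter=",", dtype=np.uint8).reshape(shape)

print(shape)                          # (2, 4, 3)
print(np.array_equal(restored, img))  # True
```

Tools that do not treat "#" as a comment marker will see the header as a data row, so this variation trades some portability for a self-describing file.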

Important Considerations

CSV File Size

CSV files store every pixel value as uncompressed text, which makes them much larger than compressed image files such as JPEG:

import os

original_size = os.path.getsize('photo.jpg')
csv_size = os.path.getsize('image_data.csv')

print(f"JPEG size: {original_size / 1024:.1f} KB")
print(f"CSV size: {csv_size / 1024:.1f} KB")
print(f"CSV is {csv_size / original_size:.1f}x larger")

Example output (actual sizes depend on the image):

JPEG size: 45.2 KB
CSV size: 523.8 KB
CSV is 11.6x larger
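
Part of the overhead comes from the text encoding itself: each 1-byte pixel value becomes one to three digit characters plus a delimiter or newline. The sketch below measures this on random pixel data of the same shape as the example image. Random values are an assumption made for illustration; the comparison is against the raw array size, not a JPEG, which is compressed further.

```python
import io
import numpy as np

# Random stand-in pixels with the example image's shape.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(251, 335, 3), dtype=np.uint8)

# Write the CSV to an in-memory buffer instead of a file.
buffer = io.StringIO()
np.savetxt(buffer, pixels.reshape(pixels.shape[0], -1),
           delimiter=",", fmt="%d")

csv_bytes = len(buffer.getvalue().encode("utf-8"))
raw_bytes = pixels.nbytes  # 251 * 335 * 3 = 252255 bytes

print(f"Raw uint8 data: {raw_bytes / 1024:.1f} KB")
print(f"CSV text:       {csv_bytes / 1024:.1f} KB")
print(f"CSV is {csv_bytes / raw_bytes:.1f}x the raw array size")
```

Real photographs will give somewhat different ratios, since darker images contain more one- and two-digit values.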

For large images, consider using NumPy's binary format (.npy) instead of CSV for efficient storage:

# Save as .npy (binary, compact)
np.save('image_data.npy', img_array)

# Load from .npy
loaded = np.load('image_data.npy')
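
Unlike CSV, the .npy format records shape and dtype alongside the data, so no reshape or cast is needed after loading. A minimal round-trip sketch, using a small made-up array and an in-memory buffer in place of a file on disk:

```python
import io
import numpy as np

# Small made-up "image" with shape (2, 4, 3).
img = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)

# np.save accepts a file-like object as well as a filename.
buffer = io.BytesIO()
np.save(buffer, img)

# Rewind and load; shape and dtype come back unchanged.
buffer.seek(0)
loaded = np.load(buffer)

print(loaded.shape, loaded.dtype)   # (2, 4, 3) uint8
print(np.array_equal(loaded, img))  # True
```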
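
When to Use CSV vs Other Formats

Format          | Size                                     | Speed   | Human Readable | Best For
----------------|------------------------------------------|---------|----------------|---------------------
CSV             | Large                                    | Slow    | ✅ Yes         | Sharing, inspection
.npy            | Small                                    | ⚡ Fast | ❌ No          | NumPy workflows
.npz            | Small (smaller with np.savez_compressed) | ⚡ Fast | ❌ No          | Multiple arrays
Image (PNG/JPG) | Varies                                   | Fast    | Visual         | Standard image use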

Use CSV when you need to inspect pixel values in a spreadsheet or share data with non-Python tools. Use .npy when the data stays within NumPy-based Python code.

Conclusion

Converting images to NumPy arrays and saving them to CSV involves three key steps:

  1. Read the image using PIL, Matplotlib, or OpenCV to get a NumPy array.
  2. Reshape 3D arrays to 2D since CSV files only support two-dimensional data - flatten the width and channel dimensions into a single axis.
  3. Save with np.savetxt() or pandas to_csv(), and reconstruct the original shape when loading.

For production workflows, consider using NumPy's native .npy format instead of CSV for significantly better performance and smaller file sizes. Reserve CSV for cases where human readability or compatibility with non-Python tools is important.