Python NumPy: How to Convert a 1D Array to a 2D Array
Reshaping flat arrays into matrices is a fundamental operation in data science and machine learning. NumPy provides several methods to transform 1D arrays into 2D structures, each suited to different use cases.
Using reshape()
The reshape() method transforms an array into a specified shape without changing the underlying data:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
# Reshape to 3 rows, 2 columns
arr_2d = arr.reshape(3, 2)
print(arr_2d)
# [[1 2]
# [3 4]
# [5 6]]
print(arr_2d.shape) # (3, 2)
Output:
[[1 2]
[3 4]
[5 6]]
(3, 2)
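The same reshape can also be written with a shape tuple or with the function form np.reshape. A minimal sketch showing that the three spellings agree (the variable names a, b, and c are only illustrative):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

a = arr.reshape(3, 2)        # shape passed as separate arguments
b = arr.reshape((3, 2))      # shape passed as a tuple
c = np.reshape(arr, (3, 2))  # function form

print(np.array_equal(a, b) and np.array_equal(a, c))  # True
print(arr.shape)  # (6,) since reshape does not modify arr itself
```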
Different Configurations
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
# 2 rows, 3 columns
print(arr.reshape(2, 3))
# [[1 2 3]
# [4 5 6]]
# 6 rows, 1 column
print(arr.reshape(6, 1))
# [[1]
# [2]
# [3]
# [4]
# [5]
# [6]]
# 1 row, 6 columns
print(arr.reshape(1, 6))
# [[1 2 3 4 5 6]]
Output:
[[1 2 3]
[4 5 6]]
[[1]
[2]
[3]
[4]
[5]
[6]]
[[1 2 3 4 5 6]]
The total number of elements must stay the same. Reshaping 6 elements into a 4×2 array raises a ValueError because 4 × 2 = 8 ≠ 6.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
try:
    arr.reshape(4, 2)  # 4 * 2 = 8 elements needed
except ValueError as e:
    print(f"Error: {e}")
    # Error: cannot reshape array of size 6 into shape (4,2)
Output:
Error: cannot reshape array of size 6 into shape (4,2)
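If you would rather check the element count up front than catch the exception, the rule can be tested directly with arr.size. The sketch below uses a hypothetical helper named safe_reshape; it is not part of NumPy, only an illustration of the size check:

```python
import numpy as np

def safe_reshape(arr, rows, cols):
    """Hypothetical helper: reshape only when the element counts match."""
    if arr.size != rows * cols:
        return None  # shape is incompatible with the number of elements
    return arr.reshape(rows, cols)

arr = np.array([1, 2, 3, 4, 5, 6])

print(safe_reshape(arr, 4, 2))        # None
print(safe_reshape(arr, 2, 3).shape)  # (2, 3)
```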
The -1 Shortcut for Automatic Dimension
Use -1 to let NumPy calculate one dimension automatically:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
# "Give me 4 columns, calculate rows automatically"
result = arr.reshape(-1, 4)
print(result.shape) # (3, 4)
print(result)
# [[ 1 2 3 4]
# [ 5 6 7 8]
# [ 9 10 11 12]]
# "Give me 3 rows, calculate columns automatically"
result = arr.reshape(3, -1)
print(result.shape) # (3, 4)
Output:
(3, 4)
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
(3, 4)
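The inferred dimension is simply the total size divided by the product of the known dimensions, so -1 only works when that division is exact, and only one dimension can be left unknown. A short sketch of both the arithmetic and the two failure cases (the failures list is just for illustration):

```python
import numpy as np

arr = np.arange(1, 13)  # 12 elements: 1..12

cols = 4
rows = arr.size // cols  # 12 // 4 = 3, the value NumPy infers for -1
inferred = arr.reshape(-1, cols)
print(inferred.shape == (rows, cols))  # True

failures = []
# (-1, 5): 12 is not divisible by 5; (-1, -1): two unknown dimensions
for shape in [(-1, 5), (-1, -1)]:
    try:
        arr.reshape(shape)
    except ValueError as e:
        failures.append(shape)
        print(f"{shape} failed: {e}")

print(len(failures))  # 2
```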
Common ML Pattern
Machine learning models often require specific input shapes:
import numpy as np
# Feature data as 1D array
features = np.array([1.5, 2.3, 3.1, 4.7, 5.2, 6.8])
# Reshape for model: each sample as a row
X = features.reshape(-1, 1) # 6 samples, 1 feature each
print(X.shape) # (6, 1)
print(X)
# [[1.5]
# [2.3]
# [3.1]
# [4.7]
# [5.2]
# [6.8]]
# Or: each sample has multiple features
X = features.reshape(-1, 2) # 3 samples, 2 features each
print(X.shape) # (3, 2)
Output:
(6, 1)
[[1.5]
[2.3]
[3.1]
[4.7]
[5.2]
[6.8]]
(3, 2)
scikit-learn estimators expect 2D input of shape (n_samples, n_features), so array.reshape(-1, 1) is the usual way to turn a single 1D feature into that format.
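The mirror-image case is a 1D array that holds one sample with several features, which needs reshape(1, -1) instead. A small NumPy-only sketch of the two layouts (the variable names are illustrative, and no scikit-learn call is made here):

```python
import numpy as np

values = np.array([1.5, 2.3, 3.1])

one_feature = values.reshape(-1, 1)  # 3 samples, 1 feature each
one_sample = values.reshape(1, -1)   # 1 sample, 3 features

print(one_feature.shape)  # (3, 1)
print(one_sample.shape)   # (1, 3)
```

Which one is right depends on what the numbers mean: a column of measurements of one variable, or one observation described by several variables.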
Using np.newaxis for Adding Dimensions
np.newaxis inserts a new axis without restructuring the data:
import numpy as np
arr = np.array([1, 2, 3])
print(f"Original shape: {arr.shape}") # (3,)
# Convert to column vector
col_vector = arr[:, np.newaxis]
print(f"Column vector shape: {col_vector.shape}") # (3, 1)
print(col_vector)
# [[1]
# [2]
# [3]]
# Convert to row vector
row_vector = arr[np.newaxis, :]
print(f"Row vector shape: {row_vector.shape}") # (1, 3)
print(row_vector) # [[1 2 3]]
Output:
Original shape: (3,)
Column vector shape: (3, 1)
[[1]
[2]
[3]]
Row vector shape: (1, 3)
[[1 2 3]]
Equivalent Using None
np.newaxis is an alias for None, so the two are interchangeable when indexing:
import numpy as np
arr = np.array([1, 2, 3])
col = arr[:, None] # Same as arr[:, np.newaxis]
row = arr[None, :] # Same as arr[np.newaxis, :]
print(col.shape) # (3, 1)
print(row.shape) # (1, 3)
Output:
(3, 1)
(1, 3)
Using reshape(-1, 1) vs [:, np.newaxis]
For column vectors, both produce the same result:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
# Method 1: reshape
col1 = arr.reshape(-1, 1)
# Method 2: newaxis
col2 = arr[:, np.newaxis]
print(np.array_equal(col1, col2)) # True
print(col1.shape, col2.shape) # (5, 1) (5, 1)
Output:
True
(5, 1) (5, 1)
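Both approaches also avoid copying: for a plain 1D array like this one, each returns a view that shares memory with arr. The sketch below checks that with np.shares_memory and includes np.expand_dims as a third, more explicit spelling of the same operation:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

col1 = arr.reshape(-1, 1)
col2 = arr[:, np.newaxis]
col3 = np.expand_dims(arr, axis=1)  # insert a new axis at position 1

print(col3.shape)                   # (5, 1)
print(np.shares_memory(arr, col1))  # True
print(np.shares_memory(arr, col2))  # True
print(np.shares_memory(arr, col3))  # True
```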
Views vs Copies
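reshape() returns a view of the original data whenever the memory layout allows it, so the two arrays share storage. It falls back to a copy only when the requested shape cannot be expressed as a view, which can happen with non-contiguous arrays: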
import numpy as np
original = np.array([1, 2, 3, 4, 5, 6])
reshaped = original.reshape(2, 3)
# Modifying reshaped affects original
reshaped[0, 0] = 99
print(original) # [99 2 3 4 5 6]
print(reshaped)
# [[99 2 3]
# [ 4 5 6]]
Output:
[99 2 3 4 5 6]
[[99 2 3]
[ 4 5 6]]
Creating an Independent Copy
import numpy as np
original = np.array([1, 2, 3, 4, 5, 6])
# Use .copy() for independent array
reshaped = original.reshape(2, 3).copy()
reshaped[0, 0] = 99
print(original) # [1 2 3 4 5 6] (unchanged)
print(reshaped[0, 0]) # 99
Output:
[1 2 3 4 5 6]
99
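Whether a given reshape produced a view or a copy can be checked with np.shares_memory. The sketch below shows one case of each; the transposed array is used only as a convenient example of a non-contiguous source:

```python
import numpy as np

base = np.arange(6).reshape(2, 3)
transposed = base.T                  # non-contiguous view, shape (3, 2)

# Flattening the transposed array cannot be expressed as a view,
# so NumPy has to copy the data.
flat = transposed.reshape(-1)
print(flat)                          # [0 3 1 4 2 5]
print(np.shares_memory(base, flat))  # False

# Reshaping the contiguous array itself returns a view.
view = base.reshape(3, 2)
print(np.shares_memory(base, view))  # True
```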
Row-Major vs Column-Major Order
The order parameter controls how elements fill the new shape:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
# C order (row-major, default): fills row by row
c_order = arr.reshape(2, 3, order='C')
print("C order (row-major):")
print(c_order)
# [[1 2 3]
# [4 5 6]]
# Fortran order (column-major): fills column by column
f_order = arr.reshape(2, 3, order='F')
print("F order (column-major):")
print(f_order)
# [[1 3 5]
# [2 4 6]]
Output:
C order (row-major):
[[1 2 3]
[4 5 6]]
F order (column-major):
[[1 3 5]
[2 4 6]]
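One way to build intuition for column-major filling: reshaping to (2, 3) in F order gives the same values as reshaping to (3, 2) in the default C order and then transposing. A quick check:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

f_order = arr.reshape(2, 3, order='F')
via_transpose = arr.reshape(3, 2).T  # C-order reshape, then transpose

print(np.array_equal(f_order, via_transpose))  # True

# Reading the result back in column-major order recovers the input
print(f_order.ravel(order='F'))  # [1 2 3 4 5 6]
```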
Practical Examples
Preparing Data for Scikit-learn
import numpy as np
# Single feature needs reshaping
temperatures = np.array([20.1, 22.4, 19.8, 25.3, 23.7])
# sklearn expects 2D: (n_samples, n_features)
X = temperatures.reshape(-1, 1)
print(f"Shape for sklearn: {X.shape}") # (5, 1)
Output:
Shape for sklearn: (5, 1)
Image Data Reshaping
import numpy as np
# Flattened image data (e.g., 28x28 pixels)
flat_image = np.arange(784) # 28 * 28 = 784
# Reshape to image dimensions
image_2d = flat_image.reshape(28, 28)
print(f"Image shape: {image_2d.shape}") # (28, 28)
# Or reshape batch of images
batch_flat = np.arange(784 * 100) # 100 images
batch = batch_flat.reshape(100, 28, 28)
print(f"Batch shape: {batch.shape}") # (100, 28, 28)
Output:
Image shape: (28, 28)
Batch shape: (100, 28, 28)
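When the number of images is not known in advance, -1 can stand in for the batch dimension, and the same trick flattens each image again. A sketch using the same synthetic 784-pixel data as above (the names second and flat_again are illustrative):

```python
import numpy as np

batch_flat = np.arange(784 * 100)  # 100 flattened 28x28 images

# Let NumPy infer the number of images
batch = batch_flat.reshape(-1, 28, 28)
print(batch.shape)  # (100, 28, 28)

# Image 1 is exactly the second run of 784 values
second = batch_flat[784:2 * 784].reshape(28, 28)
print(np.array_equal(batch[1], second))  # True

# Flatten each image back into one row per image
flat_again = batch.reshape(len(batch), -1)
print(flat_again.shape)  # (100, 784)
```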
Time Series Windows
import numpy as np
# Create sliding windows from 1D time series
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
window_size = 3
# Build overlapping windows by slicing (simple approach)
windows = np.array([data[i:i+window_size]
                    for i in range(len(data) - window_size + 1)])
print(windows)
Output:
[[1 2 3]
[2 3 4]
[3 4 5]
[4 5 6]
[5 6 7]
[6 7 8]
[7 8 9]]
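Two alternatives are worth knowing. Assuming NumPy 1.20 or newer, sliding_window_view from numpy.lib.stride_tricks builds the same overlapping windows as a read-only view without a Python loop. If non-overlapping windows are enough, a plain reshape does the job, provided the length is a multiple of the window size. A sketch of both:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
window_size = 3

# Overlapping windows, one per starting position
overlapping = sliding_window_view(data, window_size)
print(overlapping.shape)  # (7, 3)

# Non-overlapping windows: len(data) must be divisible by window_size
blocks = data.reshape(-1, window_size)
print(blocks)
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]
```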
Method Comparison
| Goal | Method | Result Shape |
|---|---|---|
| Specific dimensions | .reshape(rows, cols) | (rows, cols) |
| Auto-calculate rows | .reshape(-1, cols) | (auto, cols) |
| Auto-calculate columns | .reshape(rows, -1) | (rows, auto) |
| Column vector | [:, np.newaxis] | (n, 1) |
| Row vector | [np.newaxis, :] | (1, n) |
Summary
- Use .reshape(rows, cols) when you know the exact dimensions.
- Use .reshape(-1, n) to let NumPy calculate one dimension automatically; this is especially useful in ML pipelines.
- Use np.newaxis when you simply need to add a dimension without restructuring the data, particularly for converting 1D arrays to column or row vectors.