
How to Declare an Array in Python

Python offers several ways to work with array-like data structures, each designed for different use cases. Unlike C or Java, where an array has a single fixed element type, Python provides built-in lists, the standard-library array module, and third-party NumPy arrays, each with distinct trade-offs in flexibility, memory usage, and performance.

In this guide, you will learn how to declare and use each type of array in Python, understand when to choose one over another, and avoid common pitfalls that can lead to bugs or performance problems.

Python Lists

Lists are the default and most versatile choice for ordered collections in Python. They are dynamic, support mixed types, and require no imports:

# Empty list
arr = []

# List with initial values
arr = [1, 2, 3, 4, 5]

# Mixed types (allowed but often discouraged in practice)
mixed = [1, "hello", 3.14, True]

# List from range
numbers = list(range(10))
print(numbers)

Output:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Common List Operations

arr = [1, 2, 3]

arr.append(4) # Add to end: [1, 2, 3, 4]
arr.insert(0, 0) # Insert at index: [0, 1, 2, 3, 4]
arr.extend([5, 6]) # Add multiple: [0, 1, 2, 3, 4, 5, 6]
arr.pop() # Remove last: [0, 1, 2, 3, 4, 5]
arr[0] = 10 # Modify by index: [10, 1, 2, 3, 4, 5]

Creating Lists with List Comprehensions

List comprehensions offer a concise way to create lists with transformations or filtering:

# Squares of numbers
squares = [x ** 2 for x in range(10)]
print(squares) # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# Filtered values
evens = [x for x in range(20) if x % 2 == 0]
print(evens) # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
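
A comprehension is shorthand for a loop that appends to a list. As a minimal sketch, the filtered example above can be written out in long form like this (the name evens_loop is introduced here only for illustration):

```python
# Long-form equivalent of: [x for x in range(20) if x % 2 == 0]
evens_loop = []
for x in range(20):
    if x % 2 == 0:            # corresponds to the "if" clause of the comprehension
        evens_loop.append(x)  # corresponds to the leading expression "x"

evens = [x for x in range(20) if x % 2 == 0]
print(evens_loop == evens)  # True
```

The comprehension is usually preferred when the body is a single expression; a plain loop tends to read better once the logic needs several statements.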

Creating 2D Lists (Nested Lists)

# 2D list using list comprehension
matrix = [[i * 3 + j for j in range(3)] for i in range(3)]

for row in matrix:
    print(row)

Output:

[0, 1, 2]
[3, 4, 5]
[6, 7, 8]

A Common Mistake: 2D Lists with the Multiplication Operator

A frequent source of bugs is creating 2D lists using the * operator:

# Wrong: all rows share the same inner list object
bad_matrix = [[0] * 3] * 3
bad_matrix[0][0] = 99
print(bad_matrix)
# [[99, 0, 0], [99, 0, 0], [99, 0, 0]]

Output:

[[99, 0, 0], [99, 0, 0], [99, 0, 0]]

Setting bad_matrix[0][0] changed all three rows because they are references to the same list object. Use a comprehension instead:

# Correct: each row is an independent list
good_matrix = [[0] * 3 for _ in range(3)]
good_matrix[0][0] = 99
print(good_matrix)
# [[99, 0, 0], [0, 0, 0], [0, 0, 0]]

Warning: The [[value] * cols] * rows pattern builds one inner list and then repeats a reference to it rows times, so every row is the same object. Use [[value] * cols for _ in range(rows)] to create independent rows.
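
One way to confirm the aliasing is to compare object identity rather than contents. The sketch below uses illustrative names (shared, independent) that do not appear in the examples above:

```python
rows, cols = 3, 2

shared = [[0] * cols] * rows                     # one inner list, referenced three times
independent = [[0] * cols for _ in range(rows)]  # three separate inner lists

# "is" compares object identity, not contents
print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False

# Counting distinct ids shows how many inner lists actually exist
print(len({id(row) for row in shared}))       # 1
print(len({id(row) for row in independent}))  # 3
```

Both lists compare equal with == before any mutation, which is why the bug tends to go unnoticed until a single cell is assigned.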

The array Module

For memory-efficient storage of homogeneous numeric data, Python's built-in array module stores raw values rather than full Python objects:

import array

# Type codes: 'i' = int, 'f' = float, 'd' = double
int_array = array.array("i", [1, 2, 3, 4, 5])
float_array = array.array("f", [1.0, 2.5, 3.7])

print(int_array) # array('i', [1, 2, 3, 4, 5])
print(float_array) # array('f', [1.0, 2.5, 3.700000047683716])

Operations like append and extend work similarly to lists, but type enforcement is strict:

import array

int_array = array.array("i", [1, 2, 3])

int_array.append(4)
print(int_array)

# Attempting to add a string raises TypeError
try:
    int_array.append("hello")
except TypeError as e:
    print(f"Error: {e}")

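Output (the exact wording of the error message depends on the Python version; the message below is from older releases):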

array('i', [1, 2, 3, 4])
Error: an integer is required (got type str)
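
Type enforcement covers the value range as well as the Python type. As a small sketch, assuming the 'b' type code (a signed 8-bit value), an out-of-range integer raises OverflowError and a float raises TypeError instead of being truncated. The exact message text is not shown because it can differ between Python versions:

```python
import array

small = array.array("b", [1, 2, 3])  # 'b' holds signed 8-bit values (-128..127)

try:
    small.append(200)                # too large for a signed char
except OverflowError as e:
    print(f"OverflowError: {e}")

try:
    small.append(1.5)                # floats are rejected, not silently truncated
except TypeError as e:
    print(f"TypeError: {e}")

print(small)  # array('b', [1, 2, 3])
```

Neither failed append changes the array, so the original three values are still intact afterwards.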

Common Type Codes

Code   C Type        Python Type   Size (bytes)
'b'    signed char   int           1
'i'    signed int    int           2-4
'l'    signed long   int           4-8
'f'    float         float         4
'd'    double        float         8

Tip: The array module uses significantly less memory than lists for large numeric datasets because it stores raw typed values instead of Python objects. On a 64-bit CPython build, a list stores an 8-byte pointer per element, and each pointer refers to a separate int object of about 28 bytes, while array.array('i') typically stores each integer in just 4 bytes.
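
Because several of these sizes depend on the platform, it can help to query them instead of assuming. The sketch below relies on the itemsize attribute of array objects; the variable names are illustrative:

```python
import array

# Report the per-element size of each type code on this machine
for code in ("b", "i", "l", "f", "d"):
    a = array.array(code)
    print(f"{code!r}: {a.itemsize} byte(s) per element")

# Raw payload of an array = element size * number of elements
values = array.array("i", range(1000))
payload = values.itemsize * len(values)
print(f"Raw payload for 1000 'i' values: {payload} bytes")
```

The 'l' code is the one most likely to differ: it is commonly 8 bytes on 64-bit Linux and macOS and 4 bytes on Windows.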

NumPy Arrays

NumPy is the de facto standard third-party library for scientific computing and data analysis in Python. It provides multidimensional arrays with vectorized operations that are typically much faster than equivalent list-based loops. It is not part of the standard library, so install it first from the shell:

pip install numpy

Then import it and create arrays:

import numpy as np

# Create from list
arr = np.array([1, 2, 3, 4, 5])
print(arr)

# Create with specific dtype
floats = np.array([1, 2, 3], dtype=np.float64)
print(floats)

Output:

[1 2 3 4 5]
[1. 2. 3.]

Array Initialization Functions

NumPy provides several convenience functions for creating arrays:

import numpy as np

zeros = np.zeros(5)
print(f"zeros: {zeros}")

ones = np.ones(5)
print(f"ones: {ones}")

full = np.full(5, 7)
print(f"full: {full}")

range_arr = np.arange(0, 10, 2)
print(f"arange: {range_arr}")

linspace = np.linspace(0, 1, 5)
print(f"linspace: {linspace}")

Output:

zeros: [0. 0. 0. 0. 0.]
ones: [1. 1. 1. 1. 1.]
full: [7 7 7 7 7]
arange: [0 2 4 6 8]
linspace: [0.   0.25 0.5  0.75 1.  ]
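
The output above mixes floats (0., 1.) and integers (7) because each function picks a default dtype. The following sketch shows how to inspect and override that default, and how arange and linspace treat the end point differently. Variable names are illustrative:

```python
import numpy as np

default_zeros = np.zeros(3)
int_zeros = np.zeros(3, dtype=np.int32)       # override the float default
float_full = np.full(3, 7, dtype=np.float64)  # override the inferred integer dtype

print(default_zeros.dtype)  # float64
print(int_zeros.dtype)      # int32
print(float_full.dtype)     # float64

# arange excludes the stop value; linspace includes it by default
stepped = np.arange(0, 1, 0.25)  # 4 values: 0, 0.25, 0.5, 0.75
spaced = np.linspace(0, 1, 5)    # 5 values: 0, 0.25, 0.5, 0.75, 1
print(len(stepped), len(spaced))  # 4 5
```

For non-integer steps, linspace is often the safer choice because the number of elements is stated explicitly rather than derived from floating-point arithmetic.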

Vectorized Operations

NumPy's greatest strength is performing element-wise operations without explicit loops:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr * 2)
print(arr ** 2)
print(np.sqrt(arr))
print(f"Sum: {np.sum(arr)}, Mean: {np.mean(arr)}")

Output:

[ 2  4  6  8 10]
[ 1  4  9 16 25]
[1.         1.41421356 1.73205081 2.         2.23606798]
Sum: 15, Mean: 3.0
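
To get a feel for the speed difference, the rough benchmark below squares 100,000 integers with a list comprehension and with a NumPy expression. The helper names square_list and square_numpy are made up for this sketch, and the timings depend entirely on the machine, so no expected numbers are shown:

```python
import timeit
import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)  # int64 so the squares cannot overflow

def square_list():
    # Pure-Python loop: one interpreter step per element
    return [x * x for x in py_list]

def square_numpy():
    # Vectorized: the loop runs inside NumPy's compiled code
    return np_arr * np_arr

# Both approaches must agree before the timings mean anything
assert square_list() == square_numpy().tolist()

list_time = timeit.timeit(square_list, number=20)
numpy_time = timeit.timeit(square_numpy, number=20)
print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy vectorized:   {numpy_time:.4f} s")
```

The gap usually widens as n grows; for very small arrays the overhead of calling into NumPy can outweigh the benefit.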

Multidimensional Arrays

import numpy as np

# 2D array (matrix)
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

print(f"Shape: {matrix.shape}")
print(f"Element at row 1, col 2: {matrix[1, 2]}")

# Initialize 2D arrays
zeros_2d = np.zeros((3, 4))
identity = np.eye(3)

print(f"\n3x4 zeros:\n{zeros_2d}")
print(f"\n3x3 identity:\n{identity}")

Output:

Shape: (3, 3)
Element at row 1, col 2: 6

3x4 zeros:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

3x3 identity:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
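
Beyond single-element access, the same comma-separated index syntax selects whole rows, columns, and sub-blocks. A short sketch that rebuilds the 3x3 matrix from above with arange and reshape:

```python
import numpy as np

matrix = np.arange(1, 10).reshape(3, 3)  # same values as the 3x3 matrix above

first_row = matrix[0]          # [1 2 3]
second_col = matrix[:, 1]      # [2 5 8]
top_left = matrix[:2, :2]      # 2x2 block: rows 0-1, columns 0-1
col_sums = matrix.sum(axis=0)  # [12 15 18]

print(first_row)
print(second_col)
print(top_left.shape)  # (2, 2)
print(col_sums)
```

Selecting a column this way has no direct equivalent on a nested list, where a comprehension such as [row[1] for row in matrix] would be needed.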

Pre-Allocating Fixed-Size Arrays

When you know the size of your array in advance, pre-allocating avoids repeated resizing:

# List with fixed size
fixed_list = [0] * 100
print(f"Fixed list length: {len(fixed_list)}")

# NumPy pre-allocation
import numpy as np

fixed_np = np.zeros(1000)
print(f"NumPy array length: {len(fixed_np)}")

Output:

Fixed list length: 100
NumPy array length: 1000
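
Once a container has been pre-allocated, it is filled by index assignment rather than by append. A minimal sketch with illustrative names, using a small n so the result is easy to read:

```python
import numpy as np

n = 5

# Pre-allocated list: assign by index instead of calling append
squares_list = [0] * n
for i in range(n):
    squares_list[i] = i * i

# Pre-allocated NumPy array: same idea, with a fixed dtype
squares_np = np.zeros(n, dtype=np.int64)
for i in range(n):
    squares_np[i] = i * i

print(squares_list)  # [0, 1, 4, 9, 16]
print(squares_np)    # [ 0  1  4  9 16]
```

Using [0] * n is safe here because integers are immutable; the shared-reference problem only appears when the repeated element is itself a mutable object such as a list.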

Memory Comparison

The following script compares the memory used by the three array types:

import sys
import array
import numpy as np

n = 1000

py_list = list(range(n))
arr_module = array.array("i", range(n))
np_array = np.arange(n, dtype=np.int32)

print(f"List: {sys.getsizeof(py_list):>6} bytes")
print(f"array.array: {sys.getsizeof(arr_module):>6} bytes")
print(f"NumPy: {np_array.nbytes:>6} bytes")

Example output:

List:   8056 bytes
array.array:   4200 bytes
NumPy:   4000 bytes

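The list figure is an underestimate: sys.getsizeof(py_list) counts only the list object and its array of 8-byte pointers, not the int objects those pointers refer to. Even so, it is roughly twice the size of the other two, because array.array and NumPy store raw 4-byte values contiguously with no per-element objects.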
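
Calling sys.getsizeof on a list does not follow the pointers to the elements. A rough way to estimate the full footprint is to add the sizes of the element objects, as sketched below. This is an approximation: CPython caches small integers, so some of the counted objects are shared with the rest of the interpreter.

```python
import sys

n = 1000
py_list = list(range(n))

container = sys.getsizeof(py_list)                 # list object plus its pointer array
elements = sum(sys.getsizeof(x) for x in py_list)  # the int objects themselves
total = container + elements

print(f"Container only: {container:>6} bytes")
print(f"Element objects: {elements:>6} bytes")
print(f"Approx. total:  {total:>6} bytes")
```

On a typical 64-bit CPython build the element objects account for several times more memory than the container, which is where most of the saving from array.array and NumPy comes from.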

Comparison Summary

Type            Import Required   Data Types                  Speed      Memory   Best For
list            None              Mixed                       Moderate   High     General-purpose programming
array.array     array (stdlib)    Numeric only                Moderate   Low      Memory-constrained, no external deps
numpy.ndarray   numpy             Homogeneous, mostly numeric Fast       Low      Math, science, data analysis

Note: Use lists for general-purpose programming and mixed data types. Use NumPy for numerical computing, data science, and performance-critical math operations. Use array.array only when you need memory efficiency for numeric data without adding external dependencies.

Conclusion

For most Python code, lists are the appropriate and simplest choice. They are built into the language, require no imports, and handle a wide variety of use cases.

  • When working with large numeric datasets, switch to NumPy arrays for vectorized operations, lower memory usage, and dramatically faster computation.
  • The array module fills a narrow niche where you need compact numeric storage without installing third-party packages.
  • Choose the array type that matches your data, your performance requirements, and the libraries your project already depends on.