How to Choose Between Python Lists and Arrays
Python's built-in list is versatile but not optimized for numerical work. Understanding when to use lists, array.array, or NumPy arrays helps you write more efficient code.
Quick Comparison
| Feature | List | array.array | NumPy ndarray |
|---|---|---|---|
| Element Types | Mixed | Single type | Single type |
| Memory Usage | High | Low | Low |
| Math Operations | ❌ Manual loops | ❌ Manual loops | ✅ Vectorized |
| Speed | Slow for numerics | Moderate | Fast |
| Import Required | No | import array (standard library) | import numpy (third-party package) |
Python Lists: The Flexible Default
Lists store any combination of types and work for general-purpose programming:
# Mixed types allowed
data = [1, "hello", 3.14, None]
# Easy manipulation
data.append(True)
data.remove("hello")
# But math requires loops
numbers = [1, 2, 3, 4]
doubled = [x * 2 for x in numbers] # [2, 4, 6, 8]
print(numbers)
print(doubled)
Multiplying a list repeats it rather than performing element-wise math:
[1, 2, 3] * 2 # [1, 2, 3, 1, 2, 3]
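The same applies to `+`, which concatenates lists instead of adding them. A minimal sketch of the usual workaround, pairing elements with `zip()`; the variable names here are illustrative and not part of the original example:

```python
# Illustrative data; the names a and b are assumptions for this sketch
a = [1, 2, 3]
b = [10, 20, 30]

# The + operator concatenates lists rather than adding element-wise
concatenated = a + b

# Element-wise addition needs an explicit pairing, for example with zip()
summed = [x + y for x, y in zip(a, b)]

print(concatenated)  # [1, 2, 3, 10, 20, 30]
print(summed)        # [11, 22, 33]
```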
array.array: Compact Storage
The array module provides memory-efficient storage for homogeneous numeric data:
import array
# 'i' = signed integer, 'f' = float, 'd' = double
integers = array.array('i', [1, 2, 3, 4, 5])
floats = array.array('d', [1.1, 2.2, 3.3])
# Same list-like operations
integers.append(6)
integers.pop()
# But still no vectorized math
# doubled = integers * 2 # Repeats array, not element-wise!
Common type codes:
| Code | Type | Size |
|---|---|---|
| 'b' | signed char | 1 byte |
| 'i' | signed int | 2-4 bytes |
| 'f' | float | 4 bytes |
| 'd' | double | 8 bytes |
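Because some sizes depend on the platform, it can help to check them at runtime. The sketch below prints the `itemsize` for each code in the table and shows, as an illustration, that an array rejects values that do not fit its type code. The flag names are hypothetical and only there to make the outcome visible:

```python
import array

# itemsize reports the actual element size on the current platform
for code in ("b", "i", "f", "d"):
    print(code, array.array(code).itemsize)

ints = array.array("i", [1, 2, 3])

# A value of the wrong type is rejected instead of being stored
try:
    ints.append(2.5)
    type_error_raised = False
except TypeError:
    type_error_raised = True

# A value outside the range of the type code is rejected as well
try:
    array.array("b", [200])  # signed char holds -128..127
    overflow_raised = False
except OverflowError:
    overflow_raised = True

print(type_error_raised, overflow_raised)
```

This strictness is the trade-off for compact storage: every element must fit the declared C type.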
NumPy Arrays: Numerical Powerhouse
NumPy is the standard for numerical computing, offering vectorized operations and optimized performance:
import numpy as np
arr = np.array([1, 2, 3, 4])
# Vectorized math - no loops needed
print(arr * 2) # [2 4 6 8]
print(arr + 10) # [11 12 13 14]
print(arr ** 2) # [1 4 9 16]
# Element-wise operations between arrays
other = np.array([10, 20, 30, 40])
print(arr + other) # [11 22 33 44]
# Built-in mathematical functions
print(np.sum(arr)) # 10
print(np.mean(arr)) # 2.5
print(np.std(arr)) # 1.118...
Output:
[2 4 6 8]
[11 12 13 14]
[ 1  4  9 16]
[11 22 33 44]
10
2.5
1.118033988749895
NumPy operations run in compiled, optimized C code instead of the Python interpreter loop, which often makes them one to two orders of magnitude faster than Python loops on large arrays:
import numpy as np
# Fast: vectorized operation
arr = np.arange(1_000_000)
fast_result = arr * 2
# Slow: Python loop
lst = list(range(1_000_000))
slow_result = [x * 2 for x in lst]
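To see the difference on your own machine, the two approaches can be timed with the standard-library `timeit` module. This is a rough sketch rather than a rigorous benchmark; the repeat count of 10 is an arbitrary assumption, and actual timings vary with hardware and NumPy version:

```python
import timeit
import numpy as np

n = 1_000_000
arr = np.arange(n)
lst = list(range(n))

# Time each approach; number=10 runs the statement ten times in total
numpy_time = timeit.timeit(lambda: arr * 2, number=10)
loop_time = timeit.timeit(lambda: [x * 2 for x in lst], number=10)

print(f"NumPy:              {numpy_time:.4f} s")
print(f"List comprehension: {loop_time:.4f} s")

# Both approaches compute the same values
same = (arr * 2).tolist() == [x * 2 for x in lst]
print(same)
```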
Memory Comparison
import sys
import array
import numpy as np
n = 1000
py_list = list(range(n))
arr = array.array('i', range(n))
np_arr = np.arange(n, dtype=np.int32)
print(sys.getsizeof(py_list)) # ~8056 bytes on 64-bit CPython (list object and pointers only, not the int objects)
print(arr.buffer_info()[1] * arr.itemsize) # ~4000 bytes (assumes a 4-byte 'i')
print(np_arr.nbytes) # 4000 bytes
Output:
8056
4000
4000
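Note that `sys.getsizeof` on a list counts only the list object and its array of pointers, so the figure above understates the list's real footprint. A rough sketch that also adds the size of the referenced int objects; this is CPython-specific, and because small ints are cached and shared it overstates the unique cost somewhat:

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int32)

# Size of the list object itself (header plus one pointer per element)
container_bytes = sys.getsizeof(py_list)

# Approximate size of the int objects the pointers refer to
element_bytes = sum(sys.getsizeof(x) for x in py_list)

print(container_bytes + element_bytes)  # list plus its elements
print(np_arr.nbytes)                    # raw NumPy buffer: 4000 bytes
```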
Decision Guide
| Use Case | Best Choice |
|---|---|
| General programming | List |
| Mixed data types | List |
| Memory-constrained, no NumPy available | array.array |
| Numerical computation | NumPy |
| Data science / Machine learning | NumPy |
| Large datasets | NumPy |
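The choice is not permanent, since data can be converted between the three containers. A minimal sketch, assuming float data stored with the 'd' type code (which matches NumPy's `float64`); the variable names are illustrative:

```python
import array
import numpy as np

values = [1.5, 2.5, 3.5]            # plain list

compact = array.array("d", values)  # list -> array.array
np_copy = np.array(values)          # list -> NumPy (copies the data)

# array.array supports the buffer protocol, so NumPy can wrap it without copying
np_view = np.frombuffer(compact, dtype=np.float64)

back_to_list = np_view.tolist()     # NumPy -> list

print(np_view * 2)
print(back_to_list)
```

Since `np_view` shares memory with `compact`, use `np.array(compact)` instead if an independent copy is needed.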
Summary
- List: Default choice for everyday Python code
- array.array: Lightweight option when you need compact numeric storage without external dependencies
- NumPy: The standard choice for mathematical and scientific computing on arrays
For most numerical work, NumPy is the clear winner: its vectorized operations and ecosystem integration make it indispensable for data processing.