Python NumPy: How to Change a NumPy Array's Data Type
Converting between data types is essential for memory optimization, mathematical operations, and compatibility with other libraries. NumPy's astype() method converts every element to a new type while preserving the array's shape.
Using astype()
The astype() method creates a new array with the specified data type:
import numpy as np
arr = np.array([1, 2, 3])
print(f"Original type: {arr.dtype}") # int64 (int32 on Windows with NumPy < 2.0)
# Convert to float
float_arr = arr.astype('float64')
print(f"New type: {float_arr.dtype}") # float64
print(float_arr) # [1. 2. 3.]
Output:
Original type: int64
New type: float64
[1. 2. 3.]
Multiple Ways to Specify Types
import numpy as np
arr = np.array([1, 2, 3])
# Using string
arr.astype('float64')
# Using NumPy type
arr.astype(np.float64)
# Using Python type
arr.astype(float)
# Using shorthand
arr.astype('f8') # 'f8' = float64 (8 bytes)
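The four spellings above are interchangeable because NumPy normalizes each of them to the same dtype object. A minimal sketch that checks this with np.dtype(); the names `specs`, `dtypes`, and `all_same` are illustrative only:

```python
import numpy as np

# Four ways of spelling the same type
specs = ['float64', np.float64, float, 'f8']

# np.dtype() normalizes each spelling to a dtype object
dtypes = [np.dtype(spec) for spec in specs]
for spec, dt in zip(specs, dtypes):
    print(f"{spec!r} -> {dt}")

# Every spelling should compare equal to float64
all_same = all(dt == np.dtype('float64') for dt in dtypes)
print(all_same)  # True
```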
Common Data Types
| Type | String | Shorthand | Bytes | Range/Precision |
|---|---|---|---|---|
| Integer 8-bit | 'int8' | 'i1' | 1 | -128 to 127 |
| Integer 16-bit | 'int16' | 'i2' | 2 | -32,768 to 32,767 |
| Integer 32-bit | 'int32' | 'i4' | 4 | ±2.1 billion |
| Integer 64-bit | 'int64' | 'i8' | 8 | ±9.2 quintillion |
| Unsigned 8-bit | 'uint8' | 'u1' | 1 | 0 to 255 |
| Float 16-bit | 'float16' | 'f2' | 2 | ~3 decimal digits |
| Float 32-bit | 'float32' | 'f4' | 4 | ~7 decimal digits |
| Float 64-bit | 'float64' | 'f8' | 8 | ~15 decimal digits |
| Boolean | 'bool' | '?' | 1 | True/False |
| Complex | 'complex128' | 'c16' | 16 | Complex numbers |
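The limits in the table do not need to be memorized: np.iinfo() and np.finfo() report them at runtime. The sketch below collects them into two dictionaries (`int_ranges` and `float_digits` are illustrative names). Note that finfo's `precision` attribute is a conservative count of decimal digits, so it reports 6 for float32 where the table gives the usual approximation of 7.

```python
import numpy as np

# Exact minimum and maximum for integer types
int_ranges = {}
for t in (np.int8, np.int16, np.int32, np.int64, np.uint8):
    info = np.iinfo(t)
    name = np.dtype(t).name
    int_ranges[name] = (int(info.min), int(info.max))
    print(f"{name:>7}: {info.min} to {info.max}")

# Guaranteed decimal digits and largest finite value for float types
float_digits = {}
for t in (np.float16, np.float32, np.float64):
    info = np.finfo(t)
    name = np.dtype(t).name
    float_digits[name] = int(info.precision)
    print(f"{name:>7}: {info.precision} digits, max {float(info.max):.3e}")
```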
Type Conversion Examples
Integer to Float
import numpy as np
int_arr = np.array([1, 2, 3, 4, 5])
float_arr = int_arr.astype(float)
print(float_arr) # [1. 2. 3. 4. 5.]
print(float_arr.dtype) # float64
Output:
[1. 2. 3. 4. 5.]
float64
Float to Integer (Truncation Warning)
import numpy as np
float_arr = np.array([1.9, 2.1, 3.5, -1.7])
# astype truncates toward zero - does NOT round!
int_arr = float_arr.astype(int)
print(int_arr) # [ 1  2  3 -1]
# To round first, use np.round() (ties go to the nearest even integer)
rounded = np.round(float_arr).astype(int)
print(rounded) # [ 2  2  4 -2]
Output:
[ 1  2  3 -1]
[ 2  2  4 -2]
Converting float to integer discards the fractional part (truncation toward zero) rather than rounding. Apply np.round(), np.floor(), or np.ceil() first if you need a specific rounding behavior.
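The four strategies differ most visibly on negative numbers and on exact halves. A small side-by-side sketch (the variable names are illustrative):

```python
import numpy as np

values = np.array([1.5, 2.5, -1.5, -2.5])

truncated = values.astype(int)           # drops the fraction (toward zero)
rounded = np.round(values).astype(int)   # halves go to the nearest even integer
floored = np.floor(values).astype(int)   # toward negative infinity
ceiled = np.ceil(values).astype(int)     # toward positive infinity

print(truncated.tolist())  # [1, 2, -1, -2]
print(rounded.tolist())    # [2, 2, -2, -2]
print(floored.tolist())    # [1, 2, -2, -3]
print(ceiled.tolist())     # [2, 3, -1, -2]
```

Note that np.round() sends 2.5 to 2, not 3, because ties are resolved toward the even neighbor.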
Numeric to Boolean
import numpy as np
arr = np.array([0, 1, 2, -1, 0, 5])
bool_arr = arr.astype(bool)
print(bool_arr) # [False  True  True  True False  True]
# 0 → False, any non-zero → True
Output:
[False  True  True  True False  True]
Boolean to Numeric
import numpy as np
bool_arr = np.array([True, False, True, True])
int_arr = bool_arr.astype(int)
print(int_arr) # [1 0 1 1]
float_arr = bool_arr.astype(float)
print(float_arr) # [1. 0. 1. 1.]
Output:
[1 0 1 1]
[1. 0. 1. 1.]
String Conversions
import numpy as np
# Numbers to strings
num_arr = np.array([1, 2, 3])
str_arr = num_arr.astype(str)
print(str_arr) # ['1' '2' '3']
print(str_arr.dtype) # <U21 (Unicode string)
# Strings to numbers
str_arr = np.array(['1.5', '2.7', '3.9'])
float_arr = str_arr.astype(float)
print(float_arr) # [1.5 2.7 3.9]
Output:
['1' '2' '3']
<U21
[1.5 2.7 3.9]
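String-to-number conversion only succeeds when every element parses. The sketch below shows the ValueError raised by a bad entry, followed by one possible fallback. `to_float_or_nan` is a hypothetical helper written for this example, not a NumPy function, and replacing bad entries with NaN is just one policy among several.

```python
import numpy as np

raw = np.array(['1.5', '2.7', 'n/a'])

# A single unparseable entry makes the whole conversion fail
try:
    parsed = raw.astype(float)
except ValueError as exc:
    parsed = None
    print(f"Conversion failed: {exc}")

def to_float_or_nan(strings):
    """Hypothetical helper: parse each string, using NaN where parsing fails."""
    out = np.full(len(strings), np.nan)
    for i, s in enumerate(strings):
        try:
            out[i] = float(s)
        except ValueError:
            pass  # leave NaN in place for this entry
    return out

cleaned = to_float_or_nan(raw)
print(cleaned)
```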
Memory Optimization
Smaller data types reduce memory usage for large arrays:
import numpy as np
# Default: 8 bytes per integer (values kept in 0-99 so they also fit in int8)
large_arr = np.arange(1_000_000) % 100
print(f"int64 size: {large_arr.nbytes / 1e6:.1f} MB") # 8.0 MB
# Downsize to 1 byte per integer
small_arr = large_arr.astype(np.int8)
print(f"int8 size: {small_arr.nbytes / 1e6:.1f} MB") # 1.0 MB
# For floats: float32 vs float64
float64_arr = np.random.rand(1_000_000)
print(f"float64: {float64_arr.nbytes / 1e6:.1f} MB") # 8.0 MB
float32_arr = float64_arr.astype(np.float32)
print(f"float32: {float32_arr.nbytes / 1e6:.1f} MB") # 4.0 MB
Output:
int64 size: 8.0 MB
int8 size: 1.0 MB
float64: 8.0 MB
float32: 4.0 MB
Use float32 instead of float64 for machine learning when full precision isn't required. Many GPU libraries prefer or require 32-bit floats.
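The memory saving comes at the cost of precision. One way to see it: float32 has a 24-bit significand, so it cannot distinguish 2**24 from 2**24 + 1, while float64 can. A minimal sketch with illustrative variable names:

```python
import numpy as np

# 2**24 and 2**24 + 1
big = np.array([16_777_216, 16_777_217], dtype=np.int64)

as32 = big.astype(np.float32)
as64 = big.astype(np.float64)

print(as32[0] == as32[1])  # True: both values collapse to 16777216.0
print(as64[0] == as64[1])  # False: float64 keeps them distinct
```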
Overflow Warning
Smaller types have limited ranges. Casting a value outside the range silently wraps it around (integer overflow):
import numpy as np
arr = np.array([100, 200, 300])
# int8 range: -128 to 127
int8_arr = arr.astype(np.int8)
print(int8_arr) # [100 -56  44] ← 200 and 300 wrap around
# uint8 range: 0 to 255
uint8_arr = arr.astype(np.uint8)
print(uint8_arr) # [100 200  44] ← 300 wraps around
# Check if values fit before converting
if arr.max() <= 127 and arr.min() >= -128:
    safe_arr = arr.astype(np.int8)
else:
    print("Values out of int8 range!")
Output:
[100 -56  44]
[100 200  44]
Values out of int8 range!
astype() does not raise or warn when integer values wrap around like this. Always verify your data range before downcasting to a smaller type.
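The range check can be generalized to any integer type by reading the limits from np.iinfo() instead of hard-coding them. `checked_astype` below is a hypothetical helper for illustration, not part of NumPy, and it only handles integer targets:

```python
import numpy as np

def checked_astype(arr, dtype):
    """Hypothetical helper: cast to an integer dtype, refusing if values would wrap."""
    info = np.iinfo(dtype)
    if arr.size and (arr.min() < info.min or arr.max() > info.max):
        raise OverflowError(
            f"values span {arr.min()}..{arr.max()}, outside "
            f"{np.dtype(dtype).name} range {info.min}..{info.max}"
        )
    return arr.astype(dtype)

arr = np.array([100, 200, 300])

fits = checked_astype(arr, np.int16)   # all values fit in int16
print(fits.dtype)                      # int16

try:
    checked_astype(arr, np.int8)       # 200 and 300 do not fit in int8
except OverflowError as exc:
    print(f"Refused: {exc}")
```

Raising an exception is one design choice; clipping with np.clip() or falling back to a wider type are reasonable alternatives depending on the data.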
Checking and Comparing Types
import numpy as np
arr = np.array([1.5, 2.5, 3.5])
# Check current type
print(arr.dtype) # float64
print(arr.dtype.name) # float64
print(arr.dtype.itemsize) # 8 (bytes per element)
# Check if specific type
print(arr.dtype == np.float64) # True
print(np.issubdtype(arr.dtype, np.floating)) # True
print(np.issubdtype(arr.dtype, np.integer)) # False
Output:
float64
float64
8
True
True
False
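np.issubdtype() is most useful when code has to branch on the family of a type rather than on one exact dtype. A small sketch; `describe` is a hypothetical helper name used only for this example:

```python
import numpy as np

def describe(arr):
    """Hypothetical helper: classify an array by its dtype family."""
    if np.issubdtype(arr.dtype, np.bool_):
        return "boolean"
    if np.issubdtype(arr.dtype, np.integer):   # signed and unsigned
        return "integer"
    if np.issubdtype(arr.dtype, np.floating):  # float16/32/64
        return "float"
    return "other"

print(describe(np.array([True, False])))          # boolean
print(describe(np.array([1, 2], dtype=np.uint8))) # integer
print(describe(np.array([1.5], dtype=np.float32)))# float
print(describe(np.array(['a', 'b'])))             # other
```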
Setting Type at Creation
Specify dtype when creating arrays to avoid conversion overhead:
import numpy as np
# Specify type at creation
arr = np.array([1, 2, 3], dtype=np.float32)
print(arr.dtype) # float32
# With zeros/ones/empty
zeros = np.zeros((3, 3), dtype=np.int8)
ones = np.ones((2, 2), dtype=np.float32)
# With arange
integers = np.arange(10, dtype=np.int16)
Output:
float32
Copy vs View Behavior
By default, astype() returns a copy, even when the type doesn't change:
import numpy as np
arr = np.array([1, 2, 3], dtype=np.int32)
# Same type still creates a copy
same_type = arr.astype(np.int32)
same_type[0] = 99
print(arr) # [1 2 3] (unchanged)
print(same_type) # [99  2  3]
# Pass copy=False to skip the copy when the dtype already matches
# (converting to a different dtype still allocates a new array)
no_copy = arr.astype(np.int32, copy=False)
Output:
[1 2 3]
[99  2  3]
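Whether a given call actually copied can be verified with np.shares_memory(). The sketch below compares the default behavior, copy=False with a matching dtype, and copy=False with a different dtype (variable names are illustrative):

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int32)

copied = arr.astype(np.int32)                   # default copy=True
reused = arr.astype(np.int32, copy=False)       # dtype already matches
converted = arr.astype(np.float64, copy=False)  # conversion needs new memory

print(np.shares_memory(arr, copied))     # False
print(np.shares_memory(arr, reused))     # True
print(np.shares_memory(arr, converted))  # False
```

Because `reused` shares memory with `arr`, writing to one changes the other, so copy=False is only appropriate when that aliasing is acceptable.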
Practical Examples
Preparing Data for Image Processing
import numpy as np
# Image pixel values (0-255) as float for processing
image_float = np.random.rand(100, 100) * 255
# Convert to uint8 for saving/display (fractions are truncated, so the max is 254)
image_uint8 = image_float.astype(np.uint8)
print(f"Type: {image_uint8.dtype}, Range: {image_uint8.min()}-{image_uint8.max()}")
Output:
Type: uint8, Range: 0-254
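When processed pixel values can drift outside 0-255 or should be rounded rather than truncated, a common pattern is to round and clip before casting. This is a sketch with made-up sample values, not a prescribed pipeline:

```python
import numpy as np

# Sample processed pixel values, some outside the valid range
pixels = np.array([-3.2, 0.4, 127.6, 254.6, 255.0, 300.0])

# Round to the nearest integer, clamp to 0-255, then cast
safe = np.clip(np.rint(pixels), 0, 255).astype(np.uint8)

print(safe.tolist())  # [0, 0, 128, 255, 255, 255]
```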
Optimizing ML Features
import numpy as np
# Original features
features = np.random.randn(10000, 100) # float64 default
print(f"Original: {features.nbytes / 1e6:.1f} MB")
# Convert to float32 for GPU compatibility
features_32 = features.astype(np.float32)
print(f"Optimized: {features_32.nbytes / 1e6:.1f} MB")
Output:
Original: 8.0 MB
Optimized: 4.0 MB
Converting Categorical Labels
import numpy as np
# String labels
labels = np.array(['cat', 'dog', 'cat', 'bird', 'dog'])
# Create numeric mapping
unique_labels = np.unique(labels)
label_to_int = {label: i for i, label in enumerate(unique_labels)}
# Convert to integers
numeric_labels = np.array([label_to_int[l] for l in labels])
print(numeric_labels) # [1 2 1 0 2] (bird=0, cat=1, dog=2)
Output:
[1 2 1 0 2]
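The same encoding can be obtained without a Python-level loop by asking np.unique() for the inverse indices. A minimal sketch; `classes` and `codes` are illustrative names:

```python
import numpy as np

labels = np.array(['cat', 'dog', 'cat', 'bird', 'dog'])

# classes: sorted unique labels; codes: index of each label within classes
classes, codes = np.unique(labels, return_inverse=True)

print(classes.tolist())  # ['bird', 'cat', 'dog']
print(codes.tolist())    # [1, 2, 1, 0, 2]

# Indexing classes with codes recovers the original labels
restored = classes[codes]
print(restored.tolist())
```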
Quick Reference
| Goal | Command |
|---|---|
| To float64 | arr.astype(float) or arr.astype(np.float64) |
| To int64 | arr.astype(np.int64) (arr.astype(int) uses the platform default integer) |
| To float32 (GPU) | arr.astype(np.float32) |
| To uint8 (images) | arr.astype(np.uint8) |
| To boolean | arr.astype(bool) |
| To string | arr.astype(str) |
| Check type | arr.dtype |
| Memory size | arr.nbytes |
Summary
Use astype() to convert NumPy array data types.
Remember that float-to-integer conversion truncates rather than rounds. Choose smaller types like float32 or int16 to reduce memory usage for large arrays, but verify your data fits within the type's range to avoid overflow. Specify dtype at array creation when possible to avoid unnecessary conversion steps.