Python Pandas: How to Create a Pandas Series from a NumPy Array
NumPy arrays excel at fast numerical computation, but they address elements by integer position rather than by label. A Pandas Series builds on top of a NumPy array by adding a labeled index, built-in missing data handling, and metadata such as a name, making your data more accessible and self-documenting. Converting a NumPy array to a Pandas Series lets you combine NumPy's computational speed with Pandas' analytical features.
In this guide, you will learn how to create a Series from a NumPy array, add meaningful index labels, understand memory sharing behavior, and use the Series features that go beyond what raw arrays offer.
Basic Conversion
Pass a NumPy array directly to the pd.Series() constructor:
import pandas as pd
import numpy as np
arr = np.array([10, 20, 30, 40])
s = pd.Series(arr)
print(s)
Output:
0 10
1 20
2 30
3 40
dtype: int64
Pandas automatically assigns a default integer index (0, 1, 2, 3) and preserves the data type from the original NumPy array.
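The dtype preservation described above can be checked directly. The following is a minimal sketch, assuming a float32 input array chosen purely for illustration; it also shows the constructor's dtype argument converting the values during construction:

```python
import numpy as np
import pandas as pd

# A float32 array keeps its dtype when wrapped in a Series.
arr32 = np.array([1.5, 2.5, 3.5], dtype=np.float32)
s32 = pd.Series(arr32)
print(s32.dtype)  # float32

# Passing dtype converts the values while the Series is built.
s64 = pd.Series(arr32, dtype="float64")
print(s64.dtype)  # float64
```

Forcing a dtype is mostly useful when the array's type is narrower than what later calculations need.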
Adding Custom Index Labels
The key advantage of a Series over a plain array is the ability to attach meaningful labels to each value. Pass an index parameter with a list of labels:
import pandas as pd
import numpy as np
prices = np.array([29.99, 49.99, 19.99])
products = ['Widget', 'Gadget', 'Gizmo']
s = pd.Series(prices, index=products)
print(s)
print()
print(f"Gadget price: ${s['Gadget']}")
Output:
Widget 29.99
Gadget 49.99
Gizmo 19.99
dtype: float64
Gadget price: $49.99
Instead of remembering that index 1 holds the Gadget price, you can access it directly by name. This makes code more readable and less prone to off-by-one errors.
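Label access and position access can coexist on the same Series. The sketch below reuses the product data from the example above and uses the explicit .loc (label) and .iloc (position) accessors; it also shows that the index must have exactly one label per value, otherwise the constructor raises a ValueError:

```python
import numpy as np
import pandas as pd

prices = np.array([29.99, 49.99, 19.99])
s = pd.Series(prices, index=["Widget", "Gadget", "Gizmo"])

by_label = s.loc["Gadget"]   # look up by label
by_position = s.iloc[1]      # look up by integer position
print(by_label, by_position)

# The number of labels must match the number of values.
mismatch_error = None
try:
    pd.Series(prices, index=["Widget", "Gadget"])
except ValueError as exc:
    mismatch_error = exc
    print(f"Error: {exc}")
```

Using .loc and .iloc explicitly keeps the intent clear, especially when the labels themselves are integers.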
Naming the Series
The name parameter adds an identifier to the Series, which becomes the column name if the Series is later added to a DataFrame:
import pandas as pd
import numpy as np
arr = np.array([100, 200, 300])
s = pd.Series(arr, index=['Jan', 'Feb', 'Mar'], name='Revenue')
print(s)
Output:
Jan 100
Feb 200
Mar 300
Name: Revenue, dtype: int64
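To illustrate how the name carries over to a DataFrame, here is a small sketch. The Costs series and its values are made up for the example; pd.concat with axis=1 places each Series in its own column and uses the Series names as column names:

```python
import numpy as np
import pandas as pd

months = ["Jan", "Feb", "Mar"]
revenue = pd.Series(np.array([100, 200, 300]), index=months, name="Revenue")
costs = pd.Series(np.array([60, 90, 120]), index=months, name="Costs")  # illustrative data

# Each Series becomes one column, labeled with its name.
df = pd.concat([revenue, costs], axis=1)
print(df)

# A single Series can be converted the same way with to_frame().
single = revenue.to_frame()
print(list(single.columns))
```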
Understanding Memory Sharing Behavior
By default, a Pandas Series may share memory with the source NumPy array rather than creating an independent copy. This means modifying the original array can unexpectedly change the Series:
import pandas as pd
import numpy as np
arr = np.array([1, 2, 3])
s = pd.Series(arr)
# Modifying the original array affects the Series
arr[0] = 99
print(f"Series after modifying array: {s[0]}")
Output:
Series after modifying array: 99
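To create an independent Series that is not affected by changes to the source array, call .copy() on the array before passing it to the constructor (passing copy=True to pd.Series() has the same effect):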
import pandas as pd
import numpy as np
arr = np.array([1, 2, 3])
s = pd.Series(arr.copy())
# Now the Series is independent
arr[0] = 99
print(f"Series after modifying array: {s[0]}")
Output:
Series after modifying array: 1
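Memory sharing behavior can vary depending on the Pandas version and the data types involved. In particular, when Copy-on-Write is enabled (the default starting with Pandas 3.0), the constructor copies a NumPy array by default, so the first example may print 1 instead of 99. If your workflow modifies the original array after creating the Series, always use .copy() to avoid subtle bugs.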
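Rather than guessing whether memory is shared on a given installation, you can ask NumPy. The sketch below uses np.shares_memory together with Series.to_numpy(); the result for the default constructor is printed but not assumed, since it depends on the Pandas version, while copy=True is expected to produce an independent Series:

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])

# Result depends on the installed Pandas version and its Copy-on-Write setting.
shared_default = np.shares_memory(arr, pd.Series(arr).to_numpy())
print(f"Default constructor shares memory: {shared_default}")

# copy=True asks the constructor for its own copy of the data.
independent = pd.Series(arr, copy=True)
shares_after_copy = np.shares_memory(arr, independent.to_numpy())
print(f"copy=True shares memory: {shares_after_copy}")

# Changing the array leaves the copied Series untouched.
arr[0] = 99
print(independent.iloc[0])
```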
The Array Must Be One-Dimensional
A Pandas Series is strictly one-dimensional. Attempting to create a Series from a multi-dimensional array will raise an error:
import pandas as pd
import numpy as np
# Works: 1D array
arr_1d = np.array([1, 2, 3])
s = pd.Series(arr_1d)
print(s)
print()
# Fails: 2D array
arr_2d = np.array([[1, 2], [3, 4]])
try:
    s = pd.Series(arr_2d)
except ValueError as e:
    print(f"Error: {e}")
Output:
0 1
1 2
2 3
dtype: int64
Error: Data must be 1-dimensional, got ndarray of shape (2, 2) instead
If you need to convert a 2D array, use pd.DataFrame() instead, or flatten the array first with arr_2d.flatten().
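The two options mentioned above might look like the following sketch. The column names x and y are placeholders chosen for the example. It uses ravel(), which returns a view where possible, whereas flatten() always returns a copy; either produces a 1D array that pd.Series() accepts:

```python
import numpy as np
import pandas as pd

arr_2d = np.array([[1, 2], [3, 4]])

# Option 1: flatten to 1D in row-major order, then build a Series.
flat = pd.Series(arr_2d.ravel())
print(flat.tolist())  # [1, 2, 3, 4]

# Option 2: keep the 2D shape in a DataFrame; each column is a Series.
df = pd.DataFrame(arr_2d, columns=["x", "y"])  # hypothetical column names
first_col = df["x"]
print(first_col.tolist())  # [1, 3]
```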
Common Operations on a Series
Once converted, a Series provides several capabilities beyond what NumPy arrays offer:
import pandas as pd
import numpy as np
arr = np.array([10, 20, 30, 40, 50])
s = pd.Series(arr, index=['a', 'b', 'c', 'd', 'e'])
# Label-based access
print(f"Value at 'c': {s['c']}")
print()
# Slicing by label (inclusive on both ends)
print("Slice 'b' to 'd':")
print(s['b':'d'])
print()
# Boolean filtering
print("Values greater than 25:")
print(s[s > 25])
print()
# Built-in statistics
print(f"Mean: {s.mean()}, Sum: {s.sum()}, Std: {s.std():.2f}")
Output:
Value at 'c': 30
Slice 'b' to 'd':
b 20
c 30
d 40
dtype: int64
Values greater than 25:
c 30
d 40
e 50
dtype: int64
Mean: 30.0, Sum: 150, Std: 15.81
Label-based slicing in Pandas is inclusive on both ends (s['b':'d'] includes 'd'), unlike standard Python slicing, which excludes the end point. This is a common source of confusion when transitioning from NumPy to Pandas.
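A side-by-side comparison makes the difference concrete. This sketch uses the same five-element Series as above and contrasts a label slice through .loc with a positional slice through .iloc, which follows the usual Python rule of excluding the end point:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.array([10, 20, 30, 40, 50]), index=list("abcde"))

by_label = s.loc["b":"d"]   # includes the end label 'd'
by_position = s.iloc[1:3]   # excludes position 3

print(list(by_label.index))     # ['b', 'c', 'd']
print(list(by_position.index))  # ['b', 'c']
```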
Series Features vs. NumPy Arrays
| Feature | NumPy Array | Pandas Series |
|---|---|---|
| Speed | Lowest overhead | Small overhead from index handling |
| Indexing | Position only | Labels or position |
| Missing data | Manual handling | Built-in NaN support (isna(), fillna(), dropna()) |
| Metadata | None | Name, index labels |
| Dimensions | Any (1D, 2D, nD) | Strictly 1D |
| Alignment | By position | Automatic index alignment |
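The alignment and missing-data rows of the table can be seen in one short example. The quarterly figures and product labels below are invented for illustration. Adding two Series matches values by label and yields NaN where a label exists in only one of them, while adding the underlying arrays pairs values purely by position:

```python
import numpy as np
import pandas as pd

q1 = pd.Series(np.array([100, 200, 300]), index=["Widget", "Gadget", "Gizmo"])
q2 = pd.Series(np.array([10, 20, 30]), index=["Gizmo", "Widget", "Doohickey"])

# Series arithmetic aligns on labels; unmatched labels become NaN.
total = q1 + q2
print(total)

# Raw arrays are combined by position, ignoring what each value means.
arr_sum = q1.to_numpy() + q2.to_numpy()
print(arr_sum)

# fill_value treats a label missing from one side as 0 instead of NaN.
filled = q1.add(q2, fill_value=0)
print(filled)
```

Here total holds 310 for Gizmo and 120 for Widget, with NaN for Gadget and Doohickey, whereas the positional sum is [110, 220, 330].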
Quick Reference
| Parameter | Purpose | Example |
|---|---|---|
| data | Source NumPy array | np.array([1, 2, 3]) |
| index | Custom labels | ['a', 'b', 'c'] |
| name | Series identifier | 'Revenue' |
| dtype | Force a data type | 'float64' |
| copy | Force independent copy | True |
- Use pd.Series(array) to add labeled indexing and Pandas functionality to NumPy arrays.
- Provide custom indices for meaningful, name-based data access.
- Remember that a Series must be one-dimensional, and use .copy() when you need the Series to be independent of the source array.