Python Pandas: How to Create a Pandas Series from a NumPy Array

NumPy arrays excel at fast numerical computation, but their elements can only be addressed by position. A Pandas Series builds on top of a NumPy array by adding a labeled index, built-in missing-data handling, and metadata such as a name, making your data more accessible and self-documenting. Converting a NumPy array to a Pandas Series lets you combine NumPy's computational speed with Pandas' analytical features.

In this guide, you will learn how to create a Series from a NumPy array, add meaningful index labels, understand memory sharing behavior, and use the Series features that go beyond what raw arrays offer.

Basic Conversion

Pass a NumPy array directly to the pd.Series() constructor:

import pandas as pd
import numpy as np

arr = np.array([10, 20, 30, 40])
s = pd.Series(arr)

print(s)

Output:

0    10
1    20
2    30
3    40
dtype: int64

Pandas automatically assigns a default integer index (0, 1, 2, 3) and preserves the data type from the original NumPy array.
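As a quick check of the dtype behavior described above, here is a minimal sketch. The variable names (arr32, s_same, s_forced) are illustrative only; the dtype argument of pd.Series() is used to request a different type at construction time.

```python
import numpy as np
import pandas as pd

# A float32 array: the Series keeps this dtype instead of widening it
arr32 = np.array([1.5, 2.5, 3.5], dtype=np.float32)

s_same = pd.Series(arr32)                      # dtype carried over from the array
s_forced = pd.Series(arr32, dtype="float64")   # dtype requested explicitly

print(s_same.dtype)    # float32
print(s_forced.dtype)  # float64
```

Requesting a different dtype converts the values, so the resulting Series holds new data rather than the original buffer.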

Adding Custom Index Labels

The key advantage of a Series over a plain array is the ability to attach meaningful labels to each value. Pass an index parameter with a list of labels:

import pandas as pd
import numpy as np

prices = np.array([29.99, 49.99, 19.99])
products = ['Widget', 'Gadget', 'Gizmo']

s = pd.Series(prices, index=products)

print(s)
print()
print(f"Gadget price: ${s['Gadget']}")

Output:

Widget    29.99
Gadget    49.99
Gizmo     19.99
dtype: float64

Gadget price: $49.99

Instead of remembering that index 1 holds the Gadget price, you can access it directly by name. This makes code more readable and less prone to off-by-one errors.
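The sketch below contrasts label-based and position-based access on the same Series using .loc and .iloc, and shows what happens when the number of labels does not match the number of values. The two-label index in the second half is a deliberately wrong, hypothetical input.

```python
import numpy as np
import pandas as pd

prices = np.array([29.99, 49.99, 19.99])
s = pd.Series(prices, index=["Widget", "Gadget", "Gizmo"])

by_label = s.loc["Gadget"]   # explicit label-based lookup
by_position = s.iloc[1]      # explicit position-based lookup
print(by_label, by_position)

# The index must have exactly one label per value
mismatch_error = None
try:
    pd.Series(prices, index=["Widget", "Gadget"])  # 3 values, 2 labels
except ValueError as err:
    mismatch_error = str(err)
    print(f"Error: {err}")
```

Using .loc and .iloc makes the intent explicit, which helps when the labels themselves are integers.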

Naming the Series

The name parameter adds an identifier to the Series, which becomes the column name if the Series is later added to a DataFrame:

import pandas as pd
import numpy as np

arr = np.array([100, 200, 300])
s = pd.Series(arr, index=['Jan', 'Feb', 'Mar'], name='Revenue')

print(s)

Output:

Jan    100
Feb    200
Mar    300
Name: Revenue, dtype: int64
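To illustrate how the name carries over into a DataFrame, here is a small sketch. The second Series (costs) and its values are made up for the example; pd.concat and Series.to_frame are standard Pandas calls.

```python
import numpy as np
import pandas as pd

months = ["Jan", "Feb", "Mar"]
revenue = pd.Series(np.array([100, 200, 300]), index=months, name="Revenue")
costs = pd.Series(np.array([60, 90, 120]), index=months, name="Costs")  # illustrative data

# Each Series name becomes a column name
df = pd.concat([revenue, costs], axis=1)
print(df)

# A single Series can also be turned into a one-column DataFrame
print(revenue.to_frame().columns.tolist())  # ['Revenue']
```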

Understanding Memory Sharing Behavior

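Depending on the Pandas version, a Series may share memory with the source NumPy array rather than holding an independent copy. In Pandas 2.x and earlier with default settings, the constructor does not copy the array, so modifying the original array can unexpectedly change the Series. With Copy-on-Write enabled (the default from Pandas 3.0), the constructor copies the array instead, and the example below prints 1: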

import pandas as pd
import numpy as np

arr = np.array([1, 2, 3])
s = pd.Series(arr)

# Modifying the original array affects the Series
arr[0] = 99
print(f"Series after modifying array: {s[0]}")

Output:

Series after modifying array: 99

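To create an independent Series that is not affected by changes to the source array, copy the array with .copy() before passing it to the constructor (passing copy=True to pd.Series() has the same effect):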

import pandas as pd
import numpy as np

arr = np.array([1, 2, 3])
s = pd.Series(arr.copy())

# Now the Series is independent
arr[0] = 99
print(f"Series after modifying array: {s[0]}")

Output:

Series after modifying array: 1

Warning: Memory sharing behavior can vary depending on the Pandas version and the data types involved. If your workflow modifies the original array after creating the Series, always use .copy() to avoid subtle bugs.
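Because the default behavior differs between versions, it can help to test for sharing directly rather than assume it. The following is a minimal sketch using np.shares_memory together with Series.to_numpy(); the names s_default and s_copy are illustrative. No expected output is given for the default case, since it depends on the installed Pandas version.

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])

s_default = pd.Series(arr)           # sharing depends on the Pandas version
s_copy = pd.Series(arr, copy=True)   # explicitly request an independent copy

# np.shares_memory reports whether two arrays overlap in memory
print("default shares memory:", np.shares_memory(arr, s_default.to_numpy()))
print("copy=True shares memory:", np.shares_memory(arr, s_copy.to_numpy()))

arr[0] = 99
print(s_copy.iloc[0])  # 1, the copy is unaffected
```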

The Array Must Be One-Dimensional

A Pandas Series is strictly one-dimensional. Attempting to create a Series from a multi-dimensional array will raise an error:

import pandas as pd
import numpy as np

# Works: 1D array
arr_1d = np.array([1, 2, 3])
s = pd.Series(arr_1d)
print(s)
print()

# Fails: 2D array
arr_2d = np.array([[1, 2], [3, 4]])
try:
    s = pd.Series(arr_2d)
except ValueError as e:
    print(f"Error: {e}")

Output:

0    1
1    2
2    3
dtype: int64

Error: Data must be 1-dimensional, got ndarray of shape (2, 2) instead

If you need to convert a 2D array, use pd.DataFrame() instead, or flatten the array first with arr_2d.flatten().
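Both options mentioned above can be sketched as follows. The column names "x" and "y" are placeholders chosen for the example.

```python
import numpy as np
import pandas as pd

arr_2d = np.array([[1, 2], [3, 4]])

# Option 1: flatten to 1D (row by row), then build a Series
flat = pd.Series(arr_2d.flatten())
print(flat.tolist())  # [1, 2, 3, 4]

# Option 2: keep the 2D shape in a DataFrame; each column is itself a Series
df = pd.DataFrame(arr_2d, columns=["x", "y"])
col_x = df["x"]
print(col_x.tolist())  # [1, 3]
```

Note that flatten() always returns a copy of the data, so the resulting Series is independent of arr_2d.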

Common Operations on a Series

Once converted, a Series provides several capabilities beyond what NumPy arrays offer:

import pandas as pd
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
s = pd.Series(arr, index=['a', 'b', 'c', 'd', 'e'])

# Label-based access
print(f"Value at 'c': {s['c']}")
print()

# Slicing by label (inclusive on both ends)
print("Slice 'b' to 'd':")
print(s['b':'d'])
print()

# Boolean filtering
print("Values greater than 25:")
print(s[s > 25])
print()

# Built-in statistics
print(f"Mean: {s.mean()}, Sum: {s.sum()}, Std: {s.std():.2f}")

Output:

Value at 'c': 30

Slice 'b' to 'd':
b    20
c    30
d    40
dtype: int64

Values greater than 25:
c    30
d    40
e    50
dtype: int64

Mean: 30.0, Sum: 150, Std: 15.81

Note: Label-based slicing in Pandas is inclusive on both ends (s['b':'d'] includes 'd'), unlike standard Python slicing which excludes the end point. This is a common source of confusion when transitioning from NumPy to Pandas.
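The difference is easiest to see side by side. This short sketch slices the same Series once by label with .loc and once by position with .iloc; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.array([10, 20, 30, 40, 50]), index=["a", "b", "c", "d", "e"])

by_label = s.loc["b":"d"]   # label slice: end label 'd' is included
by_position = s.iloc[1:3]   # positional slice: end position 3 is excluded

print(by_label.tolist())     # [20, 30, 40]
print(by_position.tolist())  # [20, 30]
```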

Series Features vs. NumPy Arrays

| Feature | NumPy Array | Pandas Series |
| --- | --- | --- |
| Speed | Fastest | Slightly slower |
| Indexing | Position only | Labels or position |
| Missing data | Manual handling | Built-in NaN support |
| Metadata | None | Name, index labels |
| Dimensions | Any (1D, 2D, nD) | Strictly 1D |
| Alignment | By position | Automatic index alignment |
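The alignment and missing-data rows can be illustrated with a small sketch. The two Series below use made-up values and deliberately overlapping labels; adding them matches values by label, and labels present in only one Series produce NaN.

```python
import numpy as np
import pandas as pd

a = pd.Series(np.array([1.0, 2.0, 3.0]), index=["x", "y", "z"])
b = pd.Series(np.array([10.0, 20.0, 30.0]), index=["y", "z", "w"])

# Raw arrays add by position, ignoring what each value represents
print(a.to_numpy() + b.to_numpy())  # [11. 22. 33.]

# Series add by label; 'x' and 'w' have no partner, so they become NaN
total = a + b
print(total)

# Aggregations skip NaN by default; fillna replaces it explicitly
print(total.sum())      # 35.0
print(total.fillna(0))
```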

Quick Reference

| Parameter | Purpose | Example |
| --- | --- | --- |
| data | Source NumPy array | np.array([1, 2, 3]) |
| index | Custom labels | ['a', 'b', 'c'] |
| name | Series identifier | 'Revenue' |
| dtype | Force a data type | 'float64' |
| copy | Force independent copy | True |

  • Use pd.Series(array) to add labeled indexing and Pandas functionality to NumPy arrays.
  • Provide custom indices for meaningful, name-based data access.
  • Remember that a Series must be one-dimensional, and use .copy() when you need the Series to be independent of the source array.