Python NumPy: How to Convert a NumPy Array to Pandas Series

Converting NumPy arrays to Pandas Series adds labeled indexing, built-in missing data handling, and rich statistical methods. The conversion is straightforward and preserves the underlying data efficiently.

Basic Conversion

Pass the array directly to pd.Series():

import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40])
series = pd.Series(arr)

print(series)

Output:

0    10
1    20
2    30
3    40
dtype: int64

Adding Custom Index

Assign meaningful labels instead of default integer indices:

import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40])

# With string labels
series = pd.Series(arr, index=['Q1', 'Q2', 'Q3', 'Q4'])
print(series)

Output:

Q1    10
Q2    20
Q3    30
Q4    40
dtype: int64

Label-Based Access

import numpy as np
import pandas as pd

arr = np.array([100, 200, 300])
series = pd.Series(arr, index=['A', 'B', 'C'])

# Access by label
print(series['A']) # 100
print(series['B':'C']) # B: 200, C: 300

# Access by position still works
print(series.iloc[0]) # 100

Output:

100
B    200
C    300
dtype: int64
100

Adding a Name

Give the Series a descriptive name:

import numpy as np
import pandas as pd

arr = np.array([1500, 2300, 1800])

series = pd.Series(
    arr,
    index=['Jan', 'Feb', 'Mar'],
    name='Monthly Sales'
)

print(series)

Output:

Jan    1500
Feb    2300
Mar    1800
Name: Monthly Sales, dtype: int64
Tip: The name attribute becomes the column header when the Series is converted to a DataFrame.
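A quick illustration of that behavior, using the standard to_frame() method:

```python
import numpy as np
import pandas as pd

arr = np.array([1500, 2300, 1800])
series = pd.Series(arr, index=['Jan', 'Feb', 'Mar'], name='Monthly Sales')

# to_frame() promotes the Series to a one-column DataFrame;
# the Series name becomes the column header
df = series.to_frame()
print(df.columns.tolist())  # ['Monthly Sales']
```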

Specifying Data Type

Control the Series dtype explicitly:

import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])

# Convert to float
series_float = pd.Series(arr, dtype='float64')
print(series_float.dtype) # float64

# Convert to string (values stored as Python strings, so the dtype is object)
series_str = pd.Series(arr, dtype='str')
print(series_str)

Output:

float64
0    1
1    2
2    3
dtype: object
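If the source array contains NaN, a plain integer dtype raises an error; pandas' nullable integer dtype 'Int64' (note the capital I) is one way to keep integer values alongside missing data:

```python
import numpy as np
import pandas as pd

arr = np.array([1.0, np.nan, 3.0])

# 'Int64' (capital I) is pandas' nullable integer dtype;
# NaN in the source array becomes pd.NA instead of forcing floats
series = pd.Series(arr, dtype='Int64')
print(series.dtype)         # Int64
print(series.isna().sum())  # 1
```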

Why Use Series Over Arrays?

| Feature          | NumPy Array       | Pandas Series                 |
|------------------|-------------------|-------------------------------|
| Index type       | Integer only      | Any hashable labels           |
| Missing data     | Manual handling   | Built-in NaN support          |
| Alignment        | By position       | By label (automatic)          |
| Statistics       | Basic (mean, std) | Rich (describe, value_counts) |
| String methods   | Limited           | Full .str accessor            |
| Datetime methods | Limited           | Full .dt accessor             |
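The alignment row deserves a demonstration: arithmetic between Series matches labels, not positions. A minimal sketch:

```python
import pandas as pd

s1 = pd.Series([1, 2], index=['a', 'b'])
s2 = pd.Series([10, 20], index=['b', 'a'])  # note the reversed index

# Addition aligns on labels: 'a' pairs with 'a', 'b' with 'b'
total = s1 + s2
print(total)  # a: 21, b: 12
```

With plain NumPy arrays, the same addition would pair elements by position instead, giving 11 and 22.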

Practical Comparison

import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40, 50])
series = pd.Series(arr, index=['a', 'b', 'c', 'd', 'e'])

# Rich statistics with Series
print(series.describe())
print()

# Value counts (useful for categorical data)
data = np.array(['A', 'B', 'A', 'C', 'A', 'B'])
print(pd.Series(data).value_counts())

Output:

count     5.000000
mean     30.000000
std      15.811388
min      10.000000
25%      20.000000
50%      30.000000
75%      40.000000
max      50.000000
dtype: float64

A    3
B    2
C    1
Name: count, dtype: int64

Memory Sharing Behavior

By default, Series may share memory with the source array:

import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])
series = pd.Series(arr)

# Modifying array affects series (shared memory)
arr[0] = 99
print(series[0]) # 99

# Create independent copy
arr = np.array([1, 2, 3])
series = pd.Series(arr.copy())
# or: series = pd.Series(arr).copy()

arr[0] = 99
print(series[0]) # 1 (unchanged)

Output:

99
1
Warning: If you modify the original array, the Series may change too. Use .copy() when you need independent data.

Converting Different Array Types

Multi-dimensional Arrays

Flatten first or select a column:

import numpy as np
import pandas as pd

arr_2d = np.array([[1, 2], [3, 4], [5, 6]])

# Flatten to 1D
series_flat = pd.Series(arr_2d.flatten())
print(series_flat) # 0:1, 1:2, 2:3, 3:4, 4:5, 5:6
print()

# Use specific column
series_col = pd.Series(arr_2d[:, 0]) # First column
print(series_col) # 0:1, 1:3, 2:5

Output:

0    1
1    2
2    3
3    4
4    5
5    6
dtype: int64

0    1
1    3
2    5
dtype: int64

Datetime Arrays

import numpy as np
import pandas as pd

dates = np.array(['2024-01-01', '2024-02-01', '2024-03-01'], dtype='datetime64')

series = pd.Series(dates)
print(series.dt.month) # Access datetime properties

Output:

0    1
1    2
2    3
dtype: int32

Practical Examples

Time Series Data

import numpy as np
import pandas as pd

# Stock prices
prices = np.array([150.25, 152.30, 149.80, 155.00])
dates = pd.date_range('2024-01-01', periods=4)

stock = pd.Series(prices, index=dates, name='AAPL')
print(stock)

# Calculate returns
returns = stock.pct_change()
print(f"\nDaily returns:\n{returns}")

Output:

2024-01-01    150.25
2024-01-02    152.30
2024-01-03    149.80
2024-01-04    155.00
Freq: D, Name: AAPL, dtype: float64

Daily returns:
2024-01-01         NaN
2024-01-02    0.013644
2024-01-03   -0.016415
2024-01-04    0.034713
Freq: D, Name: AAPL, dtype: float64

Categorical Data with Counts

import numpy as np
import pandas as pd

# Survey responses
responses = np.array(['Yes', 'No', 'Yes', 'Yes', 'No', 'Maybe'])
series = pd.Series(responses)

# Frequency analysis
print(series.value_counts())
print(f"\nMode: {series.mode()[0]}")

Output:

Yes      3
No       2
Maybe    1
Name: count, dtype: int64

Mode: Yes
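value_counts also accepts normalize=True to report each category's share of the total instead of raw counts:

```python
import numpy as np
import pandas as pd

responses = np.array(['Yes', 'No', 'Yes', 'Yes', 'No', 'Maybe'])
series = pd.Series(responses)

# normalize=True divides each count by the total number of values
proportions = series.value_counts(normalize=True)
print(proportions)  # Yes: 0.5, No: ~0.333, Maybe: ~0.167
```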

Handling Missing Data

import numpy as np
import pandas as pd

arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
series = pd.Series(arr, index=['a', 'b', 'c', 'd', 'e'])

# Built-in missing data methods
print(f"Count (non-null): {series.count()}") # 3
print(f"Mean (ignores NaN): {series.mean()}") # 3.0
print(f"Filled:\n{series.fillna(0)}")

Output:

Count (non-null): 3
Mean (ignores NaN): 3.0
Filled:
a    1.0
b    0.0
c    3.0
d    0.0
e    5.0
dtype: float64

Quick Reference

| Goal             | Code                             |
|------------------|----------------------------------|
| Basic conversion | pd.Series(arr)                   |
| With labels      | pd.Series(arr, index=['a', 'b']) |
| With name        | pd.Series(arr, name='Sales')     |
| With dtype       | pd.Series(arr, dtype='float64')  |
| Independent copy | pd.Series(arr).copy()            |
| From 2D column   | pd.Series(arr_2d[:, 0])          |

Summary

  • Use pd.Series(arr) for quick conversion from NumPy arrays.
  • Add index for labeled access and name for descriptive identification.
  • A Series provides richer functionality than a bare array, including automatic label alignment, missing data handling, and comprehensive statistical methods, making it ideal for data analysis tasks.