Python NumPy: How to Convert a NumPy Array to Pandas Series
Converting a NumPy array to a Pandas Series adds labeled indexing, built-in missing data handling, and rich statistical methods. The conversion is a single constructor call and, by default, reuses the underlying array data rather than copying it.
Basic Conversion
Pass the array directly to pd.Series():
import numpy as np
import pandas as pd
arr = np.array([10, 20, 30, 40])
series = pd.Series(arr)
print(series)
Output:
0 10
1 20
2 30
3 40
dtype: int64
Adding Custom Index
Assign meaningful labels instead of default integer indices:
import numpy as np
import pandas as pd
arr = np.array([10, 20, 30, 40])
# With string labels
series = pd.Series(arr, index=['Q1', 'Q2', 'Q3', 'Q4'])
print(series)
Output:
Q1 10
Q2 20
Q3 30
Q4 40
dtype: int64
Label-Based Access
import numpy as np
import pandas as pd
arr = np.array([100, 200, 300])
series = pd.Series(arr, index=['A', 'B', 'C'])
# Access by label
print(series['A']) # 100
print(series['B':'C']) # B: 200, C: 300
# Access by position still works
print(series.iloc[0]) # 100
Output:
100
B 200
C 300
dtype: int64
100
Adding a Name
Give the Series a descriptive name:
import numpy as np
import pandas as pd
arr = np.array([1500, 2300, 1800])
series = pd.Series(
arr,
index=['Jan', 'Feb', 'Mar'],
name='Monthly Sales'
)
print(series)
Output:
Jan 1500
Feb 2300
Mar 1800
Name: Monthly Sales, dtype: int64
tip
The name attribute becomes the column header when the Series is converted to a DataFrame.
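For instance, calling to_frame() on a named Series yields a one-column DataFrame whose column header is the Series name (reusing the sales data from above):

```python
import numpy as np
import pandas as pd

arr = np.array([1500, 2300, 1800])
series = pd.Series(arr, index=['Jan', 'Feb', 'Mar'], name='Monthly Sales')

# to_frame() promotes the Series to a DataFrame;
# the Series name becomes the column label
df = series.to_frame()
print(df.columns.tolist())  # ['Monthly Sales']
```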
Specifying Data Type
Control the Series dtype explicitly:
import numpy as np
import pandas as pd
arr = np.array([1, 2, 3])
# Convert to float
series_float = pd.Series(arr, dtype='float64')
print(series_float.dtype) # float64
# Convert to string (stored as object dtype)
series_str = pd.Series(arr, dtype='str')
print(series_str)
Output:
float64
0 1
1 2
2 3
dtype: object
Why Use Series Over Arrays?
| Feature | NumPy Array | Pandas Series |
|---|---|---|
| Index type | Integer only | Any hashable labels |
| Missing data | Manual handling | Built-in NaN support |
| Alignment | By position | By label (automatic) |
| Statistics | Basic (mean, std) | Rich (describe, value_counts) |
| String methods | Limited | Full .str accessor |
| Datetime methods | Limited | Full .dt accessor |
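The alignment row in the table deserves a quick sketch: arithmetic between two Series matches values by label, not by position, so the index order does not matter:

```python
import numpy as np
import pandas as pd

s1 = pd.Series(np.array([1, 2, 3]), index=['a', 'b', 'c'])
s2 = pd.Series(np.array([10, 20, 30]), index=['c', 'a', 'b'])

# Addition aligns on labels: 'a' pairs with 'a', not position 0 with 0
result = s1 + s2
print(result)  # a: 21, b: 32, c: 13
```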
Practical Comparison
import numpy as np
import pandas as pd
arr = np.array([10, 20, 30, 40, 50])
series = pd.Series(arr, index=['a', 'b', 'c', 'd', 'e'])
# Rich statistics with Series
print(series.describe())
print()
# Value counts (useful for categorical data)
data = np.array(['A', 'B', 'A', 'C', 'A', 'B'])
print(pd.Series(data).value_counts())
Output:
count 5.000000
mean 30.000000
std 15.811388
min 10.000000
25% 20.000000
50% 30.000000
75% 40.000000
max 50.000000
dtype: float64
A 3
B 2
C 1
Name: count, dtype: int64
Memory Sharing Behavior
By default, Series may share memory with the source array:
import numpy as np
import pandas as pd
arr = np.array([1, 2, 3])
series = pd.Series(arr)
# Modifying array affects series (shared memory)
arr[0] = 99
print(series[0]) # 99
# Create independent copy
arr = np.array([1, 2, 3])
series = pd.Series(arr.copy())
# or: series = pd.Series(arr).copy()
arr[0] = 99
print(series[0]) # 1 (unchanged)
Output:
99
1
warning
If you modify the original array, the Series may change too. Use .copy() when you need independent data.
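If you want to check rather than guess, np.shares_memory reports whether two objects view the same buffer. Note that the shared case can vary by pandas version and dtype (under copy-on-write semantics pandas may copy lazily), so only the copied case is guaranteed:

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])

shared = pd.Series(arr)
independent = pd.Series(arr.copy())

# Does the Series view the array's buffer?
print(np.shares_memory(arr, shared.to_numpy()))       # often True (version/dtype dependent)
print(np.shares_memory(arr, independent.to_numpy()))  # False
```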
Converting Different Array Types
Multi-dimensional Arrays
Flatten first or select a column:
import numpy as np
import pandas as pd
arr_2d = np.array([[1, 2], [3, 4], [5, 6]])
# Flatten to 1D
series_flat = pd.Series(arr_2d.flatten())
print(series_flat) # 0:1, 1:2, 2:3, 3:4, 4:5, 5:6
print()
# Use specific column
series_col = pd.Series(arr_2d[:, 0]) # First column
print(series_col) # 0:1, 1:3, 2:5
Output:
0 1
1 2
2 3
3 4
4 5
5 6
dtype: int64
0 1
1 3
2 5
dtype: int64
Datetime Arrays
import numpy as np
import pandas as pd
dates = np.array(['2024-01-01', '2024-02-01', '2024-03-01'], dtype='datetime64')
series = pd.Series(dates)
print(series.dt.month) # Access datetime properties
Output:
0 1
1 2
2 3
dtype: int32
Practical Examples
Time Series Data
import numpy as np
import pandas as pd
# Stock prices
prices = np.array([150.25, 152.30, 149.80, 155.00])
dates = pd.date_range('2024-01-01', periods=4)
stock = pd.Series(prices, index=dates, name='AAPL')
print(stock)
# Calculate returns
returns = stock.pct_change()
print(f"\nDaily returns:\n{returns}")
Output:
2024-01-01 150.25
2024-01-02 152.30
2024-01-03 149.80
2024-01-04 155.00
Freq: D, Name: AAPL, dtype: float64
Daily returns:
2024-01-01 NaN
2024-01-02 0.013644
2024-01-03 -0.016415
2024-01-04 0.034713
Freq: D, Name: AAPL, dtype: float64
Categorical Data with Counts
import numpy as np
import pandas as pd
# Survey responses
responses = np.array(['Yes', 'No', 'Yes', 'Yes', 'No', 'Maybe'])
series = pd.Series(responses)
# Frequency analysis
print(series.value_counts())
print(f"\nMode: {series.mode()[0]}")
Output:
Yes 3
No 2
Maybe 1
Name: count, dtype: int64
Mode: Yes
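For repeated labels like these survey responses, you can also pass dtype='category' at conversion time; categorical storage keeps one copy of each distinct label plus compact integer codes (categories default to sorted order):

```python
import numpy as np
import pandas as pd

responses = np.array(['Yes', 'No', 'Yes', 'Yes', 'No', 'Maybe'])

# Categorical dtype: distinct labels stored once, rows stored as codes
cat = pd.Series(responses, dtype='category')
print(cat.dtype)                  # category
print(list(cat.cat.categories))   # ['Maybe', 'No', 'Yes']
```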
Handling Missing Data
import numpy as np
import pandas as pd
arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
series = pd.Series(arr, index=['a', 'b', 'c', 'd', 'e'])
# Built-in missing data methods
print(f"Count (non-null): {series.count()}") # 3
print(f"Mean (ignores NaN): {series.mean()}") # 3.0
print(f"Filled:\n{series.fillna(0)}")
Output:
import numpy as np
import pandas as pd
arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
series = pd.Series(arr, index=['a', 'b', 'c', 'd', 'e'])
# Built-in missing data methods
print(f"Count (non-null): {series.count()}") # 3
print(f"Mean (ignores NaN): {series.mean()}") # 3.0
print(f"Filled:\n{series.fillna(0)}")
Quick Reference
| Goal | Code |
|---|---|
| Basic conversion | pd.Series(arr) |
| With labels | pd.Series(arr, index=['a', 'b']) |
| With name | pd.Series(arr, name='Sales') |
| With dtype | pd.Series(arr, dtype='float64') |
| Independent copy | pd.Series(arr).copy() |
| From 2D column | pd.Series(arr_2d[:, 0]) |
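For the reverse direction, Series.to_numpy() returns the values as a NumPy array, dropping the index:

```python
import numpy as np
import pandas as pd

series = pd.Series(np.array([10, 20, 30]), index=['a', 'b', 'c'])

# to_numpy() discards the labels and returns the values
back = series.to_numpy()
print(type(back).__name__)  # ndarray
print(back)                 # [10 20 30]
```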
Summary
- Use pd.Series(arr) for quick conversion from NumPy arrays.
- Add index for labeled access and name for descriptive identification.
- A Series provides richer functionality than an array, including automatic label alignment, missing data handling, and comprehensive statistical methods, making it ideal for data analysis tasks.