Python Pandas: How to Create a Pandas Series from Lists, Dicts, and Arrays
A Pandas Series is the fundamental one-dimensional data structure in Pandas. Every column in a DataFrame is a Series, and understanding how to create them from different Python data sources is essential for effective data manipulation. Whether your data starts as a list, a dictionary, or a NumPy array, Pandas provides a consistent constructor that adapts to each input type.
In this guide, you will learn how to create Series from each of these sources, customize indices and data types, and understand important behavioral differences like memory sharing with NumPy arrays.
Creating a Series from a List
Lists are the most common starting point. Pandas assigns a sequential numeric index automatically:
import pandas as pd
fruits = ['Apple', 'Banana', 'Cherry']
s = pd.Series(fruits)
print(s)
Output:
0 Apple
1 Banana
2 Cherry
dtype: object
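The dtype line at the bottom of the output is inferred from the list contents. As a minimal sketch (the variable names here are illustrative, not from the original example), the following shows how the inferred type changes with the kind of values in the list:

```python
import pandas as pd

# All integers: Pandas infers an integer dtype
ints = pd.Series([10, 20, 30])

# Mixing integers and floats upcasts every element to float64
mixed_numeric = pd.Series([1, 2.5, 3])

# Mixing numbers and strings falls back to the generic object dtype
mixed = pd.Series([1, 'two', 3.0])

print(ints.dtype)
print(mixed_numeric.dtype)
print(mixed.dtype)
```

Checking the dtype right after construction is a cheap way to catch a stray string in what should be a numeric list.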
Adding a Custom Index
Replace the default numeric index with meaningful labels by passing an index parameter:
import pandas as pd
prices = [1.50, 0.75, 2.00]
s = pd.Series(prices, index=['Apple', 'Banana', 'Cherry'])
print(s)
print()
print(f"Apple costs: ${s['Apple']}")
Output:
Apple 1.50
Banana 0.75
Cherry 2.00
dtype: float64
Apple costs: $1.5
Custom indices let you access values by name instead of position, making code more readable and less prone to errors.
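A labeled Series can still be accessed by position. The sketch below, which reuses the fruit prices from the example above, contrasts the explicit label-based accessor .loc with the position-based accessor .iloc:

```python
import pandas as pd

prices = pd.Series([1.50, 0.75, 2.00], index=['Apple', 'Banana', 'Cherry'])

# Label-based access: look up by index label
by_label = prices.loc['Banana']

# Position-based access: look up by 0-based integer position
by_position = prices.iloc[1]

# Label slices include both endpoints, unlike positional slices
label_slice = prices.loc['Apple':'Banana']

print(by_label, by_position)
print(label_slice)
```

Using .loc and .iloc explicitly makes it clear whether a lookup refers to a label or a position, which matters most when the index itself contains integers.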
Creating a Series from a Dictionary
When creating a Series from a dictionary, the keys automatically become the index labels and the values become the data:
import pandas as pd
population = {
'Tokyo': 37,
'Delhi': 28,
'Shanghai': 26
}
s = pd.Series(population)
print(s)
Output:
Tokyo 37
Delhi 28
Shanghai 26
dtype: int64
This is the most natural way to create a labeled Series because the mapping between labels and values is explicit in the source data.
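A dictionary can also be combined with the index parameter. In this short sketch (the city list is an assumption chosen for illustration), the index selects and reorders keys, and any label missing from the dictionary is filled with NaN:

```python
import pandas as pd

population = {'Tokyo': 37, 'Delhi': 28, 'Shanghai': 26}

# Select and reorder keys; 'Cairo' is not in the dict, so its value is NaN
cities = ['Delhi', 'Tokyo', 'Cairo']
s = pd.Series(population, index=cities)

print(s)
```

Because NaN is a floating-point value, the resulting Series is stored as float64 even though the dictionary held integers.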
Accessing Values by Label
import pandas as pd
population = {
'Tokyo': 37,
'Delhi': 28,
'Shanghai': 26
}
s = pd.Series(population)
# Single value
print(s['Tokyo'])
# Multiple values
print(s[['Tokyo', 'Delhi']])
Output:
37
Tokyo 37
Delhi 28
dtype: int64
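Looking up a label that does not exist with square brackets raises a KeyError. The following sketch shows a few defensive alternatives on the same population data; 'Paris' is a hypothetical missing label used only for illustration:

```python
import pandas as pd

s = pd.Series({'Tokyo': 37, 'Delhi': 28, 'Shanghai': 26})

# Membership tests check the index labels, not the values
has_paris = 'Paris' in s

# .get() returns a default instead of raising KeyError
paris = s.get('Paris', default=0)

# A boolean condition selects the labels whose values match
large = s[s > 27]

print(has_paris, paris)
print(large)
```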
Creating a Series from a NumPy Array
NumPy arrays convert directly to Series, preserving the data type and enabling high-performance operations:
import pandas as pd
import numpy as np
arr = np.array([1.5, 2.5, 3.5, 4.5])
s = pd.Series(arr)
print(s)
Output:
0 1.5
1 2.5
2 3.5
3 4.5
dtype: float64
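To illustrate what preserving the data type means in practice, here is a small sketch that assumes an array with an explicit int32 dtype. The Series keeps that dtype and supports vectorized arithmetic without a Python loop:

```python
import numpy as np
import pandas as pd

# Build an array with an explicit, non-default dtype
arr = np.array([1, 2, 3], dtype=np.int32)
s = pd.Series(arr)

# The Series keeps the array's dtype instead of converting it
print(s.dtype)

# Arithmetic and reductions apply to every element at once
doubled = s * 2
total = s.sum()

print(doubled.tolist(), total)
```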
Understanding Memory Sharing
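A Series created from a NumPy array may share the array's underlying memory instead of copying it. When that happens, modifying the original array also changes the Series. The output below reflects that sharing behavior; with Copy-on-Write enabled (the default starting in Pandas 3.0), the constructor copies the array and the same code prints 1 instead: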
import pandas as pd
import numpy as np
arr = np.array([1, 2, 3])
s = pd.Series(arr)
# Changing the array may affect the Series
arr[0] = 99
print(f"Series after modifying array: {s[0]}")
Output:
Series after modifying array: 99
To create an independent copy, use .copy() on the array:
import pandas as pd
import numpy as np
arr = np.array([1, 2, 3])
s = pd.Series(arr.copy())
# Now they are independent
arr[0] = 99
print(f"Series after modifying array: {s[0]}")
Output:
Series after modifying array: 1
Memory sharing behavior can vary across Pandas versions and data types. If your code modifies the source array after creating a Series, always use .copy() to avoid subtle bugs.
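An alternative to calling .copy() on the array is the constructor's own copy parameter. This sketch passes copy=True and uses np.shares_memory as one possible way to confirm that the two objects are independent:

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])

# copy=True asks the constructor to copy the input data
s = pd.Series(arr, copy=True)

# Modifying the array no longer affects the Series
arr[0] = 99
print(s.iloc[0])

# Check whether the two objects still share a buffer
shared = np.shares_memory(arr, s.to_numpy())
print(shared)
```

Either approach works; copy=True keeps the intent visible at the point where the Series is created.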
Creating a Series from a Scalar Value
A single scalar value can be broadcast across a provided index to create a Series where every element has the same value:
import pandas as pd
s = pd.Series(5, index=['a', 'b', 'c'])
print(s)
Output:
a 5
b 5
c 5
dtype: int64
This is useful for initializing a Series with a default value that you plan to update later.
When creating a Series from a scalar, pass the index parameter to control how many elements are created. Without it, Pandas creates a Series with a single element.
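As a sketch of the initialize-then-update pattern described above (the counter labels are hypothetical), a scalar can seed a Series of zeros whose entries are then updated by label:

```python
import pandas as pd

labels = ['a', 'b', 'c']

# Broadcast the scalar 0 across every label
counts = pd.Series(0, index=labels)

# Update individual entries by label afterwards
counts['b'] += 2
counts.loc['c'] = 5

print(counts)
```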
Specifying the Data Type
Pandas infers data types automatically, but you can override this with the dtype parameter:
import pandas as pd
# Force integers to be stored as floats
s_float = pd.Series([1, 2, 3], dtype='float64')
print(f"Type: {s_float.dtype}")
print(s_float)
print()
# Convert numeric strings to integers during creation
s_int = pd.Series(['1', '2', '3'], dtype='int64')
print(f"Type: {s_int.dtype}")
print(s_int)
Output:
Type: float64
0 1.0
1 2.0
2 3.0
dtype: float64
Type: int64
0 1
1 2
2 3
dtype: int64
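Forcing a dtype at creation raises an error if a value cannot be converted. For input that may contain unparseable entries, one option is pd.to_numeric with errors='coerce', sketched here with a made-up 'n/a' placeholder, alongside .astype() for converting an existing Series:

```python
import pandas as pd

raw = pd.Series(['1', '2', 'n/a'])

# errors='coerce' turns values that cannot be parsed into NaN
numbers = pd.to_numeric(raw, errors='coerce')
print(numbers)

# .astype() converts a Series that already exists
as_float = pd.Series([1, 2, 3]).astype('float64')
print(as_float.dtype)
```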
Naming the Series
The name parameter adds an identifier to the Series. This name becomes the column header when the Series is added to a DataFrame:
import pandas as pd
s = pd.Series([100, 200, 300], index=['Jan', 'Feb', 'Mar'], name='Revenue')
print(s)
print()
# The name carries over when converting to a DataFrame
df = s.to_frame()
print(df)
Output:
Jan 100
Feb 200
Mar 300
Name: Revenue, dtype: int64
Revenue
Jan 100
Feb 200
Mar 300
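The name also matters when several Series are combined. In this sketch, the 'Costs' Series and its values are invented for illustration; pd.concat uses each Series name as a column header, and rename returns a copy with a new name:

```python
import pandas as pd

months = ['Jan', 'Feb', 'Mar']
revenue = pd.Series([100, 200, 300], index=months, name='Revenue')
costs = pd.Series([60, 90, 120], index=months, name='Costs')

# Each Series name becomes a column header in the combined DataFrame
df = pd.concat([revenue, costs], axis=1)
print(df)

# rename() with a scalar returns a new Series with a different name
income = revenue.rename('Income')
print(income.name)
```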
Quick Reference
| Source | Default Index | Example |
|---|---|---|
| List | Auto (0, 1, 2...) | pd.Series([1, 2, 3]) |
| Dictionary | Keys as labels | pd.Series({'a': 1, 'b': 2}) |
| NumPy array | Auto (0, 1, 2...) | pd.Series(np.array([1, 2])) |
| Scalar | Single element unless an index is provided | pd.Series(5, index=['a', 'b']) |

| Parameter | Purpose | Example |
|---|---|---|
| index | Custom index labels | index=['x', 'y', 'z'] |
| dtype | Force a specific type | dtype='float64' |
| name | Series identifier | name='Revenue' |
| copy | Force independent copy | copy=True |
- Use lists for simple sequences with automatic indexing.
- Use dictionaries when your data naturally has labels, since keys become the index automatically.
- Use NumPy arrays for high-performance numeric data, but be aware of memory sharing.
- Add custom indices, data types, and names as needed to make your Series self-documenting and ready for integration into larger DataFrames.