How to Make a Pandas DataFrame From a Two-Dimensional List in Python
When working with tabular data in Python, you often start with a two-dimensional list: a list of lists where each inner list represents a row of data. Converting this into a Pandas DataFrame gives you access to powerful data manipulation tools: column labels, filtering, sorting, aggregation, and much more.
In this guide, you will learn multiple ways to create a DataFrame from a 2D list, including how to assign column names, control data types, and choose the right method for your use case.
Using pd.DataFrame(): The Standard Approach
The most straightforward way to create a DataFrame from a 2D list is by passing it directly to the pd.DataFrame() constructor. Each inner list becomes a row, and you can optionally specify column names:
import pandas as pd
data = [
    ["Alice", 25],
    ["Bob", 30],
    ["Charlie", 26],
    ["Diana", 22],
]
df = pd.DataFrame(data, columns=["Name", "Age"])
print(df)
Output:
Name Age
0 Alice 25
1 Bob 30
2 Charlie 26
3 Diana 22
If you omit the columns parameter, Pandas assigns default integer column names (0, 1, 2, ...):
import pandas as pd
data = [
    ["Alice", 25],
    ["Bob", 30],
    ["Charlie", 26],
    ["Diana", 22],
]
df = pd.DataFrame(data)
print(df)
Output:
0 1
0 Alice 25
1 Bob 30
2 Charlie 26
3 Diana 22
Always provide meaningful column names when creating a DataFrame. It makes your code more readable and your data easier to query:
# ✅ Clear and self-documenting
df = pd.DataFrame(data, columns=["Name", "Age"])
df[df["Age"] > 25]
# ❌ Unclear: what does column 1 represent?
df = pd.DataFrame(data)
df[df[1] > 25]
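A common variation is a 2D list whose first inner list already holds the column names, for example rows read from a spreadsheet export. The sketch below assumes that layout; the names table, header, and rows are illustrative and not part of the examples above:

```python
import pandas as pd

# Assumed layout: the first inner list is a header row, the rest are data rows
table = [
    ["Name", "Age"],
    ["Alice", 25],
    ["Bob", 30],
    ["Charlie", 26],
]

# Split the header off from the data rows with extended unpacking
header, *rows = table

df = pd.DataFrame(rows, columns=header)

# The column labels now support readable filters
print(df[df["Age"] > 25])
```

If the header were passed along with the data rows, it would become the first row of data and force both columns to the object dtype, so it is worth splitting it off first.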
Using pd.DataFrame.from_records()
The from_records() method is designed for structured, record-oriented data where each inner list represents a complete record (row). It behaves similarly to the standard constructor but explicitly signals that your data is in a records format:
import pandas as pd
data = [
    ["Dev1", 28, "Analyst"],
    ["Dev2", 35, "Manager"],
    ["Dev3", 29, "Developer"],
]
df = pd.DataFrame.from_records(data, columns=["Name", "Age", "Occupation"])
print(df)
Output:
Name Age Occupation
0 Dev1 28 Analyst
1 Dev2 35 Manager
2 Dev3 29 Developer
When to Use from_records()
from_records() is particularly useful when your data comes from sources that produce records, such as database queries, CSV readers, or API responses that return lists of tuples:
import pandas as pd
# Works with tuples too
records = [
    ("Alice", 90),
    ("Bob", 85),
    ("Charlie", 92),
]
df = pd.DataFrame.from_records(records, columns=["Student", "Score"])
print(df)
Output:
Student Score
0 Alice 90
1 Bob 85
2 Charlie 92
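To illustrate the database case, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The scores table and its contents are made up for the example; a real application would connect to its own database:

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table, created only for this illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Alice", 90), ("Bob", 85), ("Charlie", 92)],
)

cursor = conn.execute("SELECT student, score FROM scores ORDER BY student")
rows = cursor.fetchall()  # a list of tuples, one tuple per row

# cursor.description holds one tuple per result column; item 0 is the column name
column_names = [col[0] for col in cursor.description]
conn.close()

df = pd.DataFrame.from_records(rows, columns=column_names)
print(df)
```

Taking the column names from cursor.description keeps the DataFrame labels in sync with the query, so they do not have to be typed a second time.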
Using pd.DataFrame.from_dict() With Transposition
The from_dict() method creates a DataFrame from a dictionary. To use it with a 2D list, you first need to transpose the data (convert rows into columns) using zip(*data):
import pandas as pd
data = [
    ["Dev1", 26, "Scientist"],
    ["Dev2", 31, "Researcher"],
    ["Dev3", 24, "Engineer"],
]
columns = ["Name", "Age", "Occupation"]
# zip(*data) transposes rows into columns
# dict(zip(...)) creates {column_name: column_values}
df = pd.DataFrame.from_dict(dict(zip(columns, zip(*data))))
print(df)
Output:
Name Age Occupation
0 Dev1 26 Scientist
1 Dev2 31 Researcher
2 Dev3 24 Engineer
How the Transposition Works
data = [["A", 1], ["B", 2], ["C", 3]]
# zip(*data) unpacks and transposes:
# Row-oriented: [["A", 1], ["B", 2], ["C", 3]]
# Column-oriented: [("A", "B", "C"), (1, 2, 3)]
transposed = list(zip(*data))
print(transposed)
Output:
[('A', 'B', 'C'), (1, 2, 3)]
This method is useful when you want to build a column-oriented dictionary from row-oriented data. However, for most cases, pd.DataFrame() is simpler and more readable.
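As a quick check, the sketch below builds the same small dataset both ways and compares the results. The column names Letter and Number are illustrative:

```python
import pandas as pd

data = [["A", 1], ["B", 2], ["C", 3]]
columns = ["Letter", "Number"]  # illustrative names

# Row-oriented: pass the 2D list directly
from_rows = pd.DataFrame(data, columns=columns)

# Column-oriented: transpose first, then build a {name: values} dictionary
from_cols = pd.DataFrame.from_dict(dict(zip(columns, zip(*data))))

# Compare labels and cell values of the two results
print(list(from_rows.columns) == list(from_cols.columns))
print(from_rows.values.tolist() == from_cols.values.tolist())
```

One caveat with the transposition route: zip() stops at the shortest input, so if the inner lists have different lengths, zip(*data) silently drops the trailing values of longer rows instead of padding the shorter ones.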
Specifying Custom Index Values
By default, Pandas assigns a zero-based integer index. You can customize this with the index parameter:
import pandas as pd
data = [
    ["Alice", "Reacher", 25],
    ["Bob", "Pete", 30],
    ["Charlie", "Wilson", 26],
    ["Diana", "Williams", 22],
]
df = pd.DataFrame(
    data,
    columns=["FName", "LName", "Age"],
    index=["emp1", "emp2", "emp3", "emp4"],
)
print(df)
Output:
FName LName Age
emp1 Alice Reacher 25
emp2 Bob Pete 30
emp3 Charlie Wilson 26
emp4 Diana Williams 22
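One practical benefit of a custom index is label-based lookup. The following sketch reuses a shortened version of the data above to show .loc (by label) next to .iloc (by position):

```python
import pandas as pd

data = [
    ["Alice", "Reacher", 25],
    ["Bob", "Pete", 30],
]

df = pd.DataFrame(
    data,
    columns=["FName", "LName", "Age"],
    index=["emp1", "emp2"],
)

# Label-based lookup: row "emp2", column "Age"
print(df.loc["emp2", "Age"])

# Position-based lookup still works regardless of the index labels
print(df.iloc[0]["FName"])
```

The index list must contain exactly one label per row; a length mismatch raises a ValueError when the DataFrame is constructed.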
Controlling Data Types
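Pandas infers data types automatically. The constructor's dtype parameter applies a single type to the entire DataFrame, so for mixed data like the example below it is usually easier to convert individual columns with astype() after creation: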
import pandas as pd
data = [
    ["Product A", 10, 29.99],
    ["Product B", 5, 49.99],
    ["Product C", 20, 9.99],
]
df = pd.DataFrame(data, columns=["Product", "Quantity", "Price"])
# Check inferred types
print(df.dtypes)
print()
# Explicitly set Quantity to int32
df["Quantity"] = df["Quantity"].astype("int32")
print(df.dtypes)
Output:
Product object
Quantity int64
Price float64
dtype: object
Product object
Quantity int32
Price float64
dtype: object
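When several columns need converting, astype() also accepts a dictionary that maps column names to target types. A minimal sketch with the same product data follows; the float32 choice for Price is only an illustration:

```python
import pandas as pd

data = [
    ["Product A", 10, 29.99],
    ["Product B", 5, 49.99],
    ["Product C", 20, 9.99],
]

df = pd.DataFrame(data, columns=["Product", "Quantity", "Price"])

# Map column names to target dtypes; columns not listed keep their current type
df = df.astype({"Quantity": "int32", "Price": "float32"})

print(df.dtypes)
```

Note that astype() returns a new DataFrame rather than modifying the original, which is why the result is assigned back to df.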
Handling Uneven Row Lengths
If your inner lists have different lengths, Pandas pads the shorter rows with missing values, typically shown as None in object columns and NaN in numeric ones:
import pandas as pd
data = [
    ["Alice", 25, "New York"],
    ["Bob", 30],                          # Missing city
    ["Charlie", 26, "Houston", "Extra"],  # Extra value
]
df = pd.DataFrame(data, columns=["Name", "Age", "City", "Other"])
print(df)
Output:
Name Age City Other
0 Alice 25 New York None
1 Bob 30 None None
2 Charlie 26 Houston Extra
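While Pandas handles short rows gracefully by filling them with None/NaN, uneven rows are often a sign of data quality issues. The padding also only works in one direction: if any row has more values than the columns list has names, the constructor raises a ValueError. Validate your data before creating a DataFrame: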
# ✅ Validate that all rows have the expected number of columns
expected_columns = 3
for i, row in enumerate(data):
    if len(row) != expected_columns:
        print(f"Row {i} has {len(row)} values, expected {expected_columns}")
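The same check can be wrapped in a small reusable helper. The function below, find_bad_rows, is a hypothetical name introduced for this sketch rather than a Pandas API. It returns the offending rows so that the caller can decide whether to skip them, repair them, or stop:

```python
import pandas as pd


def find_bad_rows(rows, expected_len):
    """Return (row_index, actual_length) pairs for rows of the wrong length."""
    return [
        (i, len(row))
        for i, row in enumerate(rows)
        if len(row) != expected_len
    ]


data = [
    ["Alice", 25, "New York"],
    ["Bob", 30],                          # too short
    ["Charlie", 26, "Houston", "Extra"],  # too long
]
columns = ["Name", "Age", "City"]

bad = find_bad_rows(data, len(columns))
print(bad)

# One possible policy (an assumption, not a rule): keep only well-formed rows
bad_indexes = {i for i, _ in bad}
clean = [row for i, row in enumerate(data) if i not in bad_indexes]

df = pd.DataFrame(clean, columns=columns)
print(df)
```

Dropping malformed rows is only one option. Depending on the data source, raising an exception or logging the rows for review may be more appropriate.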
Quick Comparison of Methods
| Method | Best For | Key Advantage |
|---|---|---|
| pd.DataFrame(data) | General-purpose creation | Simple, intuitive, most commonly used |
| pd.DataFrame.from_records(data) | Record/tuple-oriented data | Explicit intent for structured records |
| pd.DataFrame.from_dict(dict) | Column-oriented dictionaries | Works naturally with dict-based data sources |
For the vast majority of cases, pd.DataFrame() is the recommended choice. Use from_records() or from_dict() when they better match the structure of your source data.
Complete Example
import pandas as pd
# Sample 2D list representing student records
students = [
    ["Alice", "Math", 92, "A"],
    ["Bob", "Science", 85, "B"],
    ["Charlie", "English", 78, "C+"],
    ["Diana", "Math", 95, "A+"],
    ["Eve", "Science", 88, "B+"],
]
# Create DataFrame with column names and custom index
df = pd.DataFrame(
    students,
    columns=["Name", "Subject", "Score", "Grade"],
    index=[f"S{i+1}" for i in range(len(students))],
)
print("Student Records:")
print(df)
print(f"\nAverage Score: {df['Score'].mean():.1f}")
print("\nStudents with A grades:")
print(df[df["Grade"].str.startswith("A")])
Output:
Student Records:
Name Subject Score Grade
S1 Alice Math 92 A
S2 Bob Science 85 B
S3 Charlie English 78 C+
S4 Diana Math 95 A+
S5 Eve Science 88 B+
Average Score: 87.6
Students with A grades:
Name Subject Score Grade
S1 Alice Math 92 A
S4 Diana Math 95 A+
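Once the data is in a DataFrame, the aggregation tools mentioned in the introduction become available. As a brief sketch using the same student records, the average score per subject can be computed with groupby():

```python
import pandas as pd

students = [
    ["Alice", "Math", 92, "A"],
    ["Bob", "Science", 85, "B"],
    ["Charlie", "English", 78, "C+"],
    ["Diana", "Math", 95, "A+"],
    ["Eve", "Science", 88, "B+"],
]

df = pd.DataFrame(students, columns=["Name", "Subject", "Score", "Grade"])

# Group rows by subject and average the Score column within each group
average_by_subject = df.groupby("Subject")["Score"].mean()
print(average_by_subject)
```

The result is a Series indexed by subject, so an individual value can be read with, for example, average_by_subject["Math"].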
Conclusion
Creating a Pandas DataFrame from a two-dimensional list is one of the most fundamental operations in Python data analysis.
- The standard pd.DataFrame() constructor handles most scenarios efficiently: just pass your 2D list and an optional list of column names.
- For specialized cases, from_records() works well with structured record data, and from_dict() is ideal when your data naturally maps to a dictionary format.
Regardless of which method you choose, always provide descriptive column names and validate your data's structure to ensure clean, reliable DataFrames.