How to Flatten a List of DataFrames in Python

When working with pandas in Python, you frequently encounter scenarios where data is stored as a list of DataFrames - for example, when reading multiple CSV files, splitting data by groups, or collecting results from batch operations. Flattening these into a single DataFrame makes the data easier to analyze, query, and export.

This guide covers how to flatten (combine) a list of separate DataFrames into one, as well as how to flatten list-like values stored within DataFrame columns.

Combining a List of DataFrames into One

The most common "flattening" task is stacking multiple DataFrames into a single DataFrame. The pd.concat() function is the standard way to do this, and calling it once on the whole list is more efficient than combining the DataFrames one at a time.

Using `pd.concat()` (Recommended)

import pandas as pd

# Simulate a list of DataFrames (e.g., from reading multiple files)
df1 = pd.DataFrame({"name": ["Alice", "Bob"], "score": [85, 92]})
df2 = pd.DataFrame({"name": ["Charlie", "Diana"], "score": [78, 95]})
df3 = pd.DataFrame({"name": ["Eve"], "score": [88]})

list_of_dfs = [df1, df2, df3]

# Flatten into a single DataFrame
combined = pd.concat(list_of_dfs, ignore_index=True)
print(combined)

Output:

      name  score
0    Alice     85
1      Bob     92
2  Charlie     78
3    Diana     95
4      Eve     88

pd.concat(list_of_dfs) stacks all DataFrames vertically (row-wise) by default.
ignore_index=True resets the index to a continuous sequence (0, 1, 2, ...) instead of preserving original indices.
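As a rough illustration of why a single call is preferred: concatenating inside a loop copies all previously accumulated rows on every iteration, while collecting the pieces in a list and calling pd.concat() once copies each row only once. The sketch below uses made-up chunk data (the `chunks` list and its columns are assumptions for illustration). Both patterns produce the same result; only the amount of copying differs.

```python
import pandas as pd

# Hypothetical batch results: three small chunks with identical columns
chunks = [
    pd.DataFrame({"batch": [i, i], "value": [i * 10, i * 10 + 1]})
    for i in range(3)
]

# Slower pattern: grow the result inside the loop
# (every iteration re-copies the rows gathered so far)
grown = chunks[0]
for chunk in chunks[1:]:
    grown = pd.concat([grown, chunk], ignore_index=True)

# Preferred pattern: one concat over the whole list
combined_once = pd.concat(chunks, ignore_index=True)

# Both approaches yield identical DataFrames
print(grown.equals(combined_once))
```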

When to use ignore_index=True

If each DataFrame has its own index (e.g., all starting from 0), the combined DataFrame will have duplicate index values without ignore_index=True:

# Without ignore_index: duplicate indices
combined = pd.concat(list_of_dfs)
print(combined.index.tolist())
# Output: [0, 1, 0, 1, 0]

# With ignore_index: clean sequential index
combined = pd.concat(list_of_dfs, ignore_index=True)
print(combined.index.tolist())
# Output: [0, 1, 2, 3, 4]
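Duplicate index labels matter mostly for label-based lookups. The following minimal sketch, reusing two of the sample DataFrames from earlier, shows that .loc[0] returns several rows when the original indices are kept and a single row after the index is reset.

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["Alice", "Bob"], "score": [85, 92]})
df2 = pd.DataFrame({"name": ["Charlie", "Diana"], "score": [78, 95]})

# Original indices kept: label 0 now refers to two rows
dup = pd.concat([df1, df2])
rows_at_zero = dup.loc[0]
print(type(rows_at_zero).__name__, len(rows_at_zero))

# Fresh index: label 0 refers to exactly one row
clean = pd.concat([df1, df2], ignore_index=True)
row_at_zero = clean.loc[0]
print(type(row_at_zero).__name__, row_at_zero["name"])
```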

Handling DataFrames with Different Columns

If the DataFrames have different columns, pd.concat() aligns them and fills missing values with NaN:

import pandas as pd

df1 = pd.DataFrame({"name": ["Alice"], "score": [85]})
df2 = pd.DataFrame({"name": ["Bob"], "grade": ["A"]})

combined = pd.concat([df1, df2], ignore_index=True)
print(combined)

Output:

    name  score grade
0  Alice   85.0   NaN
1    Bob    NaN     A
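Note that score is printed as 85.0: a plain integer column cannot hold NaN, so pandas upcasts it to float64. If whole numbers matter, one option is the nullable Int64 dtype. This is a sketch of one possible approach, not the only way to handle missing values.

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["Alice"], "score": [85]})
df2 = pd.DataFrame({"name": ["Bob"], "grade": ["A"]})

combined = pd.concat([df1, df2], ignore_index=True)

# The integer column was upcast so it can hold NaN
original_dtype = combined["score"].dtype
print(original_dtype)

# Optional: switch to the nullable integer dtype to keep whole numbers
combined["score"] = combined["score"].astype("Int64")
print(combined["score"].dtype)
print(combined)
```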

Note: To keep only the columns that appear in every DataFrame, use join="inner":

combined = pd.concat([df1, df2], ignore_index=True, join="inner")
print(combined)

Output:

    name
0  Alice
1    Bob

Flattening List Values Inside DataFrame Columns

A different type of flattening involves expanding list-like values stored within a column so that each element gets its own row.

Using `explode()` (Recommended)

The explode() method (available in pandas 0.25+; its ignore_index parameter requires pandas 1.1+) is the simplest way to flatten a column containing lists:

import pandas as pd

df = pd.DataFrame({
    "payments": [[300, 400, 500, 600], [300, 322, 333, 233]],
    "name": ["sravan", "bobby"]
})

print("Original DataFrame:")
print(df)
print()

# Flatten the 'payments' column
flattened = df.explode("payments", ignore_index=True)

print("Flattened DataFrame:")
print(flattened)

Output:

Original DataFrame:
               payments    name
0  [300, 400, 500, 600]  sravan
1  [300, 322, 333, 233]   bobby

Flattened DataFrame:
  payments    name
0      300  sravan
1      400  sravan
2      500  sravan
3      600  sravan
4      300   bobby
5      322   bobby
6      333   bobby
7      233   bobby

Each element in the payments list gets its own row, and the corresponding name value is repeated for each entry.

Exploding Multiple Columns

Starting with pandas 1.3.0, you can explode multiple columns simultaneously (they must have lists of the same length in each row):

import pandas as pd

df = pd.DataFrame({
    "months": [["Jan", "Feb", "Mar"], ["Jan", "Feb", "Mar"]],
    "payments": [[300, 400, 500], [100, 200, 150]],
    "name": ["Alice", "Bob"]
})

flattened = df.explode(["months", "payments"], ignore_index=True)
print(flattened)

Output:

  months payments   name
0    Jan      300  Alice
1    Feb      400  Alice
2    Mar      500  Alice
3    Jan      100    Bob
4    Feb      200    Bob
5    Mar      150    Bob
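The same-length requirement is enforced at runtime. In the sketch below (assuming pandas 1.3 or later), one list is deliberately one element short, and explode() raises a ValueError instead of silently misaligning the values.

```python
import pandas as pd

df = pd.DataFrame({
    "months": [["Jan", "Feb", "Mar"]],
    "payments": [[300, 400]],  # one element short on purpose
    "name": ["Alice"],
})

try:
    df.explode(["months", "payments"])
    error_message = None
except ValueError as exc:
    # pandas refuses to explode columns whose lists differ in length
    error_message = str(exc)

print(error_message)
```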

Manual Approach (Pre-pandas 0.25)

If you're working with an older version of pandas that doesn't have explode(), you can flatten list columns manually using iteration:

import pandas as pd

df = pd.DataFrame({
    "payments": [[300, 400, 500, 600], [300, 322, 333, 233]],
    "name": ["sravan", "bobby"]
})

# Manual flattening
flat_data = pd.DataFrame(
    [(idx, value) for idx, values in df["payments"].items() for value in values],
    columns=["index", "payments"]
).set_index("index")

result = df.drop("payments", axis=1).join(flat_data)
print(result)

Output:

     name  payments
0  sravan       300
0  sravan       400
0  sravan       500
0  sravan       600
1   bobby       300
1   bobby       322
1   bobby       333
1   bobby       233

Warning: The manual approach is more verbose and harder to maintain. Use explode() whenever possible: it is faster, cleaner, and handles edge cases like empty lists and NaN values automatically.
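To make those edge cases concrete, here is a small sketch with invented data (the `tags` and `id` columns are assumptions for illustration). An empty list becomes a single NaN row, an existing NaN is kept as-is, and a scalar value passes through unchanged. It assumes pandas 1.1+ for ignore_index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "tags": [["a", "b"], [], np.nan, "solo"],
    "id": [1, 2, 3, 4],
})

# Row 1 (two items) expands to two rows; rows 2-4 each stay a single row
exploded = df.explode("tags", ignore_index=True)
print(exploded)
```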

Flattening a DataFrame's Values to a 1D Array

If you need to flatten an entire DataFrame into a one-dimensional array (all values in a single flat sequence), call NumPy's .flatten() on the underlying array, which df.values (or the equivalent df.to_numpy()) returns:

import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [4, 5, 6]
})

flat_array = df.values.flatten()
print(flat_array)
print(type(flat_array))

Output:

[1 4 2 5 3 6]
<class 'numpy.ndarray'>

Note: To control the flattening order:

# Row-major order (default): row by row
print(df.values.flatten(order="C"))  # [1 4 2 5 3 6]

# Column-major order: column by column
print(df.values.flatten(order="F"))  # [1 2 3 4 5 6]
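One caveat worth checking before flattening: when the columns do not share a dtype, the underlying array falls back to the generic object dtype, which is slower for numeric work. A minimal sketch with a made-up mixed DataFrame:

```python
import pandas as pd

# Hypothetical DataFrame mixing an integer column and a text column
mixed = pd.DataFrame({"A": [1, 2], "B": ["x", "y"]})

flat = mixed.to_numpy().flatten()

# No single numeric dtype fits both columns, so NumPy uses object
print(flat.dtype)
print(flat.tolist())
```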

Practical Example: Reading and Combining Multiple CSV Files

A real-world use case for flattening a list of DataFrames is loading multiple files into a single dataset:

import pandas as pd
from pathlib import Path

# Read all CSV files from a directory
data_dir = Path("data/reports/")
csv_files = list(data_dir.glob("*.csv"))

# Load each file into a DataFrame
dfs = [pd.read_csv(file) for file in csv_files]

# Flatten into a single DataFrame
combined = pd.concat(dfs, ignore_index=True)

print(f"Loaded {len(csv_files)} files with {len(combined)} total rows")
print(combined.head())

Tip: When combining many files, consider adding a source column to track which file each row came from:

dfs = []
for file in csv_files:
    df = pd.read_csv(file)
    df["source_file"] = file.name
    dfs.append(df)

combined = pd.concat(dfs, ignore_index=True)
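The pieces above can be wrapped into one function. The helper below is a hypothetical sketch (the name `load_csv_dir` and its behavior on an empty directory are assumptions, not part of pandas). It sorts the file list so the row order is repeatable, and it returns an empty DataFrame when no files match, because pd.concat() raises a ValueError when given an empty list.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv_dir(directory):
    """Hypothetical helper: read every *.csv in `directory` into one DataFrame."""
    csv_files = sorted(Path(directory).glob("*.csv"))  # sorted for a repeatable row order
    if not csv_files:
        # pd.concat() raises ValueError on an empty list, so return early
        return pd.DataFrame()
    frames = [pd.read_csv(path).assign(source_file=path.name) for path in csv_files]
    return pd.concat(frames, ignore_index=True)


# Demo with two throwaway files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("name,score\nAlice,85\n")
    Path(tmp, "b.csv").write_text("name,score\nBob,92\n")
    empty_dir = Path(tmp, "empty")
    empty_dir.mkdir()

    demo = load_csv_dir(tmp)
    empty = load_csv_dir(empty_dir)

print(demo)
print(empty.empty)
```

Returning an empty DataFrame is only one way to handle the no-files case; raising a descriptive error may be the better choice when missing input should stop the pipeline.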

Quick Reference

Task	Method	Best For
Combine list of DataFrames	`pd.concat(dfs, ignore_index=True)`	Merging multiple DataFrames
Flatten list column to rows	`df.explode("column")`	Expanding list values (pandas 0.25+)
Flatten multiple list columns	`df.explode(["col1", "col2"])`	Parallel list expansion (pandas 1.3+)
Flatten all values to 1D array	`df.values.flatten()`	Converting entire DataFrame to array

Conclusion

Flattening DataFrames in Python typically falls into two categories: combining multiple DataFrames into one using pd.concat(), or expanding list-like values within columns using explode().

Both operations are essential for data preparation and analysis. Use pd.concat() with ignore_index=True for clean merging, and explode() for transforming nested list data into a flat, row-per-value format that's ready for analysis.
