Python Pandas: How to Copy a Pandas DataFrame Row to Multiple Other Rows
When working with pandas DataFrames, you'll often need to duplicate specific rows - whether for data augmentation, creating template records, filling in missing data patterns, or generating test datasets. Pandas provides several methods to copy a row and replicate it across multiple positions in a DataFrame.
In this guide, you'll learn multiple techniques to copy DataFrame rows, from simple single-row duplication to bulk replication, with clear examples and practical use cases.
Using loc[] and copy()
The most straightforward way to copy a row to specific positions is to select it with loc[] and assign it to new index labels. The copy() calls below are optional for this kind of assignment, but they make the intent explicit:
import pandas as pd
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Score': [88, 92, 76]
})
print("Original:")
print(df)
# Copy row at index 1 (Bob) to new rows
df.loc[3] = df.loc[1].copy()
df.loc[4] = df.loc[1].copy()
print("\nAfter copying row 1 to rows 3 and 4:")
print(df)
Output:
Original:
Name Score
0 Alice 88
1 Bob 92
2 Charlie 76
After copying row 1 to rows 3 and 4:
Name Score
0 Alice 88
1 Bob 92
2 Charlie 76
3 Bob 92
4 Bob 92
Is copy() needed?
When assigning via .loc, pandas extracts the values and writes them into the DataFrame. It does not store a reference to the original row object.
- For row duplication using df.loc[new] = df.loc[existing], copy() is not required.
- For slices or chained operations, use .copy() to avoid unintended side effects.
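A quick way to check this behaviour is to append a row without copy(), change the new row, and confirm that the source row is untouched. The sketch below also overwrites existing rows with the same values, a case the example above does not cover. It assumes a default integer index and that broadcasting one row's values over several target labels is what you want.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [88, 92, 76]
})

# Append Bob's row under a new label without calling copy()
df.loc[3] = df.loc[1]

# Changing the new row leaves the source row alone
df.loc[3, 'Score'] = 0
print(df.loc[1, 'Score'])  # still 92

# Overwrite existing rows 0 and 2 with Bob's values.
# The 1-D array holds one value per column and is applied to each selected row.
df.loc[[0, 2], :] = df.loc[1].to_numpy()
print(df)
```

Appending a new label works one label at a time, while overwriting accepts a list of existing labels in a single assignment.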
Using pd.concat() to Append Copied Rows
For copying a row multiple times and appending it to the DataFrame, pd.concat() is clean and efficient:
import pandas as pd
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Score': [88, 92, 76]
})
# Select the row to copy
row_to_copy = df.loc[[1]] # Double brackets to keep as DataFrame
# Create multiple copies
copies = pd.concat([row_to_copy] * 3, ignore_index=True)
# Append to original DataFrame
df = pd.concat([df, copies], ignore_index=True)
print(df)
Output:
Name Score
0 Alice 88
1 Bob 92
2 Charlie 76
3 Bob 92
4 Bob 92
5 Bob 92
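Using df.loc[[1]] (double brackets) returns a one-row DataFrame instead of a Series, which is what pd.concat() needs to append the copies as rows. Single brackets (df.loc[1]) return a Series, and concatenating Series joins their values end to end into one long Series rather than producing rows.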
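To make the Series-versus-DataFrame distinction concrete, the sketch below prints both return types and wraps the append step in a small helper. The helper name append_copies and its parameters are illustrative and not part of pandas.

```python
import pandas as pd

def append_copies(df, label, n):
    """Return a new DataFrame with n copies of the row at `label` appended."""
    row = df.loc[[label]]  # list selector keeps a one-row DataFrame
    return pd.concat([df] + [row] * n, ignore_index=True)

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [88, 92, 76]
})

print(type(df.loc[1]).__name__)    # Series
print(type(df.loc[[1]]).__name__)  # DataFrame

result = append_copies(df, label=1, n=3)
print(result)
```

The helper returns a new DataFrame and leaves the input unchanged, which keeps it safe to call repeatedly on the same source.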
Using np.repeat() for Bulk Replication
NumPy's repeat() function is a fast way to replicate every row of a DataFrame a specified number of times:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Score': [88, 92, 76]
})
# Replicate each row 3 times
df_replicated = pd.DataFrame(
np.repeat(df.values, repeats=3, axis=0),
columns=df.columns
)
print(df_replicated)
Output:
Name Score
0 Alice 88
1 Alice 88
2 Alice 88
3 Bob 92
4 Bob 92
5 Bob 92
6 Charlie 76
7 Charlie 76
8 Charlie 76
How it works:
- df.values gets the underlying NumPy array.
- np.repeat(..., repeats=3, axis=0) replicates each row 3 times along the row axis.
- The result is wrapped back into a DataFrame with the original column names.
np.repeat() is significantly faster than loop-based approaches for large DataFrames because it operates directly on the underlying array without Python-level iteration.
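One consequence of going through df.values is worth seeing directly. For a frame that mixes strings and numbers, the intermediate array has object dtype, so the rebuilt columns do too. The sketch below shows this and one possible way to restore the original dtypes with astype(). Whether that cast is appropriate depends on your data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [88, 92, 76]
})

# Rebuild a DataFrame from the repeated array
repeated = pd.DataFrame(
    np.repeat(df.values, repeats=3, axis=0),
    columns=df.columns
)
print(repeated.dtypes)  # Score is now object rather than an integer dtype

# Cast each column back to the dtype it had in the original frame
restored = repeated.astype(df.dtypes.to_dict())
print(restored.dtypes)
```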
Replicating Rows with Different Counts
You can replicate each row a different number of times by passing a list to repeats:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Score': [88, 92, 76]
})
# Replicate Alice 2 times, Bob 1 time, Charlie 3 times
counts = [2, 1, 3]
df_replicated = pd.DataFrame(
np.repeat(df.values, repeats=counts, axis=0),
columns=df.columns
)
print(df_replicated)
Output:
Name Score
0 Alice 88
1 Alice 88
2 Bob 92
3 Charlie 76
4 Charlie 76
5 Charlie 76
Using Index.repeat() for Type-Safe Replication
An alternative to np.repeat() that preserves DataFrame dtypes:
import pandas as pd
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Score': [88, 92, 76]
})
# Replicate each row 3 times
df_replicated = df.loc[df.index.repeat(3)].reset_index(drop=True)
print(df_replicated)
print(f"\nDtypes:\n{df_replicated.dtypes}")
Output:
Name Score
0 Alice 88
1 Alice 88
2 Alice 88
3 Bob 92
4 Bob 92
5 Bob 92
6 Charlie 76
7 Charlie 76
8 Charlie 76
Dtypes:
Name object
Score int64
dtype: object
df.index.repeat() is often preferred over np.repeat() because it preserves the original DataFrame dtypes. With np.repeat(), df.values is a single array, so a DataFrame that mixes numeric and string columns yields an object array, and every column of the rebuilt DataFrame comes back as object dtype.
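Index.repeat() also accepts one count per row, so the per-row replication shown earlier with np.repeat() can be done without losing dtypes. Note that df.loc[...] looks rows up by label, which assumes the index labels are unique. The positional variant below, which uses iloc with repeated row positions, is a sketch that avoids that assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [88, 92, 76]
})

# Alice 2 times, Bob 1 time, Charlie 3 times
counts = [2, 1, 3]

# Label-based: repeat each index label, then select by label
by_label = df.loc[df.index.repeat(counts)].reset_index(drop=True)

# Position-based: repeat row positions 0..n-1, then select by position
positions = np.repeat(np.arange(len(df)), counts)
by_position = df.iloc[positions].reset_index(drop=True)

print(by_label)
print(by_label.equals(by_position))  # True
```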
Copying a Row and Modifying Values
A common pattern is copying a row as a template and then changing specific values:
import pandas as pd
df = pd.DataFrame({
'Name': ['Alice', 'Bob'],
'Department': ['Engineering', 'Marketing'],
'Salary': [90000, 75000]
})
# Copy Alice's row as a template for a new employee
new_employee = df.loc[0].copy()
new_employee['Name'] = 'Charlie'
new_employee['Salary'] = 85000
df.loc[len(df)] = new_employee
print(df)
Output:
Name Department Salary
0 Alice Engineering 90000
1 Bob Marketing 75000
2 Charlie Engineering 85000
Charlie inherits Alice's department but has a different name and salary.
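When several new rows should start from the same template, one option is to turn the template row into a dictionary and merge per-row overrides into it. This is a minimal sketch: the overrides list and the name 'Dana' are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Department': ['Engineering', 'Marketing'],
    'Salary': [90000, 75000]
})

# Alice's row as a plain dict: column name -> value
template = df.loc[0].to_dict()

# Each dict lists only the fields that differ from the template
overrides = [
    {'Name': 'Charlie', 'Salary': 85000},
    {'Name': 'Dana'},  # keeps Alice's department and salary
]

# Later keys win, so the overrides replace the template values
new_rows = pd.DataFrame([{**template, **changes} for changes in overrides])
df = pd.concat([df, new_rows], ignore_index=True)
print(df)
```

Building all new rows first and concatenating once avoids growing the DataFrame one row at a time.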
Copying Rows Between DataFrames
To copy rows from one DataFrame to another:
import pandas as pd
df_source = pd.DataFrame({
'Product': ['Laptop', 'Phone', 'Tablet'],
'Price': [999, 699, 329]
})
df_target = pd.DataFrame({
'Product': ['Monitor'],
'Price': [249]
})
# Copy rows 0 and 2 from source to target
rows_to_copy = df_source.loc[[0, 2]].copy()
df_target = pd.concat([df_target, rows_to_copy], ignore_index=True)
print(df_target)
Output:
Product Price
0 Monitor 249
1 Laptop 999
2 Tablet 329
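The example above works because both DataFrames share the same columns. If the source has extra columns, pd.concat() would add them to the target and fill the gaps with NaN. One way to avoid that, sketched below, is to select only the target's columns while copying. The Stock column is a hypothetical addition for illustration.

```python
import pandas as pd

df_source = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [999, 699, 329],
    'Stock': [5, 12, 7],  # hypothetical extra column not present in the target
})
df_target = pd.DataFrame({
    'Product': ['Monitor'],
    'Price': [249]
})

# Pick rows 0 and 2, restricted to the columns the target already has
rows_to_copy = df_source.loc[[0, 2], df_target.columns]

df_target = pd.concat([df_target, rows_to_copy], ignore_index=True)
print(df_target)
```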
Practical Example: Data Augmentation
A real-world use case is duplicating minority class samples for imbalanced datasets:
import pandas as pd
df = pd.DataFrame({
'Feature': [1.0, 2.0, 3.0, 4.0, 5.0],
'Label': ['A', 'A', 'A', 'B', 'B']
})
print("Before augmentation:")
print(df['Label'].value_counts())
# Oversample the minority class (B) by appending extra copies of its rows
minority = df[df['Label'] == 'B']
minority_copies = pd.concat([minority] * 2, ignore_index=True) # 2 extra copies: B grows from 2 to 6 rows
df_balanced = pd.concat([df, minority_copies], ignore_index=True)
print("\nAfter augmentation:")
print(df_balanced['Label'].value_counts())
print(f"\n{df_balanced}")
Output:
Before augmentation:
Label
A 3
B 2
Name: count, dtype: int64
After augmentation:
Label
B 6
A 3
Name: count, dtype: int64
Feature Label
0 1.0 A
1 2.0 A
2 3.0 A
3 4.0 B
4 5.0 B
5 4.0 B
6 5.0 B
7 4.0 B
8 5.0 B
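In the output above, B ends up with 6 rows against A's 3, so the fixed multiplier overshoots rather than balances. A possible alternative, sketched here, is to sample each smaller class with replacement until it matches the largest class. The random_state value is arbitrary and only makes the run repeatable.

```python
import pandas as pd

df = pd.DataFrame({
    'Feature': [1.0, 2.0, 3.0, 4.0, 5.0],
    'Label': ['A', 'A', 'A', 'B', 'B']
})

counts = df['Label'].value_counts()
target = counts.max()  # size of the largest class

parts = [df]
for label, n in counts.items():
    if n < target:
        # Draw just enough duplicate rows to reach the target size
        extra = df[df['Label'] == label].sample(
            n=target - n, replace=True, random_state=0
        )
        parts.append(extra)

df_balanced = pd.concat(parts, ignore_index=True)
print(df_balanced['Label'].value_counts())
```

Sampling with replacement picks which rows to duplicate at random, so the duplicated rows can differ between runs unless random_state is fixed.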
Quick Comparison of Methods
| Method | Copies Specific Rows | Bulk Replication | Preserves Dtypes | Best For |
|---|---|---|---|---|
| loc[] + copy() | ✅ | ❌ | ✅ | Copying 1-2 rows to specific positions |
| pd.concat() | ✅ | ✅ | ✅ | Appending multiple copies |
| np.repeat() | ❌ (all rows) | ✅ | 🔶 (may lose dtypes) | Fast bulk replication |
| index.repeat() | ❌ (all rows) | ✅ | ✅ | Type-safe bulk replication |
Conclusion
Pandas offers several methods to copy rows, each suited to different scenarios:
- Use loc[] with copy() for copying individual rows to specific positions - simple and explicit.
- Use pd.concat() for appending multiple copies of rows to a DataFrame - flexible and clean.
- Use df.index.repeat() for replicating all rows efficiently while preserving data types - best for bulk operations.
- Use np.repeat() for maximum performance with large DataFrames when dtype preservation isn't critical.
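Assigning a row through .loc already writes independent values, so copy() is optional there. Use copy() when you keep a selected row or slice in a variable and modify it, so the original stays untouched. Prefer ignore_index=True with pd.concat() to maintain a clean sequential index.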