Python Pandas: How to Skip Rows While Reading a CSV File Using Pandas

When reading CSV files, you often encounter rows that need to be excluded: header comments, metadata lines, blank rows, or data that doesn't meet certain criteria. The Pandas read_csv() function provides flexible parameters to skip rows during file loading, so you don't have to remove them afterward. This guide covers the common row-skipping techniques with practical examples and outputs.

Key Parameters for Skipping Rows

The read_csv() function has two primary parameters for skipping rows:

Parameter | Description | Accepts
skiprows | Number of rows to skip from the top of the file, or the 0-based positions of the rows to skip | Integer, list of integers, or callable function
skipfooter | Number of rows to skip from the bottom of the file | Integer (requires engine='python')

Sample CSV File

All examples reference a file called students.csv with the following content:

students.csv
Name,Age,City,Score
Alice,20,New York,88
Bob,22,Chicago,92
Charlie,21,Boston,78
Diana,23,Seattle,95
Eve,20,Austin,81
Frank,24,Denver,89
Grace,22,Miami,93
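
If you want to follow along, the sample file can be created with a few lines of Python. This is only a convenience sketch: it writes students.csv into the current working directory, which is an assumption about where you run the examples.

```python
from pathlib import Path

# Same content as the listing above.
csv_text = """Name,Age,City,Score
Alice,20,New York,88
Bob,22,Chicago,92
Charlie,21,Boston,78
Diana,23,Seattle,95
Eve,20,Austin,81
Frank,24,Denver,89
Grace,22,Miami,93
"""

# Writes (or overwrites) students.csv in the current working directory.
Path("students.csv").write_text(csv_text, encoding="utf-8")
```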

Loading it without skipping any rows:

import pandas as pd

df = pd.read_csv('students.csv')
print(df)

Output:

      Name  Age      City  Score
0    Alice   20  New York     88
1      Bob   22   Chicago     92
2  Charlie   21    Boston     78
3    Diana   23   Seattle     95
4      Eve   20    Austin     81
5    Frank   24    Denver     89
6    Grace   22     Miami     93

Method 1: Skip N Rows from the Top

Pass an integer to skiprows to skip that many rows from the beginning of the file. Note that this counts from row 0 (the header row), so skipping rows also removes the header:

import pandas as pd

# Skip the first 3 rows (including the header)
df = pd.read_csv('students.csv', skiprows=3)
print(df)

Output:

   Charlie  21   Boston  78
0    Diana  23  Seattle  95
1      Eve  20   Austin  81
2    Frank  24   Denver  89
3    Grace  22    Miami  93

The original header (Name, Age, City, Score) and the first two data rows are skipped. The third data row (Charlie) is incorrectly used as the header.

warning

When using skiprows with an integer, the header row (row 0) is also counted. If you want to keep the header and skip only data rows, use a list or range that excludes row 0 (see Method 3 below).
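
An integer skiprows is the right tool when the lines before the header are not data at all. The sketch below assumes a hypothetical export with two metadata lines above the real header; the file content is invented for illustration and is read from an in-memory string rather than from students.csv.

```python
import io

import pandas as pd

# Hypothetical export: two metadata lines precede the real header.
raw = (
    "Report: student scores\n"
    "Source: registrar export\n"
    "Name,Age,City,Score\n"
    "Alice,20,New York,88\n"
    "Bob,22,Chicago,92\n"
)

# skiprows=2 drops the two metadata lines; the next line becomes the header.
df = pd.read_csv(io.StringIO(raw), skiprows=2)
print(df.columns.tolist())
print(len(df))
```

Here the integer form does what you want because the header is not among the skipped lines.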

Method 2: Skip Rows at Specific Positions

Pass a list of row indices to skip specific rows. Row 0 is the header:

import pandas as pd

# Skip file rows 2 and 5; the header (row 0) is not in the list, so it is kept
df = pd.read_csv('students.csv', skiprows=[2, 5])
print(df)

Output:

      Name  Age      City  Score
0    Alice   20  New York     88
1  Charlie   21    Boston     78
2    Diana   23   Seattle     95
3    Frank   24    Denver     89
4    Grace   22     Miami     93

Rows at positions 2 (Bob) and 5 (Eve) in the file are skipped. The header (row 0) is preserved because it is not in the skip list.

Method 3: Skip N Data Rows While Keeping the Header

To skip data rows but preserve the column names, create a list of row indices that starts at 1 (the first data row) instead of 0 (the header):

import pandas as pd

# Skip the first 2 data rows (rows 1 and 2), keep the header (row 0)
df = pd.read_csv('students.csv', skiprows=[1, 2])
print(df)

Output:

      Name  Age     City  Score
0  Charlie   21   Boston     78
1    Diana   23  Seattle     95
2      Eve   20   Austin     81
3    Frank   24   Denver     89
4    Grace   22    Miami     93

Using a range for more rows:

import pandas as pd

# Skip data rows 1 through 3, keep header
df = pd.read_csv('students.csv', skiprows=range(1, 4))
print(df)

Output:

    Name  Age     City  Score
0  Diana   23  Seattle     95
1    Eve   20   Austin     81
2  Frank   24   Denver     89
3  Grace   22    Miami     93

tip

Use skiprows=range(1, N+1) to skip the first N data rows while keeping the header intact. This is a common pattern when the file has a regular header row followed by data.

Method 4: Skip Rows Based on a Condition (Callable)

The skiprows parameter also accepts a callable function that receives the row index and returns True to skip or False to keep. This enables conditional row skipping:

Skip Every 3rd Row

import pandas as pd

# Skip every row whose index is divisible by 3, except the header (row 0)
df = pd.read_csv('students.csv', skiprows=lambda x: x % 3 == 0 and x != 0)
print(df)

Output:

    Name  Age      City  Score
0  Alice   20  New York     88
1    Bob   22   Chicago     92
2  Diana   23   Seattle     95
3    Eve   20    Austin     81
4  Grace   22     Miami     93

Rows at positions 3 (Charlie) and 6 (Frank) are skipped. The condition x != 0 preserves the header.
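
The same idea can be wrapped in a small factory function when you want to downsample a file by position. In the sketch below, keep_every_nth is a hypothetical helper (not part of pandas) that returns a callable suitable for skiprows; the data is read from an in-memory copy of the sample file.

```python
import io

import pandas as pd

raw = """Name,Age,City,Score
Alice,20,New York,88
Bob,22,Chicago,92
Charlie,21,Boston,78
Diana,23,Seattle,95
Eve,20,Austin,81
Frank,24,Denver,89
Grace,22,Miami,93
"""


def keep_every_nth(n):
    """Hypothetical helper: keep the header and every n-th data row."""
    def should_skip(i):
        # i is the 0-based line position; line 0 is the header.
        return i > 0 and (i - 1) % n != 0
    return should_skip


# Keep data rows 1, 3, 5, 7 and skip the rest.
df = pd.read_csv(io.StringIO(raw), skiprows=keep_every_nth(2))
print(df["Name"].tolist())
```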

Skip Rows Based on Content (Two-Pass Approach)

The callable only receives the row index, not the row content. To skip rows based on their values, you need a two-pass approach:

import pandas as pd

# First pass: load everything
df = pd.read_csv('students.csv')

# Second pass: filter out rows where Score < 85
df = df[df['Score'] >= 85]
print(df.reset_index(drop=True))

Output:

    Name  Age      City  Score
0  Alice   20  New York     88
1    Bob   22   Chicago     92
2  Diana   23   Seattle     95
3  Frank   24    Denver     89
4  Grace   22     Miami     93
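
When the unwanted lines can be recognized from their raw text, another option is to scan the file yourself first and hand the resulting positions to skiprows. The following is a minimal sketch under stated assumptions: the annotation lines starting with "#" are invented for illustration, and the approach assumes no quoted field spans multiple lines, so that text lines and CSV rows line up.

```python
import io

import pandas as pd

# Hypothetical file with annotation lines mixed into the data.
raw = (
    "Name,Age,City,Score\n"
    "Alice,20,New York,88\n"
    "# re-test pending\n"
    "Bob,22,Chicago,92\n"
    "# transferred\n"
    "Charlie,21,Boston,78\n"
)

# First pass: collect the 0-based positions of the lines to drop.
skip = [i for i, line in enumerate(raw.splitlines()) if line.startswith("#")]

# Second pass: let read_csv skip exactly those positions.
df = pd.read_csv(io.StringIO(raw), skiprows=skip)
print(skip)
print(df["Name"].tolist())
```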

Method 5: Skip Rows from the End of the File

Use skipfooter to skip a specified number of rows from the bottom of the file. This requires setting engine='python':

import pandas as pd

# Skip the last 3 rows
df = pd.read_csv('students.csv', skipfooter=3, engine='python')
print(df)

Output:

      Name  Age      City  Score
0    Alice   20  New York     88
1      Bob   22   Chicago     92
2  Charlie   21    Boston     78
3    Diana   23   Seattle     95

The last three rows (Eve, Frank, Grace) are excluded.

info

skipfooter is only supported by the Python engine; the default C engine does not implement it. If you leave out engine='python', pandas falls back to the Python engine and emits a ParserWarning. The Python engine is slower for large files, so use skipfooter only when necessary.
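
A typical use for skipfooter is a summary line appended by an export tool. The sketch below assumes a hypothetical file that ends with a "Total rows" line; the content is invented for illustration and read from an in-memory string.

```python
import io

import pandas as pd

# Hypothetical export that ends with a summary line instead of data.
raw = (
    "Name,Age,City,Score\n"
    "Alice,20,New York,88\n"
    "Bob,22,Chicago,92\n"
    "Total rows: 2"
)

# Drop the final line so it is not parsed as a data row.
df = pd.read_csv(io.StringIO(raw), skipfooter=1, engine="python")
print(df["Name"].tolist())
print(df["Score"].sum())
```

Without skipfooter, the summary line would be read as an extra row with missing values.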

Combining skiprows and nrows

You can combine skiprows with nrows to read a specific window of rows from the file:

import pandas as pd

# Skip the first 2 data rows, then read only the next 3 rows
df = pd.read_csv('students.csv', skiprows=range(1, 3), nrows=3)
print(df)

Output:

      Name  Age     City  Score
0  Charlie   21   Boston     78
1    Diana   23  Seattle     95
2      Eve   20   Austin     81

This is useful for reading specific sections of very large files without loading everything into memory.
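
If you read windows like this often, the pattern can be wrapped in a small function. In this sketch, read_window and its start and count parameters are hypothetical names chosen for illustration; it assumes the file has a single header line followed by data.

```python
import io

import pandas as pd

raw = """Name,Age,City,Score
Alice,20,New York,88
Bob,22,Chicago,92
Charlie,21,Boston,78
Diana,23,Seattle,95
Eve,20,Austin,81
Frank,24,Denver,89
Grace,22,Miami,93
"""


def read_window(source, start, count):
    """Hypothetical helper: skip `start` data rows, then read `count` rows."""
    # range(1, start + 1) covers data rows only, so the header (row 0) is kept.
    return pd.read_csv(source, skiprows=range(1, start + 1), nrows=count)


window = read_window(io.StringIO(raw), start=2, count=3)
print(window["Name"].tolist())
```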

Common Mistake: Accidentally Skipping the Header

A frequent error is using skiprows=1 intending to skip the first data row, but actually skipping the header:

import pandas as pd

# WRONG: skiprows=1 skips the header row (row 0)
df = pd.read_csv('students.csv', skiprows=1)
print(df)

Output:

     Alice  20  New York  88
0      Bob  22   Chicago  92
1  Charlie  21    Boston  78
2    Diana  23   Seattle  95
3      Eve  20    Austin  81
4    Frank  24    Denver  89
5    Grace  22     Miami  93

The first data row (Alice) is now incorrectly used as the header.

The correct approach to skip the first data row while keeping the header:

import pandas as pd

# CORRECT: skip row 1 (first data row), keep row 0 (header)
df = pd.read_csv('students.csv', skiprows=[1])
print(df)

Output:

      Name  Age     City  Score
0      Bob   22  Chicago     92
1  Charlie   21   Boston     78
2    Diana   23  Seattle     95
3      Eve   20   Austin     81
4    Frank   24   Denver     89
5    Grace   22    Miami     93

danger

When skiprows is an integer (e.g., skiprows=2), it skips the first 2 rows starting from row 0, which includes the header. When skiprows is a list (e.g., skiprows=[2]), it skips only the specific row at that position. Always use a list when you want to preserve the header.
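
The difference is easy to check by comparing the column names each form produces. This sketch reads an in-memory copy of the sample file twice, once with skiprows=2 and once with skiprows=[2]; the variable names are chosen for illustration.

```python
import io

import pandas as pd

raw = """Name,Age,City,Score
Alice,20,New York,88
Bob,22,Chicago,92
Charlie,21,Boston,78
Diana,23,Seattle,95
Eve,20,Austin,81
Frank,24,Denver,89
Grace,22,Miami,93
"""

# Integer: drops rows 0 and 1, so the Bob line is promoted to header.
as_int = pd.read_csv(io.StringIO(raw), skiprows=2)

# List: drops only row 2 (Bob); the real header is kept.
as_list = pd.read_csv(io.StringIO(raw), skiprows=[2])

print(as_int.columns.tolist())
print(as_list.columns.tolist())
```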

Quick Reference

Goal | Code
Skip first N rows (including header) | pd.read_csv('file.csv', skiprows=N)
Skip first N data rows (keep header) | pd.read_csv('file.csv', skiprows=range(1, N+1))
Skip specific rows by position | pd.read_csv('file.csv', skiprows=[2, 5, 8])
Skip rows based on a condition | pd.read_csv('file.csv', skiprows=lambda x: condition)
Skip last N rows | pd.read_csv('file.csv', skipfooter=N, engine='python')
Read a specific window of rows | pd.read_csv('file.csv', skiprows=range(1, 4), nrows=5)

The skiprows and skipfooter parameters give you precise control over which rows are loaded from a CSV file. Whether you need to skip metadata headers, remove specific records, or skip rows by a position-based condition, these parameters let you drop unwanted rows at load time. Filtering on row content still requires a step after loading.