How to Convert TSV to Excel Using Pandas in Python
Tab-Separated Values (TSV) files are widely used in scientific computing, data exports, and technical workflows because tabs rarely conflict with actual data content the way commas often do in CSV files. However, when sharing data with stakeholders who prefer visual spreadsheet tools, converting TSV to Excel format (.xlsx) becomes essential.
In this guide, you will learn how to perform this conversion efficiently, from quick two-line solutions to streaming approaches for handling massive datasets that do not fit in memory.
Quick Conversion with Pandas
Pandas provides the most straightforward approach for converting TSV files to Excel. The read_csv() function handles TSV files when you specify the tab character as the separator, and to_excel() writes the result to an .xlsx file:
import pandas as pd
# Read TSV file (tab is specified as the separator)
df = pd.read_csv("data.tsv", sep="\t")
# Export to Excel format
df.to_excel("output.xlsx", index=False)
print(f"Converted {len(df)} rows to Excel successfully!")
Why index=False? By default, Pandas writes the DataFrame's row index (0, 1, 2, ...) as an extra first column in the Excel output. Setting index=False produces a cleaner spreadsheet containing only your original data columns.
Customizing the Excel Output
Pandas offers several parameters to control how the Excel file is generated, including custom sheet names, frozen header rows, and handling of missing values:
import pandas as pd
df = pd.read_csv("sales_data.tsv", sep="\t")
df.to_excel(
    "formatted_report.xlsx",
    index=False,
    sheet_name="Sales Report",  # Custom sheet name instead of "Sheet1"
    freeze_panes=(1, 0),        # Freeze the header row for scrolling
    na_rep="N/A"                # Display missing values as "N/A"
)
The freeze_panes=(1, 0) parameter is particularly useful for large spreadsheets, as it keeps the header row visible while the recipient scrolls through the data.
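One way to confirm that these options took effect is to read the workbook back with openpyxl. The following is a minimal sketch rather than part of the conversion itself: the sample rows, the temporary output folder, and the variable names are assumptions made for illustration.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd
from openpyxl import load_workbook

# Hypothetical sample data; the second row has a missing revenue value
sample_tsv = "region\trevenue\nNorth\t1200.5\nSouth\t\n"
df = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# Write to a temporary folder so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "formatted_report.xlsx"
    df.to_excel(
        out_path,
        index=False,
        sheet_name="Sales Report",
        freeze_panes=(1, 0),
        na_rep="N/A",
        engine="openpyxl",
    )

    # Reopen the file and inspect what was actually written
    workbook = load_workbook(out_path)
    sheet = workbook["Sales Report"]
    frozen_at = sheet.freeze_panes    # top-left cell of the scrollable area
    missing_cell = sheet["B3"].value  # the missing revenue for "South"
    workbook.close()

print(frozen_at, missing_cell)
```

With the header in row 1 frozen, the scrollable area starts at cell A2, and the empty revenue field is stored as the text "N/A".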
Combining Multiple TSV Files into One Workbook
When you have several TSV files that belong together, such as quarterly reports, you can combine them into a single Excel workbook with each file on its own sheet:
import pandas as pd
from pathlib import Path
tsv_files = ["q1_data.tsv", "q2_data.tsv", "q3_data.tsv"]
with pd.ExcelWriter("quarterly_report.xlsx", engine="openpyxl") as writer:
    for file in tsv_files:
        df = pd.read_csv(file, sep="\t")
        # Use the filename without extension as the sheet name
        sheet_name = Path(file).stem
        df.to_excel(writer, sheet_name=sheet_name, index=False)
print("All files combined into a single workbook!")
The pd.ExcelWriter context manager keeps the workbook open while you write multiple sheets, then saves and closes it automatically when the with block ends.
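Deriving sheet names from filenames can fail for longer or unusual names, because Excel limits worksheet names to 31 characters and disallows the characters \ / ? * [ ] and :. The helper below is a hypothetical sketch (safe_sheet_name is not a Pandas function) showing one way to sanitize a name before passing it to to_excel():

```python
import re
from pathlib import Path

# Characters Excel does not allow in worksheet names
_INVALID_SHEET_CHARS = re.compile(r"[\[\]:*?/\\]")
MAX_SHEET_NAME_LENGTH = 31


def safe_sheet_name(file_path, fallback="Sheet"):
    """Derive an Excel-compatible worksheet name from a file path."""
    name = Path(file_path).stem
    # Replace forbidden characters with underscores
    name = _INVALID_SHEET_CHARS.sub("_", name)
    # Truncate to the length limit, then drop leading/trailing apostrophes
    name = name[:MAX_SHEET_NAME_LENGTH].strip("'")
    return name or fallback


print(safe_sheet_name("q1_data.tsv"))
print(safe_sheet_name("sales[2024].tsv"))
```

Two long filenames that share their first 31 characters would still collide after truncation, so a real pipeline may also need to check for duplicate names.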
If your TSV contains international characters or special symbols, specify the encoding explicitly to prevent errors:
df = pd.read_csv("data.tsv", sep="\t", encoding="utf-8")
Common alternatives include "latin-1" for Western European data and "utf-16" for files exported from some Windows applications.
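When the encoding of a file is unknown, a rough guess can be made from its bytes before calling read_csv(). The function below is a heuristic sketch under stated assumptions: guess_encoding is a hypothetical helper, it only distinguishes the three encodings mentioned above (plus UTF-8 with a byte order mark), and it falls back to "latin-1" because that codec accepts any byte sequence.

```python
import codecs


def guess_encoding(path, chunk_size=1 << 20):
    """Guess an encoding name to pass to read_csv (heuristic only)."""
    with open(path, "rb") as handle:
        head = handle.read(4)
        # A byte order mark identifies UTF-16 or BOM-prefixed UTF-8 files
        if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
            return "utf-16"
        if head.startswith(codecs.BOM_UTF8):
            return "utf-8-sig"
        # No BOM: check chunk by chunk whether the file is valid UTF-8
        decoder = codecs.getincrementaldecoder("utf-8")()
        handle.seek(0)
        try:
            while chunk := handle.read(chunk_size):
                decoder.decode(chunk)
            decoder.decode(b"", final=True)
        except UnicodeDecodeError:
            # Not valid UTF-8; latin-1 can decode any bytes
            return "latin-1"
    return "utf-8"
```

The result can then be passed along, for example pd.read_csv("data.tsv", sep="\t", encoding=guess_encoding("data.tsv")). A successful decode does not prove the guess is right, so it is worth spot-checking a few non-ASCII values.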
Streaming Approach for Large Files
For TSV files too large to fit in memory, loading the entire file into a Pandas DataFrame is not feasible. Instead, use XlsxWriter's constant memory mode to process the file row by row:
import csv
import xlsxwriter
# Create workbook with constant_memory mode for streaming writes
workbook = xlsxwriter.Workbook("large_output.xlsx", {"constant_memory": True})
worksheet = workbook.add_worksheet()
# Process the TSV line by line without loading the entire file
with open("massive_data.tsv", "r", encoding="utf-8", newline="") as tsv_file:
    reader = csv.reader(tsv_file, delimiter="\t")
    for row_num, row_data in enumerate(reader):
        worksheet.write_row(row_num, 0, row_data)
workbook.close()
print("Large file converted successfully!")
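The {"constant_memory": True} option tells XlsxWriter to flush each row to disk once the next row is started rather than holding the entire spreadsheet in memory, so rows must be written in sequential order, as the loop above does. Memory use stays low regardless of input size, but a single worksheet is still capped at Excel's limit of 1,048,576 rows and 16,384 columns. Because csv.reader yields every field as a string, numeric values are stored as text unless you convert them yourself or enable XlsxWriter's strings_to_numbers workbook option.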
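For inputs that may exceed the 1,048,576-row limit of one worksheet, the rows can be spread across several sheets. The generator below is a minimal sketch of that bookkeeping only; assign_sheets is a hypothetical helper, and it does not repeat the header row on later sheets.

```python
EXCEL_MAX_ROWS = 1_048_576  # row limit of a single .xlsx worksheet


def assign_sheets(rows, max_rows=EXCEL_MAX_ROWS):
    """Yield (sheet_index, row_index, row) so no sheet exceeds max_rows."""
    for count, row in enumerate(rows):
        # Every max_rows rows, move to the next sheet and restart at row 0
        sheet_index, row_index = divmod(count, max_rows)
        yield sheet_index, row_index, row


# Demonstration with a tiny limit so the split is easy to see
for placement in assign_sheets(["a", "b", "c", "d", "e"], max_rows=2):
    print(placement)
```

In the streaming loop, you would iterate over assign_sheets(reader), call workbook.add_worksheet() whenever sheet_index moves past the current sheet, and pass row_index to worksheet.write_row().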
Library Comparison
| Approach | Speed | Memory Usage | Best For |
|---|---|---|---|
| Pandas | Fast | Higher | Most conversions, data manipulation |
| Pandas + openpyxl | Fast | Higher | Multiple sheets, basic formatting |
| XlsxWriter streaming | Moderate | Low | Very large files, custom cell formatting |
Pandas requires an Excel engine to write .xlsx files. Install the necessary packages before running these examples:
pip install pandas openpyxl
For the streaming approach, install XlsxWriter instead:
pip install xlsxwriter
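To check which of these packages are already available in the current environment, a small script using only the standard library is enough. This is an illustrative sketch; installed_excel_engines is a hypothetical helper name, not part of Pandas.

```python
from importlib.util import find_spec


def installed_excel_engines(candidates=("openpyxl", "xlsxwriter")):
    """Return the candidate engine packages that can be imported."""
    return [name for name in candidates if find_spec(name) is not None]


print(installed_excel_engines())
```

An empty list means neither engine is installed, and to_excel() will fail until one of them is added.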
Reusable Conversion Function
Here is a versatile function that handles common conversion scenarios and can be integrated into any data pipeline:
import pandas as pd
from pathlib import Path
def tsv_to_excel(input_path, output_path=None, **kwargs):
    """
    Convert a TSV file to Excel format.

    Args:
        input_path: Path to the TSV file.
        output_path: Path for Excel output (auto-generated if None).
        **kwargs: Additional arguments passed to to_excel().

    Returns:
        A summary string with the number of rows converted.
    """
    input_path = Path(input_path)
    if output_path is None:
        output_path = input_path.with_suffix(".xlsx")
    df = pd.read_csv(input_path, sep="\t", encoding="utf-8")
    # Default to index=False, but let callers override it through kwargs
    kwargs.setdefault("index", False)
    df.to_excel(output_path, **kwargs)
    return f"Converted {len(df)} rows to {output_path}"
# Usage examples
print(tsv_to_excel("quarterly_sales.tsv"))
print(tsv_to_excel("report.tsv", "final_report.xlsx", sheet_name="Q4 Sales"))
Example output (row counts depend on your input files):
Converted 1500 rows to quarterly_sales.xlsx
Converted 820 rows to final_report.xlsx
The **kwargs parameter passes any additional arguments directly to to_excel(), so you can use all the customization options like sheet_name, freeze_panes, and na_rep without modifying the function itself.
By selecting the appropriate method based on your file size and formatting requirements, you can efficiently automate TSV-to-Excel conversions in any data pipeline. Use Pandas for most conversions where data fits comfortably in memory, and switch to the XlsxWriter streaming approach when working with files that exceed available RAM.