Skip to main content

How to Extract Columns from a Delimited File in Batch Script

Many data files, such as CSVs, log files, or server exports, are organized into columns separated by a delimiter like a comma (,), a semicolon (;), or a tab. Often, you only need one or two of these columns (e.g., just the "Email Address" from a user list). Column Extraction allows you to strip away redundant metadata and focus exclusively on the specific data points required for your next automation step.

In this guide, we will demonstrate how to use the for /f command to extract specific columns based on their index.

The Strategy: Tokens and Delims

The for /f command has two critical parameters for parsing columns:

  • delims: Defines what character separates your columns (e.g., delims=,).
  • tokens: Defines which column(s) you want to extract (e.g., tokens=2 for the second column).

Method 1: Simple CSV Extraction

Suppose you have a file users.csv with this format:

ID,Name,Email,Department
1001,Jane Smith,jane@example.com,Marketing
1002,Bob Jones,bob@example.com,Engineering
1003,Alice Park,alice@example.com,Finance

Implementation Script

@echo off
setlocal disabledelayedexpansion

set "Source=users.csv"
set "Output=EmailsOnly.txt"

:: Verify source file exists
if not exist "%Source%" (
echo [ERROR] Source file "%Source%" not found.
pause
exit /b 1
)

echo Extracting Column 3 (Emails) from "%Source%"...

:: 'skip=1' skips the header row
:: 'delims=,' tells Batch to split at every comma
:: 'tokens=3' captures the third segment
(
for /f "usebackq skip=1 tokens=3 delims=," %%A in ("%Source%") do (
set "col=%%A"
setlocal enabledelayedexpansion
echo(!col!
endlocal
)
) > "%Output%"

echo [SUCCESS] Email list saved to "%Output%".
pause
exit /b 0
tip

The skip=1 option is included to skip the header row (ID,Name,Email,Department). Without it, the word Email would appear as the first entry in the output file. Always use skip=1 when your delimited file has a header row.

Method 2: Extracting Multiple Specific Columns

You can extract multiple columns at once and rearrange them. For example, to get the ID and Email columns:

@echo off
setlocal disabledelayedexpansion

set "Source=users.csv"
set "Output=ID_Email_List.txt"

if not exist "%Source%" (
echo [ERROR] Source file "%Source%" not found.
pause
exit /b 1
)

echo Extracting ID and Email columns from "%Source%"...

:: tokens=1,3 captures column 1 into %%A and column 3 into %%B
:: Token variables are assigned to consecutive letters: %%A, %%B, %%C, etc.
(
for /f "usebackq skip=1 tokens=1,3 delims=," %%A in ("%Source%") do (
set "colA=%%A"
set "colB=%%B"
setlocal enabledelayedexpansion
echo(!colA!,!colB!
endlocal
)
) > "%Output%"

echo [SUCCESS] Extracted columns saved to "%Output%".
pause
exit /b 0
info

When you specify tokens=1,3, the for /f command assigns the first requested token to %%A and the next to %%B, regardless of the original column positions. The variable letters always advance sequentially from your starting letter, not from the token numbers.

Method 3: Extracting the Last Column with a Wildcard Token

When a file has a variable number of columns or you need everything from a certain column onward, the * wildcard token captures the remainder of the line.

@echo off
setlocal disabledelayedexpansion

set "Source=server_log.txt"

if not exist "%Source%" (
echo [ERROR] Source file "%Source%" not found.
pause
exit /b 1
)

echo Extracting timestamps and messages from "%Source%"...

:: tokens=1* captures column 1 into %%A and the entire remainder into %%B
:: This is useful when the last column may itself contain the delimiter
(
for /f "usebackq tokens=1* delims=," %%A in ("%Source%") do (
set "rest=%%B"
setlocal enabledelayedexpansion
echo(!rest!
endlocal
)
)

pause
exit /b 0
tip

The * wildcard in tokens=1* assigns everything after the first delimiter to %%B as a single unsplit string. This is essential when your last column may contain the delimiter character itself (e.g., a message field containing commas). Without *, the content after the first comma in that field would be split into additional tokens and partially lost.

Handling Missing Data (The "Empty Column" Trap)

One major limitation of for /f is that it treats multiple consecutive delimiters as a single one. If your CSV looks like John,,john@email.com (empty second column), Batch will skip the empty field and treat the third column (john@email.com) as the second token.

The Solution: PowerShell Bridge

If your file has empty columns or quoted fields containing the delimiter character, use PowerShell, which handles CSV formats correctly.

@echo off
setlocal

set "Source=data.csv"
set "Dest=emails.txt"

if not exist "%Source%" (
echo [ERROR] Source file "%Source%" not found.
pause
exit /b 1
)

echo Extracting Email column from "%Source%"...

:: Import-Csv correctly handles empty columns, quoted fields, and headers
powershell -NoProfile -Command ^
"Import-Csv -Path '%Source%' | " ^
"Select-Object -ExpandProperty 'Email' | " ^
"Set-Content -Path '%Dest%' -Encoding UTF8"

if %errorlevel% equ 0 (
echo [SUCCESS] Email list saved to "%Dest%".
) else (
echo [ERROR] Extraction failed.
pause
exit /b 1
)
pause
exit /b 0
warning

The for /f command cannot reliably parse CSV files that contain empty fields, quoted fields with embedded delimiters (e.g., "Doe, John"), or newlines within quoted values. These are all valid in the CSV standard (RFC 4180). If your data may contain any of these, use the PowerShell Import-Csv method, which handles the full CSV specification correctly.

Practical Uses for Column Extraction

  1. Inventory Management: Extracting only the "IP Address" column from a server list to feed into a ping-sweep script.
  2. Log Filtering: Getting just the "Timestamp" and "Error Code" from a complex server log.
  3. User Provisioning: Extracting "Username" and "Temporary Password" strings to generate automated welcome messages.

Best Practices

  1. Skip Headers: Use skip=1 in your for /f loop if your file has a header row that you do not want in the output.
  2. Verify Delimiters: If your data contains the delimiter character within the text (e.g., a comma inside a name like "Doe, John"), the Batch extraction will split the field incorrectly. Use the PowerShell method for such files.
  3. Verify Source File: Always check that the input file exists before processing. A missing file will cause the for /f loop to silently produce empty output.
  4. Token Limits: Batch can handle up to 31 tokens in a single for /f statement. For files with more than 31 columns, use PowerShell or extract in multiple passes.
  5. Consecutive Delimiters: The for /f command collapses consecutive delimiters into one. A line like A,,C parsed with delims=, produces only two tokens (A and C), not three. This silently shifts all subsequent column positions. If your data may have empty fields, use PowerShell.
  6. Special Characters: Use the delayed expansion toggle pattern when outputting extracted columns. Without it, column values containing &, |, >, <, or ! will corrupt the output or break the script.

Conclusion

Extracting columns is the first step in turning a heavy, multi-purpose file into a lean, targeted set of inputs. By mastering the tokens and delims logic of the for /f command, you transform the command line into a precise data-trimming tool, allowing your automation workflows to run faster and with much higher accuracy.