How to Count Characters in a File in Batch Script

Whether you are validating that a configuration setting does not exceed a specific length (like a BIOS password or a license key) or simply auditing the size of your log files, character counting is a fundamental task. While Windows has the find command to count lines, counting the exact number of characters requires a bit more logic.

In this guide, we will demonstrate three ways to count characters: the quick file size method, a line-by-line Batch loop, and a PowerShell call for encoding-aware counts.

Method 1: The "File Size" Shortcut (Fastest)

If your file is plain text encoded in a single-byte character set (ASCII or ANSI), the file size in bytes closely corresponds to the number of characters. Line breaks (\r\n) contribute 2 bytes per line. This is the most efficient method for large files when an approximate count is acceptable.

Implementation Script

@echo off
setlocal

set "target=Log.txt"

:: Verify file exists
if not exist "%target%" (
echo [ERROR] File "%target%" not found.
pause
exit /b 1
)

:: Use the ~z modifier to get the file size in bytes
for %%A in ("%target%") do set "bytes=%%~zA"

:: Count lines to calculate newline overhead
for /f %%C in ('find /c /v "" ^< "%target%"') do set "lines=%%C"

:: Each line break is 2 bytes (CR+LF) in Windows text files
:: The last line may or may not have a trailing newline
set /a "newlineBytes=lines * 2"
set /a "textChars=bytes - newlineBytes"

:: Guard against negative result if file has no trailing newline
if %textChars% lss 0 set "textChars=0"

echo File: "%target%"
echo Total bytes: %bytes%
echo Line breaks (estimated^): %lines%
echo Text characters (approx^): %textChars%
pause
exit /b 0

info

This method provides an approximation, not an exact character count. It assumes single-byte encoding (1 byte = 1 character) and standard Windows CRLF line endings (2 bytes each). For UTF-8 files with non-ASCII characters (which use 2–4 bytes per character) or UTF-16 files (which use 2 bytes per character plus a BOM), the byte count will not match the character count. Use Method 3 for encoding-aware counting.
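
To make the arithmetic concrete, here is a minimal sketch that builds a small scratch file and applies the same subtraction. The file name charcount_demo.txt and its location in %TEMP% are assumptions made only for this demonstration.

```batch
@echo off
setlocal

rem Hypothetical scratch file, used only for this demonstration
set "sample=%TEMP%\charcount_demo.txt"

rem Write three lines (abc, de, f); echo ends each one with CR+LF
> "%sample%" (
echo abc
echo de
echo f
)

rem Same steps as the script above: size in bytes, then line count
for %%A in ("%sample%") do set "bytes=%%~zA"
for /f %%C in ('find /c /v "" ^< "%sample%"') do set "lines=%%C"
set /a "textChars=bytes - lines * 2"

rem 6 text bytes + 3 line breaks * 2 bytes each = 12 bytes in total
echo Bytes: %bytes%  Lines: %lines%  Text characters: %textChars%

del "%sample%"
exit /b 0
```

With every line terminated by CR+LF the subtraction is exact. If the last line had no trailing line break, find would still count it as a line and the result would come out 2 too low, which is why the method is described as approximate.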

Method 2: Detailed String Counting (FOR Loop)

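If you need to count characters excluding line breaks and must stay in pure Batch, you can iterate through each line and measure its length. The count is exact for single-byte text. cmd.exe decodes the file using the active code page, so multi-byte encodings such as UTF-8 with non-ASCII characters are still better handled by Method 3.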

Implementation Script

@echo off
setlocal disabledelayedexpansion

set "target=config.ini"
set "totalChars=0"
set "lineCount=0"

:: Verify file exists
if not exist "%target%" (
echo [ERROR] File "%target%" not found.
pause
exit /b 1
)

echo Counting characters in "%target%"...

:: Read file line by line
for /f "usebackq delims=" %%L in ("%target%") do (
set "line=%%L"
setlocal enabledelayedexpansion

rem Call a sub-routine to measure the line without breaking the FOR loop.
rem The variable NAME is passed, not its value, so quotes, %, ^ and !
rem in the text are not re-parsed by CALL.
call :getLength line n

rem Pass the counts back to the parent scope
for /f "tokens=1,2" %%A in ("!n! !lineCount!") do (
endlocal
set /a "lineCount=%%B + 1"
set /a "totalChars+=%%A"
)
)

echo.
echo ==========================================
echo Lines counted: %lineCount%
echo Total text characters: %totalChars%
echo ==========================================
pause
exit /b 0

:: Function to calculate string length efficiently
:: %1 = name of the variable holding the string, %2 = name of the result variable
:: Probes offsets in descending powers of two; 4096 covers the 8191-character limit
:getLength
set "s=!%~1!"
set "len=0"
for %%P in (4096 2048 1024 512 256 128 64 32 16 8 4 2 1) do (
if "!s:~%%P,1!" neq "" (
set /a "len+=%%P"
set "s=!s:~%%P!"
)
)
if "!s:~0,1!" neq "" set /a "len+=1"
set "%~2=%len%"
exit /b

warning

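The for /f loop skips blank lines by design and also skips lines beginning with ; (the default eol character). Blank lines add nothing to the character total, but they are missing from the line count, and every character on a line that starts with ; (such as a comment in an .ini file) is left out of the total. For files where every line must be included in the count, use the PowerShell method (Method 3) instead.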
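
If the count has to stay in pure Batch, one common workaround is to read the file through findstr /n, which prefixes every line with its line number and a colon so that no line is empty or starts with ;. The sketch below is a variation on the script above, not a drop-in replacement, and it assumes the same config.ini input. It has not been tuned for very long lines or unusual encodings.

```batch
@echo off
setlocal disabledelayedexpansion

rem Assumption: same input file name as the script above
set "target=config.ini"
set "totalChars=0"
set "lineCount=0"

if not exist "%target%" (
echo [ERROR] File "%target%" not found.
exit /b 1
)

rem findstr /n emits "N:text", so blank lines and ;-lines are no longer skipped
for /f "delims=" %%L in ('findstr /n "^" "%target%"') do (
set "line=%%L"
setlocal enabledelayedexpansion
rem Strip everything up to and including the first colon (the line number)
set "line=!line:*:=!"
call :getLength line n
for /f "tokens=1,2" %%A in ("!n! !lineCount!") do (
endlocal
set /a "lineCount=%%B + 1"
set /a "totalChars+=%%A"
)
)

echo Lines counted: %lineCount%
echo Total text characters: %totalChars%
exit /b 0

rem %1 = name of the variable holding the string, %2 = name of the result variable
:getLength
set "s=!%~1!"
set "len=0"
rem An originally blank line leaves the variable undefined, so its length stays 0
if defined s (
for %%P in (4096 2048 1024 512 256 128 64 32 16 8 4 2 1) do (
if "!s:~%%P,1!" neq "" (
set /a "len+=%%P"
set "s=!s:~%%P!"
)
)
set /a "len+=1"
)
set "%~2=%len%"
exit /b
```

The trade-off is one extra findstr process per run. In exchange, the line count matches what a text editor would show.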

tip

The script uses the delayed expansion toggle pattern: each line is assigned with delayed expansion disabled (set "line=%%L") so that literal ! characters in the file content are not consumed during the assignment. Delayed expansion is then enabled for the measuring step. The :getLength subroutine does not strip one character at a time. It probes the string at offsets in descending powers of two, so each line needs only about a dozen substring tests regardless of its length.
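
The effect of the toggle is easy to observe in isolation. The following sketch uses a made-up sample string and is meant only to show why the assignment has to happen before delayed expansion is switched on.

```batch
@echo off
setlocal disabledelayedexpansion

rem With delayed expansion off, the exclamation marks are stored literally
set "line=Wait! Stop!"

setlocal enabledelayedexpansion
rem Reading the variable through !line! returns the stored text intact
echo Stored text: !line!

rem A literal typed while expansion is on is parsed for !name! pairs,
rem so the marks and the text between them are consumed
echo Parsed text: Wait! Stop!
endlocal

exit /b 0
```

The first echo shows the text unchanged, while the second loses the exclamation marks. This is the same loss that would affect file content if set "line=%%L" ran with delayed expansion already enabled.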

Method 3: PowerShell (Encoding-Aware)

PowerShell can count characters, lines, and words in one command. Unlike the Batch-only methods, it decodes the file's text encoding before counting.

Implementation Script

@echo off
setlocal

set "target=file.txt"

:: Verify file exists
if not exist "%target%" (
echo [ERROR] File "%target%" not found.
pause
exit /b 1
)

echo Counting characters in "%target%"...

:: Measure-Object -Character counts all characters per line
:: Adding -Line and -Word provides a complete summary
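:: Continuation lines are indented on purpose: a caret at the end of a line
:: also escapes the first character of the next line, which must not be a quote
powershell -NoProfile -Command ^
    "$stats = Get-Content -Path '%target%' | Measure-Object -Character -Word -Line; " ^
    "Write-Host (' Lines: ' + $stats.Lines); " ^
    "Write-Host (' Words: ' + $stats.Words); " ^
    "Write-Host (' Characters: ' + $stats.Characters)"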

if %errorlevel% neq 0 (
echo [ERROR] Character count failed.
pause
exit /b 1
)
pause
exit /b 0

info

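The Measure-Object -Character cmdlet counts the characters in each string that Get-Content emits. Because Get-Content strips line endings, the total covers all visible characters and spaces but excludes the line-ending characters themselves. Get-Content recognizes UTF-8 and UTF-16 from a byte order mark (BOM). In Windows PowerShell 5.1, which is what the powershell command launches, a file without a BOM is read using the system ANSI code page, so BOM-less UTF-8 files that contain non-ASCII characters need -Encoding UTF8 added to Get-Content to be counted correctly.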
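
As a minimal sketch of the encoding point above, the following variant states the encoding explicitly. It assumes file.txt is UTF-8 without a BOM. The second command uses Get-Content -Raw, which returns the file as one string, to show a count that includes line endings.

```batch
@echo off
setlocal

rem Assumption: file.txt is saved as UTF-8 without a BOM
set "target=file.txt"

if not exist "%target%" (
echo [ERROR] File "%target%" not found.
exit /b 1
)

rem Text characters only: Get-Content splits on line endings and drops them
powershell -NoProfile -Command "(Get-Content -Path '%target%' -Encoding UTF8 | Measure-Object -Character).Characters"

rem All characters, line endings included: -Raw reads the file as a single string
powershell -NoProfile -Command "(Get-Content -Path '%target%' -Encoding UTF8 -Raw).Length"

exit /b 0
```

Both numbers are counts of .NET string characters (UTF-16 code units), so a character outside the Basic Multilingual Plane, such as many emoji, contributes 2 to either total.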

Why Count Characters in Batch?

  1. Input Validation: Ensuring that a user-provided description does not exceed the character limit of a database field.
  2. Size Monitoring: Checking if a log file has grown significantly larger than its usual baseline character count.
  3. Data Scrubbing: Identifying files that technically contain line breaks (2 bytes each) but no actual text characters.

Best Practices

  1. Choose the Right Method: Use Method 1 for quick byte-based estimates on single-byte encoded files. Use Method 2 when you need an exact count in a pure Batch environment. Use Method 3 for production work where accuracy, encoding awareness, and performance all matter.
  2. Account for Newlines: In Windows text files, every line break consists of 2 hidden characters (\r\n). Decide whether these invisible characters should count toward your total based on your use case. Methods 2 and 3 exclude them by default.
  3. Encoding Awareness: If your file is saved in UTF-16, every character takes up at least 2 bytes (plus a 2-byte BOM header). Method 1 will report roughly double the number of perceived characters. If your file uses UTF-8 with non-ASCII characters, multi-byte sequences will inflate the byte count beyond the actual character count. Method 3 decodes files that carry a BOM automatically; for BOM-less UTF-8, add -Encoding UTF8 to Get-Content when running under Windows PowerShell.
  4. Verify Source File: Always check that the input file exists before processing. Method 1 will silently report an empty or zero size, and Method 2 will produce a zero count with no error, both of which can be mistaken for valid results on an empty file.
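  5. Performance: In Method 1, reading the size with %%~zA is instantaneous regardless of file size, although the line count from find still has to read the whole file, and set /a works with 32-bit signed integers, so the subtraction overflows for files of 2 GB or more. Method 2 is slow on large files because every line costs a subroutine call plus a setlocal/endlocal pair. Prefer Method 1 or Method 3 for production work.
  6. Blank Lines: Method 2 skips blank lines due to for /f behavior. Blank lines contain no characters, so the character total is unaffected, but the reported line count will be lower than the real one. Lines that begin with ; are also skipped, and their characters are lost from the total. Method 3 includes all lines in its count.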
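
The practices above can be combined into a small validation check. The sketch below captures the PowerShell character count in a Batch variable and compares it against a limit. The file name description.txt, the limit of 255, and the exit codes are hypothetical values chosen only for illustration.

```batch
@echo off
setlocal

rem Hypothetical file name and limit, chosen only for illustration
set "target=description.txt"
set "maxChars=255"

rem Verify the source file first so a missing file is not mistaken for an empty one
if not exist "%target%" (
echo [ERROR] File "%target%" not found.
exit /b 1
)

rem Capture the single number PowerShell prints into a Batch variable
set "chars="
for /f %%N in ('powershell -NoProfile -Command "(Get-Content -Path '%target%' | Measure-Object -Character).Characters"') do set "chars=%%N"

rem If nothing was printed, treat it as a failure instead of a zero count
if not defined chars (
echo [ERROR] Could not read a character count.
exit /b 1
)

if %chars% gtr %maxChars% (
echo [FAIL] %chars% characters exceeds the limit of %maxChars%.
exit /b 2
)

echo [OK] %chars% characters is within the limit of %maxChars%.
exit /b 0
```

The pipe inside the PowerShell command is protected from cmd.exe by the surrounding double quotes, which is why the whole command is kept on one line here.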

Conclusion

Counting characters is a versatile task that ranges from quick file audits to precise string validation. By choosing between the high-performance file size method, the detailed string loop, and the encoding-aware PowerShell bridge, you can build scripts that handle text data with the appropriate level of precision. This granularity is essential for ensuring that your automation tools remain robust and compatible with the systems they manage.