How to Change the Command Prompt's Code Page in Batch Script
The Windows command prompt, by default, uses a legacy character encoding system known as a code page. A code page is a table of characters where each character is mapped to a specific number. This system can cause problems when your batch script needs to display or process text files containing special characters (like é, ü, ñ) or symbols (like € or █) that the active code page either does not include or maps to different numbers.
This guide will teach you how to use the built-in CHCP (Change Code Page) command to change the active character encoding for your script. You will learn the most important code page numbers, including the one for UTF-8, and understand both the power and the limitations of this command.
What is a Code Page?
A code page is a "decoder ring" for text. A text file is just a series of numbers, and the code page tells the command prompt how to interpret those numbers and which character to draw on the screen.
If you use the wrong code page, text can appear garbled (a phenomenon known as "mojibake"). The code pages you are most likely to encounter are:
- Code Page 437: The default OEM code page on US English systems.
- Code Page 1252: The default ANSI code page for Windows in Western languages.
- Code Page 65001: The code page for UTF-8, the modern, universal standard.
The Core Command: CHCP
The CHCP (Change Code Page) command is the standard utility for viewing and changing the active code page for the current command prompt session.
Syntax:
- To view the current code page:
CHCP
- To change the code page:
CHCP <CodePageNumber>
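As a minimal sketch of the syntax in use, the snippet below tries to switch to a code page and reports whether the switch worked. It assumes that CHCP sets a non-zero ERRORLEVEL when the requested code page is invalid or not installed, which is worth verifying on your own system. The variable name TargetCP is purely illustrative.

```batch
@ECHO OFF
SETLOCAL
REM TargetCP is a hypothetical variable name used for illustration.
SET "TargetCP=65001"

REM Hide both the normal output and any error message.
CHCP %TargetCP% > NUL 2>&1

REM Assumption: CHCP returns a non-zero ERRORLEVEL on failure.
IF ERRORLEVEL 1 (
    ECHO Could not switch to code page %TargetCP%.
) ELSE (
    ECHO Switched to code page %TargetCP%.
)
ENDLOCAL
```

Note that this sketch leaves the session on the new code page if the switch succeeds.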
Basic Example: A Simple Code Page Change
This script displays the current code page, changes it, and then shows the new active code page.
@ECHO OFF
ECHO --- Changing the Code Page ---
ECHO.
ECHO The current active code page is:
CHCP
ECHO.
ECHO Now, changing to Windows Latin 1 (1252)...
CHCP 1252
ECHO.
ECHO The new active code page is:
CHCP
Output:
--- Changing the Code Page ---
The current active code page is:
Active code page: 437
Now, changing to Windows Latin 1 (1252)...
Active code page: 1252
The new active code page is:
Active code page: 1252
The Most Important Code Page: 65001 (UTF-8)
For modern scripting, the most important code page to know is 65001, which corresponds to UTF-8. UTF-8 is the standard encoding for web pages, configuration files (like JSON), and cross-platform text files. It can represent characters from virtually every language in the world.
Switching to this code page allows your batch script to correctly display and process the contents of UTF-8 files.
@ECHO OFF
ECHO --- Switching to UTF-8 to display special characters ---
CHCP 65001 > NUL
REM Now, we can display a string with special characters.
ECHO The cost is €50 for the résumé.
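For the special characters to display correctly, the .bat file itself must also be saved as UTF-8, and without a byte order mark (BOM): cmd.exe treats a BOM as part of the first command and reports an error on that line.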
How to Save and Restore the Original Code Page
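A well-behaved script should not leave the user's console session in a different state than it found it. Because a code page change outlives the script and stays in effect until the window is closed, the best practice is to save the original code page at the beginning of your script and restore it at the end.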
@ECHO OFF
SETLOCAL
ECHO --- Saving and Restoring the Code Page ---
REM --- Save the original code page ---
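REM The inner FOR strips the leading space that FOR /F leaves in front of the number.
FOR /F "tokens=2 delims=:." %%C IN ('CHCP') DO FOR %%N IN (%%C) DO SET "OriginalCP=%%N"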
ECHO Original code page is: %OriginalCP%
ECHO.
REM --- Change to UTF-8 for our script's logic ---
ECHO Changing to UTF-8 (65001)...
CHCP 65001 > NUL
ECHO The active code page is now 65001.
ECHO Script operations with special characters happen here...
ECHO.
REM --- Restore the original code page ---
ECHO Restoring original code page...
CHCP %OriginalCP% > NUL
ECHO Script finished. The code page is now back to %OriginalCP%.
ENDLOCAL
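If several scripts need the same logic, the save step can be wrapped in a small subroutine. The following is only a sketch of that idea: the label :GetCodePage and its calling convention are hypothetical, not a built-in feature, and the parsing assumes CHCP prints its number after a colon, as in the output shown earlier.

```batch
@ECHO OFF
SETLOCAL

REM Capture the active code page into the variable OriginalCP.
CALL :GetCodePage OriginalCP

CHCP 65001 > NUL
ECHO Working in UTF-8; the original code page was %OriginalCP%.

REM Restore the code page before leaving.
CHCP %OriginalCP% > NUL
ENDLOCAL
EXIT /B 0

REM :GetCodePage <VarName>
REM Hypothetical helper: stores the active code page number in <VarName>.
REM The inner FOR strips the leading space left behind by the outer FOR /F.
:GetCodePage
FOR /F "tokens=2 delims=:." %%C IN ('CHCP') DO FOR %%N IN (%%C) DO SET "%~1=%%N"
EXIT /B 0
```

Passing the variable name as an argument keeps the helper reusable, since the caller decides where the result is stored.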
Common Pitfalls and How to Solve Them
Problem: The Font Doesn't Support the Characters
Even if you change the code page to UTF-8, you might see squares (□) or question marks (?) instead of the correct characters. This is usually a font issue, not an encoding issue. The legacy "Raster Fonts" option, which was the default on older versions of Windows, has a very limited character set.
Solution: You must change the command prompt's font to one that supports a wider range of Unicode characters.
- Right-click the title bar of the cmd.exe window.
- Go to Properties.
- Go to the Font tab.
- Choose a modern TrueType font like Consolas, Lucida Console, or SimSun.
Problem: The Change is Temporary
A change made with CHCP only lasts for the current command prompt session. If you close the window, it will revert to the system's default the next time you open one.
Solution: This is the desired behavior for scripts. It ensures that your script doesn't permanently alter the user's system. If you need to change the default code page permanently, you must modify the registry (HKCU\Console).
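Before changing anything, it can help to check what is currently stored there. The read-only sketch below queries the key with the built-in REG QUERY command. It assumes the per-user default is held in a value named CodePage, which may not exist on a given system, so treat the value name as an assumption to verify.

```batch
@ECHO OFF
REM Read-only check: this does not modify the registry.
REM Assumption: the console default is stored in a value named CodePage.
REG QUERY "HKCU\Console" /v CodePage 2> NUL
IF ERRORLEVEL 1 ECHO No CodePage value is set under HKCU\Console.
```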
Practical Example: A Script to Display a UTF-8 Text File
Displaying a UTF-8 file is the most common use case for CHCP: the script needs to show the contents of a text file that was saved with UTF-8 encoding.
For example, consider this file report_utf8.txt:
Report for Café Très Bien
Status: Canceled (отменено)
Price: ¥1000
And consider this script, display_report.bat:
@ECHO OFF
SETLOCAL
REM Save original code page
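REM The inner FOR strips the leading space that FOR /F leaves in front of the number.
FOR /F "tokens=2 delims=:." %%C IN ('CHCP') DO FOR %%N IN (%%C) DO SET "OriginalCP=%%N"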
REM Switch to UTF-8
CHCP 65001 > NUL
ECHO --- Displaying UTF-8 Report File ---
ECHO.
TYPE "report_utf8.txt"
ECHO.
ECHO --- End of Report ---
REM Restore original code page
CHCP %OriginalCP% > NUL
ENDLOCAL
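The same pattern can be generalized to any file. The variant below is a sketch rather than a definitive implementation: it takes the file path as its first argument and checks that the file exists before switching code pages. The script name display_file.bat and the labels :Usage and :NotFound are hypothetical.

```batch
@ECHO OFF
SETLOCAL
REM Hypothetical usage: display_file.bat "path\to\file.txt"
IF "%~1"=="" GOTO :Usage
IF NOT EXIST "%~1" GOTO :NotFound

REM Save the original code page, then switch to UTF-8.
FOR /F "tokens=2 delims=:." %%C IN ('CHCP') DO SET "OriginalCP=%%C"
CHCP 65001 > NUL

TYPE "%~1"

REM Restore the original code page.
CHCP %OriginalCP% > NUL
ENDLOCAL
EXIT /B 0

:Usage
ECHO Usage: %~nx0 ^<file^>
EXIT /B 1

:NotFound
ECHO File not found: "%~1"
EXIT /B 1
```

For example, running display_file.bat report_utf8.txt would display the report file from the example above. Validating the argument first means the code page is only changed when there is something to display.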
Conclusion
The CHCP command is the essential tool for managing character encoding within a batch script, allowing you to work with a wide variety of international text files.
Key takeaways:
- Use CHCP to view the active code page.
- Use CHCP <number> to change it. The most important code page is 65001 for UTF-8.
- For correct display of many characters, you may need to change the console's font to a modern one like "Consolas".
- A robust script will save the original code page at the beginning and restore it before exiting.