How to Display Unicode Characters in Batch Script

By default, the Windows Command Prompt uses a legacy "Code Page" (like 437) that only supports basic ASCII and extended ASCII characters. If you try to display Unicode characters, like check marks, mathematical symbols (Σ), or non-Latin characters (like accented letters or Chinese characters), they usually appear as garbled text or empty boxes.

In this guide, we will demonstrate how to change the console's character encoding to UTF-8 using the chcp command, allowing your Batch scripts to use modern symbols and support internationalization.

The Secret: CHCP 65001

The chcp (Change Code Page) command is the key to Unicode support. 65001 is the specific Windows code page identifier for UTF-8.

Implementation Script

@echo off
setlocal

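:: 1. Save the current code page number so we can restore it later.
::    chcp prints a line such as "Active code page: 437"; the outer loop takes
::    the text after the colon and the inner loop trims the leading space.
set "oldCP="
for /F "tokens=2 delims=:" %%A in ('chcp') do (
    for /F "tokens=*" %%B in ("%%A") do set "oldCP=%%B"
)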

:: 2. Switch to UTF-8
chcp 65001 >nul

echo ==========================================
echo UNICODE SUPPORT IN BATCH
echo ==========================================
echo.
echo Symbols: ✔ ✖ ⚡ ⚠ ⚙
echo Math: Σ ∫ Δ π
echo Accents: á é í ó ú ñ
echo Blocks: █ ▓ ▒ ░
echo.

:: 3. Restore the original code page before finishing
if defined oldCP chcp %oldCP% >nul

endlocal
pause
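
The parsing step in the script above assumes that chcp prints only digits after the colon. On some localized Windows installations the line ends with a period (for example "Aktive Codepage: 850."), which would make the later restore call fail. The following is a minimal sketch of a more tolerant capture, assuming the output still contains a single colon followed by the number; it is an illustration, not the only way to do it.

```batch
@echo off
setlocal

:: Capture the text after the colon, e.g. " 437" or " 850."
set "oldCP="
for /F "tokens=2 delims=:" %%A in ('chcp') do set "oldCP=%%A"

:: Strip spaces and any trailing period so only the digits remain
if defined oldCP set "oldCP=%oldCP: =%"
if defined oldCP set "oldCP=%oldCP:.=%"

echo Original code page: %oldCP%
```

The cleaned value can then be passed to chcp %oldCP% >nul at the end of the script, exactly as in the implementation above.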

Essential Requirements for Unicode Display

Switching the code page is only half the battle. For Unicode to actually look correct on the screen, your environment must meet two other criteria:

1. The Right Font

You must use a TrueType or OpenType font in your console settings that contains the Unicode glyphs you want to display.

  • Recommended Fonts: Consolas, Lucida Console, or Cascadia Code (included with Windows Terminal).
  • Classic "Raster Fonts" will NOT work with Unicode because they only support the limited character set of the active code page.

To change the font: right-click the title bar of your cmd.exe window, select Properties, go to the Font tab, and choose a TrueType font.

2. Save Your Script File as UTF-8

When you write your Batch script in a text editor like Notepad or VS Code, you must ensure the file itself is saved with UTF-8 encoding.

  • If the file is saved as "ANSI" (the default in older versions of Notepad), the Unicode symbols inside your script will be corrupted at the byte level before the chcp command even runs.
  • In VS Code, check the encoding indicator in the bottom-right status bar. Click it to change to "UTF-8."
  • In Notepad (Windows 10+), use File → Save As and select "UTF-8" from the Encoding dropdown.

UTF-8 with or without BOM: A Byte Order Mark (BOM) is a special 3-byte prefix (EF BB BF) that some editors add to UTF-8 files. When a .bat file has a BOM, cmd.exe treats those bytes as part of the first line, so the first command (typically @echo off) may fail or print garbage characters at the very beginning of the script's output. Saving as UTF-8 without BOM is recommended for Batch scripts. VS Code saves without BOM by default. Notepad in Windows 10 version 1903 and later also saves without BOM by default, but older versions add one when you choose "UTF-8".
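
If you are unsure whether an editor added a BOM, you can inspect the first three bytes of the file. The sketch below is one possible check, assuming Windows PowerShell (powershell.exe) is available on the machine; the variable name target is hypothetical, and the script inspects itself when no path is passed as the first argument.

```batch
@echo off
setlocal

:: Hypothetical variable: file to inspect (first argument, or this script itself)
set "target=%~1"
if not defined target set "target=%~f0"

:: Resolve to a full path so the check does not depend on the working directory
for %%F in ("%target%") do set "target=%%~fF"

:: Read the file as raw bytes and compare the first three with EF BB BF
powershell -NoProfile -Command "$b = [System.IO.File]::ReadAllBytes($env:target); if ($b.Length -ge 3 -and $b[0] -eq 0xEF -and $b[1] -eq 0xBB -and $b[2] -eq 0xBF) { 'BOM found' } else { 'No BOM' }"

endlocal
```

If the check reports a BOM, re-save the file as plain "UTF-8" (without BOM) in your editor.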

Practical Uses for Unicode in Batch

  1. Success/Failure Icons: Use ✔ and ✖ instead of [OK] and [FAIL].
  2. Status Gauges: Use the block characters █ and ░ to create higher-resolution progress bars.
  3. Modern UI Elements: Use arrows (→, ←) for navigation instructions in menus.
  4. Localization: Correctly display names with accents or non-English characters in pathnames or user reports.
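
As an illustration of the first two uses, here is a minimal sketch of a progress bar built from █ and ░, followed by status lines with ✔ and ✖. The bar width and the percentage steps are arbitrary values chosen for the example, and the file must be saved as UTF-8 without BOM for the symbols to survive.

```batch
@echo off
setlocal EnableDelayedExpansion

:: Remember the current code page, then switch to UTF-8
set "oldCP="
for /F "tokens=2 delims=:" %%A in ('chcp') do (
    for /F "tokens=*" %%B in ("%%A") do set "oldCP=%%B"
)
chcp 65001 >nul

:: Hypothetical bar width in character cells
set "width=20"

:: Draw one bar per percentage step
for %%P in (0 25 50 75 100) do (
    set /a "filled=%%P*width/100"
    set "bar="
    for /L %%I in (1,1,%width%) do (
        if %%I LEQ !filled! (set "bar=!bar!█") else (set "bar=!bar!░")
    )
    echo [!bar!] %%P%%
)

echo ✔ Finished
echo ✖ Example failure line

:: Leave the console as we found it
if defined oldCP chcp %oldCP% >nul
endlocal
```

Delayed expansion (!bar!, !filled!) is used because the variables change inside the loop body; avoid literal exclamation marks in the echoed text while it is enabled.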

Troubleshooting: Why do I see boxes or question marks?

If you see □ or ??? instead of the expected symbols:

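  • Font limitation: Your chosen font doesn't contain that specific glyph. Try a font with broader Unicode coverage, such as Cascadia Code. Note that the classic console's font list normally offers only monospaced fonts, so a proportional font such as Segoe UI Symbol may not be selectable there.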
  • Editor encoding mismatch: You typed the symbols in your editor, but the editor saved the file using a different encoding (e.g., ANSI/Windows-1252 instead of UTF-8). Check your editor's encoding settings and re-save as UTF-8.
  • Code page not set: Ensure chcp 65001 is executed before any echo statements that contain Unicode characters. If the code page is still set to 437 or another legacy page when the echo runs, the bytes will be interpreted incorrectly.
  • Piping or redirection issues: When you redirect output to a file (script.bat > output.txt), the file will be encoded in UTF-8. Some legacy tools may not read UTF-8 files correctly.

Redirection caution: When you redirect the output of a script running in chcp 65001 to a text file (myscript.bat > log.txt), the resulting file will be in UTF-8. Some legacy log analysis tools or scripts that expect ANSI-encoded input may not handle this correctly. If downstream tools require a specific encoding, restore the original code page before generating output for those tools.
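
One way to follow this advice is sketched below, assuming the legacy tool reads a log file and that the original code page can represent the characters involved. The log path and the owner value are hypothetical. The non-ASCII text is read into a variable while UTF-8 is active, and the file is written only after the original code page has been restored, so cmd.exe should encode the redirected text in that legacy code page.

```batch
@echo off
setlocal

:: Save the current (legacy) code page
set "oldCP="
for /F "tokens=2 delims=:" %%A in ('chcp') do (
    for /F "tokens=*" %%B in ("%%A") do set "oldCP=%%B"
)

:: While in UTF-8, read the non-ASCII text from this file into a variable
chcp 65001 >nul
set "owner=José"
echo ✔ Report for %owner%

:: Restore the legacy code page BEFORE writing the file for the legacy tool
if defined oldCP chcp %oldCP% >nul

:: Hypothetical log path, used only for this illustration
set "logFile=%TEMP%\legacy_status.log"
>"%logFile%" echo Owner: %owner%
>>"%logFile%" echo Status: OK

endlocal
```

Characters that do not exist in the legacy code page cannot be written faithfully this way, so keep the text for such tools within that code page's repertoire.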

Summary Checklist

  1. Save the original code page: Capture the current code page number before changing it so you can restore it when the script finishes.
  2. Switch the code page: Use chcp 65001 >nul to enable UTF-8.
  3. Verify the font: Use a TrueType or OpenType font that includes the Unicode glyphs you need (Consolas, Cascadia Code, etc.).
  4. Save the script correctly: Ensure the .bat file is encoded as UTF-8 without BOM in your text editor.
  5. Restore the code page: Run chcp <original> >nul at the end of your script to leave the console in its previous state.

Conclusion

Displaying Unicode characters brings your Batch scripts into the modern era of computing. It allows you to create more intuitive, visually rich interfaces that communicate status through symbols and support a global range of languages. With a single chcp command and the right file encoding, you can move past the limitations of legacy character sets and embrace a more expressive and professional command-line experience.