How to Resolve "LookupError: unknown encoding" in Python
The LookupError: unknown encoding error in Python occurs when you try to use an encoding that Python doesn't recognize. This typically happens when opening files, encoding/decoding strings, or configuring standard input/output.
This guide explains the causes of this error and provides solutions, including using valid encodings, setting environment variables, and reconfiguring sys.stdin and sys.stdout.
Understanding the Error: Invalid Encoding
The LookupError: unknown encoding error means you've specified an encoding name that Python's codec registry doesn't know. This most commonly happens in these situations:
- Opening files: open('filename.txt', 'r', encoding='invalid-encoding')
- Encoding/decoding strings: 'my string'.encode('invalid-encoding') or b'my bytes'.decode('invalid-encoding')
- Changing the standard input/output encoding.
Example of the error:
# ⛔️ LookupError: unknown encoding: example
with open('example.txt', 'w', encoding='example') as my_file:  # 'example' is invalid
    my_file.write('first line' + '\n')
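Because LookupError is an ordinary exception, it can be caught like any other. The following is a minimal sketch of that idea; the helper name try_encoding is hypothetical and not part of Python or of the original example:

```python
def try_encoding(name):
    """Return None if `name` is usable with str.encode, else the error text."""
    try:
        "my string".encode(name)
    except LookupError as exc:
        # Raised when the codec registry has no codec under this name.
        return str(exc)
    return None


print(try_encoding("utf-8"))             # None
print(try_encoding("invalid-encoding"))  # unknown encoding: invalid-encoding
```

Catching the error this way is mostly useful when the encoding name comes from user input or a configuration file rather than from your own code.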
Using Valid Encodings
The most direct solution is to use a valid encoding. Here are some of the most common and recommended encodings:
- utf-8: The most widely used encoding for Unicode text. It can represent virtually any character from any language. This is generally the best default choice.
- utf-8-sig: Same as utf-8, but it automatically handles the BOM (Byte Order Mark) if present at the beginning of a file. Use this when reading files that might have a BOM.
- latin-1 (or iso-8859-1): A common encoding for Western European languages. It's a single-byte encoding, so it can't represent as many characters as UTF-8.
- ascii: A very basic encoding that only covers the standard English alphabet, numbers, and some punctuation. It's a subset of UTF-8. Use it only if you're certain your data contains only ASCII characters.
- utf-16 and utf-32: Other Unicode encodings, less commonly used for file I/O than UTF-8.
Corrected Code Example:
# ✅ Specify 'utf-8' encoding
with open('example.txt', 'w', encoding='utf-8') as my_file:
    my_file.write('first line' + '\n')
    my_file.write('second line' + '\n')
    my_file.write('third line' + '\n')
- This code writes the file using the utf-8 encoding.
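To make the differences between these encodings concrete, here is a small sketch that encodes the same string several ways. The sample text "café" is chosen only for illustration. Note that a valid encoding that cannot represent a character raises UnicodeEncodeError, which is a different error from LookupError:

```python
text = "café"

# UTF-8 needs two bytes for "é"; latin-1 needs one.
utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'

# utf-8-sig prepends the 3-byte BOM when encoding and strips it when decoding.
sig_bytes = text.encode("utf-8-sig")        # b'\xef\xbb\xbfcaf\xc3\xa9'
round_trip = sig_bytes.decode("utf-8-sig")  # 'café'

# "ascii" is a known encoding, so this is not a LookupError:
# the codec exists but cannot represent "é".
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

print(utf8_bytes, latin1_bytes, sig_bytes, round_trip)
print(type(ascii_error).__name__)
```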
Where to Find a List of Valid Encodings
Python supports a long list of encodings. The full table, including the accepted aliases for each one, is in the "Standard Encodings" section of the official documentation for the codecs module.
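You can also ask the codec registry directly whether it knows a name. The sketch below uses codecs.lookup() from the standard library; the wrapper function canonical_name is a hypothetical helper added for illustration:

```python
import codecs


def canonical_name(name):
    """Return the codec's canonical name, or None if the name is unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


print(canonical_name("UTF8"))     # utf-8
print(canonical_name("example"))  # None

# Aliases resolve to the same codec, so both spellings are accepted.
print(canonical_name("latin-1") == canonical_name("iso-8859-1"))  # True
```

One caveat: the registry also holds a few codecs that are not text encodings, so a successful lookup does not by itself guarantee that the name is accepted by open().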
Setting the PYTHONIOENCODING Environment Variable
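You can set the PYTHONIOENCODING environment variable to change the default encoding used for the standard input, output, and error streams (stdin, stdout, stderr). This is useful if you consistently need a specific encoding for console or piped I/O. It does not change the default encoding of open(), so files still need an explicit encoding argument. The value must itself be a valid encoding name: an unrecognized name causes the same unknown encoding error when the interpreter starts.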
- Linux/macOS:
  export PYTHONIOENCODING=utf-8
- Windows:
  setx PYTHONIOENCODING utf-8
  setx PYTHONLEGACYWINDOWSSTDIO utf-8

Note: PYTHONIOENCODING only sets the default for the standard streams. open() is not affected, so pass the encoding argument there as usual. On Windows, PYTHONIOENCODING is ignored for interactive console streams unless PYTHONLEGACYWINDOWSSTDIO is also set to a non-empty value; streams redirected to files or pipes are not affected by this restriction. Variables set with setx apply to new terminal sessions, not to the one in which the command was run.
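One way to check that the variable is picked up is to start a child interpreter with it set and ask the child what encoding its stdout uses. This is only a sketch: the function name child_stdout_encoding is hypothetical, and the child's output is captured through a pipe, so it does not exercise the Windows interactive-console case.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child Python with PYTHONIOENCODING set; return its sys.stdout.encoding."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


reported = child_stdout_encoding("latin-1")
# Compare canonical codec names, since the reported spelling may vary.
print(codecs.lookup(reported).name == codecs.lookup("latin-1").name)
```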
Reconfiguring sys.stdin, sys.stdout, and sys.stderr
In some situations, you might need to change the encoding of the standard input/output streams within your running Python script. You can do this using sys.stdin.reconfigure(), sys.stdout.reconfigure(), and sys.stderr.reconfigure() (available in Python 3.7+):
import sys
sys.stdin.reconfigure(encoding='utf-8')
sys.stdout.reconfigure(encoding='utf-8')
sys.stderr.reconfigure(encoding='utf-8')
- This code switches all three streams to UTF-8. Place it at the very beginning of your script, before any other input/output operations; in particular, the encoding of sys.stdin cannot be changed once data has been read from it. This is a relatively drastic measure and should only be used if you cannot control the environment in which your script runs.
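Some environments (IDEs, test runners, embedded interpreters) replace the standard streams with objects that have no reconfigure() method. A more defensive variant might look like the sketch below. The function reconfigure_streams and its streams parameter are hypothetical, introduced here only for illustration:

```python
import codecs
import io
import sys


def reconfigure_streams(encoding, streams=None):
    """Switch text streams to `encoding`; return how many were changed."""
    # Fail early with LookupError if the encoding name is unknown.
    codecs.lookup(encoding)
    if streams is None:
        streams = (sys.stdin, sys.stdout, sys.stderr)
    changed = 0
    for stream in streams:
        # Replaced or missing streams may not support reconfigure().
        if hasattr(stream, "reconfigure"):
            stream.reconfigure(encoding=encoding)
            changed += 1
    return changed


# Demonstration on an in-memory stream rather than the real stdout.
demo = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
print(reconfigure_streams("utf-8", [demo]), demo.encoding)
```

In a real script you would call reconfigure_streams('utf-8') with no second argument, as the first statement after the imports.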