
Python UnicodeDecodeError: 'utf-8' Codec Can't Decode — How to Fix It


UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0

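The file you’re reading isn’t UTF-8 encoded, but Python is trying to decode it as UTF-8. A 0xff byte at position 0 often points to a UTF-16 file, whose byte order mark can begin with 0xff 0xfe.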
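
To see where the message comes from, here is a minimal sketch that reproduces it without touching a file. The sample bytes are an assumption chosen for illustration: the text "hi" stored as little-endian UTF-16 behind a byte order mark.

```python
# Hypothetical sample data: "hi" as UTF-16 (little-endian) with a byte order mark
raw = b"\xff\xfe" + "hi".encode("utf-16-le")

try:
    raw.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    # 0xff can never appear in valid UTF-8, so decoding fails on the first byte
    message = str(exc)

print(message)

# Decoding with the encoding the bytes were written in works
text = raw.decode("utf-16")
print(text)
```

Passing the matching encoding to `open()` has the same effect as the final `decode()` call here.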

Fix 1: Specify the correct encoding

# ❌ No encoding given, so Python uses the platform default (UTF-8 on most Linux and macOS systems)
with open("data.csv") as f:
    content = f.read()

# ✅ Try latin-1 (covers most Western European text; it never raises, so check the result for garbled characters)
with open("data.csv", encoding="latin-1") as f:
    content = f.read()

# ✅ Or cp1252, the Windows Western European code page
with open("data.csv", encoding="cp1252") as f:
    content = f.read()
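
If you are not sure which of these encodings applies, one option is to try them in order. The sketch below is only an illustration: `read_text_with_fallback` and its default encoding list are hypothetical names, not part of the standard library. It puts latin-1 last because latin-1 accepts every byte and would hide a better match.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first encoding that decodes without error."""
    with open(path, "rb") as f:
        raw = f.read()
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError(f"none of {encodings} could decode {path}")


# Demo with a temporary file written as cp1252 (hypothetical sample content)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    with open(path, "wb") as f:
        f.write("café,5€\n".encode("cp1252"))
    text, used = read_text_with_fallback(path)

print(used, repr(text))
```

A clean decode only means no byte was rejected. It does not prove the guess was right, so it is still worth looking at the text.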

Fix 2: Detect the encoding

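pip install chardet

import chardet

with open("data.csv", "rb") as f:
    result = chardet.detect(f.read())
    print(result)  # e.g. {'encoding': 'ISO-8859-1', 'confidence': 0.73, ...}

# The result is a guess, and 'encoding' is None when chardet finds no match;
# falling back to latin-1 here is an assumption, pick what suits your data
encoding = result["encoding"] or "latin-1"
with open("data.csv", encoding=encoding) as f:
    content = f.read()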
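
Before reaching for statistical detection, it can help to check for a byte order mark, since a leading 0xff is typical of one. This is a rough standard-library sketch, not a replacement for chardet: `sniff_bom` is a hypothetical helper, and it returns `None` for the many files that carry no mark at all.

```python
import codecs

# Order matters: the UTF-32 LE mark starts with the same two bytes as UTF-16 LE
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def sniff_bom(raw):
    """Return a codec name if raw starts with a known byte order mark, else None."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            return encoding
    return None


sample = "hi".encode("utf-16")  # Python prepends a byte order mark here
guess = sniff_bom(sample)
print(guess, sample.decode(guess))
```

The returned codec names ("utf-16", "utf-32", "utf-8-sig") all consume the mark while decoding, so it does not end up in the text.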

Fix 3: Ignore or replace bad characters

# Skip undecodable bytes (this silently drops data)
with open("data.csv", encoding="utf-8", errors="ignore") as f:
    content = f.read()

# Replace bad characters with ?
with open("data.csv", encoding="utf-8", errors="replace") as f:
    content = f.read()
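
The difference between the two handlers is easier to see on a small in-memory example. The sample bytes below are an assumption for illustration: the word "naïve" encoded as latin-1 and then wrongly decoded as UTF-8.

```python
# Hypothetical sample: latin-1 bytes that are not valid UTF-8
raw = "naïve".encode("latin-1")  # b'na\xefve'

ignored = raw.decode("utf-8", errors="ignore")
replaced = raw.decode("utf-8", errors="replace")

print(repr(ignored))   # the ï is gone with no trace
print(repr(replaced))  # the ï became U+FFFD, so the damage stays visible
```

Both handlers lose the original character. `errors="replace"` at least leaves a marker you can search for afterwards.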

Fix 4: Read as binary

# If you don't need text (e.g., images, PDFs)
with open("file.bin", "rb") as f:
    data = f.read()