Python raises MemoryError when it can't allocate more memory. This usually means your script is trying to load more data into RAM than your system has available.
What causes this error
- Loading a huge file into memory at once: reading a multi-GB CSV or JSON file with open().read() or pd.read_csv() without chunking
- Creating a massive list or dictionary: generating millions of objects in a loop
- Memory leak: objects that should be garbage collected are still referenced
Fix 1: Process data in chunks
Instead of loading everything at once:
import pandas as pd
# Bad: this loads the entire file into memory
data = pd.read_csv('huge_file.csv')
# Better: process 10,000 rows at a time
for chunk in pd.read_csv('huge_file.csv', chunksize=10000):
    process(chunk)  # process() stands in for your own per-chunk logic
For plain files:
# Bad: loads the entire file into memory
content = open('huge.txt').read()
# Better: read line by line
with open('huge.txt') as f:
    for line in f:
        process(line)
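Line-by-line reading only helps when the file actually has line breaks. For binary files, or text files that are one giant line, a fixed-size read loop keeps memory bounded. The sketch below is one way to do it, using a SHA-256 checksum as the example workload; the function name `sha256_of_file` and the 1 MB chunk size are illustrative choices, not something the article prescribes.

```python
import hashlib
from functools import partial


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file while holding at most chunk_size bytes in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read(chunk_size)
        # until it returns b"", which signals end of file.
        for chunk in iter(partial(f.read, chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The same loop shape works for any task that can consume the file piece by piece, such as copying, compressing, or uploading.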
Fix 2: Use generators instead of lists
# Bad: creates a list of 100M items in memory
squares = [x**2 for x in range(100_000_000)]
# Better: a generator computes one item at a time
squares = (x**2 for x in range(100_000_000))
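Two things are worth checking before swapping a list for a generator: how much memory the container itself takes, and the fact that a generator can only be consumed once. The sketch below uses a smaller range than the example above so it runs quickly; the variable names are illustrative.

```python
import sys

numbers_list = [x**2 for x in range(100_000)]
numbers_gen = (x**2 for x in range(100_000))

# getsizeof() reports the container only, not the int objects inside it.
list_size = sys.getsizeof(numbers_list)  # grows with the number of items
gen_size = sys.getsizeof(numbers_gen)    # small and constant

# A generator is single-pass: the second sum() sees nothing left.
total = sum(numbers_gen)
leftover = sum(numbers_gen)

print(f"list: {list_size} bytes, generator: {gen_size} bytes")
print(f"first pass: {total}, second pass: {leftover}")
```

If the code needs to iterate over the data more than once or index into it, a generator is the wrong tool; recreate it for each pass or go back to chunked processing.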
Fix 3: Find the memory leak
import tracemalloc
tracemalloc.start()
# ... your code ...
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:10]:
    print(stat)
This shows which lines allocate the most memory.
If you genuinely need more RAM:
- Use a machine with more memory
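- On Linux, add swap space: create the file with sudo fallocate -l 4G /swapfile, then run chmod 600, mkswap, and swapon on it to activate it
- Use 64-bit Python (a 32-bit process can address only about 2 to 4 GB, depending on the OS)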
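To find out whether the interpreter you are running is a 32-bit or 64-bit build, you can check the pointer size from inside Python. This is a small sketch using only the standard library; the variable names are illustrative.

```python
import struct
import sys

# "P" is the struct format code for a C pointer; its size is
# 4 bytes on a 32-bit build and 8 bytes on a 64-bit build.
pointer_bits = struct.calcsize("P") * 8

# Equivalent check: sys.maxsize only exceeds 2**32 on 64-bit builds.
is_64bit = sys.maxsize > 2**32

print(f"Running {pointer_bits}-bit Python")
```

A 32-bit interpreter can hit MemoryError even on a machine with plenty of free RAM, so this is worth checking before buying more memory.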
How to prevent it
- Always use chunked reading for files over 100MB
- Prefer generators over list comprehensions for large datasets
- Profile memory usage with tracemalloc or memory_profiler during development
- Consider using numpy arrays instead of Python lists (typically several times more memory efficient for numbers, because values are stored packed rather than as individual Python objects)
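To see why packed numeric storage saves memory, the sketch below compares a Python list of integers with a typed array. It swaps in the standard library's `array` module as a dependency-free stand-in for numpy, which stores numbers in a similar packed layout; the sizes are rough estimates, not exact accounting.

```python
import sys
from array import array

n = 1_000_000
as_list = list(range(n))
as_array = array("q", range(n))  # "q" = signed 8-byte integers

# A list holds pointers, and every int is a separate object on top of that.
# This slightly overcounts, since CPython shares some small ints.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)

# A typed array stores the raw 8-byte values back to back.
array_bytes = sys.getsizeof(as_array)

print(f"list: {list_bytes / 1e6:.1f} MB, array: {array_bytes / 1e6:.1f} MB")
```

The exact ratio depends on the Python version and the values stored, so measure with your own data before relying on a particular saving.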
Related fixes: Python AttributeError: NoneType · Python TypeError · Python ValueError
Related: Pip Install Error Fix