Mar 20, 2026 · 5 min read

Last updated on May 15, 2026

Kubernetes OOMKilled — How to Fix It

State:       Terminated
Reason:      OOMKilled
Exit Code:   137

Your container exceeded its memory limit and was killed. Exit code 137 means the process received SIGKILL (128 + 9): the kernel’s Out of Memory (OOM) killer terminated it because the container’s cgroup had no memory left.

Why this happens

Every container in Kubernetes runs inside a cgroup with a memory ceiling. When the container’s memory usage hits the limits.memory value, the Linux kernel’s OOM killer immediately terminates the process. There’s no graceful shutdown — it’s an instant kill.

Common causes:

Memory leak — The application gradually consumes more memory until it hits the limit
Limit set too low — The application legitimately needs more memory than allocated
Traffic spike — More concurrent requests means more memory for buffers, connections, and in-flight data
JVM/Node.js heap misconfiguration — The runtime’s heap size exceeds the container limit
Large file processing — Loading entire files into memory instead of streaming

Step 1: Diagnose the actual memory usage

Before changing limits, understand how much memory your pod actually needs.

# Current memory usage (requires metrics-server)
kubectl top pod my-pod --containers

# Last termination state: reason, exit code, and timestamps of the previous crash
kubectl describe pod my-pod | grep -A 10 "Last State"

# Check the OOM event
kubectl get events --field-selector involvedObject.name=my-pod --sort-by='.lastTimestamp'

# Get logs from the crashed container
kubectl logs my-pod --previous

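kubectl top only reports current usage, so run it repeatedly (or graph the metric in your monitoring system) to see a trend. If memory climbs steadily over time, you likely have a memory leak. If it spikes suddenly, it’s probably a traffic or workload spike.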
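To make the leak-versus-spike distinction concrete, here is a minimal Python sketch that labels a series of memory samples. The function name, the thresholds, and the assumption that samples are evenly spaced values in MiB (for example, copied from repeated kubectl top runs) are illustrative choices, not part of any Kubernetes tooling.

```python
from statistics import median


def classify_memory_trend(samples_mib, growth_threshold=0.8, spike_factor=1.5):
    """Label evenly spaced memory samples (MiB) as 'leak', 'spike', or 'stable'.

    growth_threshold and spike_factor are illustrative defaults, not tuned values.
    """
    if len(samples_mib) < 3:
        raise ValueError("need at least 3 samples")
    # Fraction of consecutive steps in which usage went up.
    steps = [b - a for a, b in zip(samples_mib, samples_mib[1:])]
    rising = sum(1 for step in steps if step > 0) / len(steps)
    if rising >= growth_threshold and samples_mib[-1] > samples_mib[0]:
        return "leak"
    # A spike: the peak sits well above the typical (median) level.
    if max(samples_mib) > spike_factor * median(samples_mib):
        return "spike"
    return "stable"


print(classify_memory_trend([100, 110, 120, 131, 140, 152]))
print(classify_memory_trend([100, 102, 99, 101, 300, 100]))
```

Treat the label as a hint for where to look next, not as a diagnosis: a slow cache warm-up can look like a leak over a short window.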

Fix 1: Increase the memory limit

The simplest fix — give the container more memory. But don’t just double it blindly. Set it based on observed peak usage plus a 20-30% buffer.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  template:
    spec:
      containers:
      - name: my-app
        resources:
          requests:
            memory: "256Mi"   # Scheduler uses this for placement
          limits:
            memory: "512Mi"   # Hard ceiling — OOMKilled if exceeded

How to choose the right value:

Run kubectl top pod during peak traffic for a few days
Note the maximum observed usage
Set the limit to 1.3× that value (30% headroom)
Set the request to the average usage
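The sizing steps above can be sketched as a small calculation. This assumes you have collected usage samples in MiB yourself; the function name and the sample values are hypothetical.

```python
import math


def recommend_memory(samples_mib, headroom=0.30):
    """Return (request_mib, limit_mib) from observed usage samples in MiB.

    request = average usage, limit = peak usage plus `headroom` (30% by default).
    """
    if not samples_mib:
        raise ValueError("no samples")
    peak = max(samples_mib)
    average = sum(samples_mib) / len(samples_mib)
    # Round before ceil so float noise (e.g. 507.00000000000006) does not add 1 MiB.
    limit = math.ceil(round(peak * (1 + headroom), 3))
    request = math.ceil(average)
    return request, limit


request, limit = recommend_memory([210, 250, 390, 240, 260])
print(f'requests.memory: "{request}Mi"')
print(f'limits.memory:   "{limit}Mi"')
```

In practice you would probably round the results up to a convenient value before putting them in the manifest.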

Fix 2: Fix the memory leak

If memory grows continuously until OOM, you have a leak. Common sources by language:

Node.js:

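# Enable the inspector for heap snapshots. Note: kubectl exec starts a second
# copy of app.js in the pod; to inspect the live process, start it with --inspect.
kubectl exec my-pod -- node --inspect=0.0.0.0:9229 app.js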

# Or add to deployment
env:
  - name: NODE_OPTIONS
    value: "--max-old-space-size=384 --expose-gc"

Common Node.js leaks: unclosed event listeners, growing arrays/maps that are never cleared, closures holding references to large objects, unresolved promises accumulating.

Python:

# Use tracemalloc to find leaks
import tracemalloc
tracemalloc.start()
# ... run your code ...
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:10]:
    print(stat)

Common Python leaks: global lists that grow forever, caching without eviction, circular references preventing garbage collection.
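A single snapshot shows what is allocated now. To find what is growing, it often helps to compare two snapshots taken before and after some work. The sketch below uses tracemalloc's compare_to for that; handle_request and _leaky_cache are hypothetical stand-ins for your own code.

```python
import tracemalloc

_leaky_cache = []  # hypothetical stand-in for a global that grows forever


def handle_request():
    # Simulated leak: every call appends ~100 KB that is never released.
    _leaky_cache.append(bytearray(100_000))


def top_growth(work, iterations=50, limit=5):
    """Run work() repeatedly and return the source lines whose allocations grew most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(iterations):
        work()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to returns StatisticDiff entries, largest size change first.
    return after.compare_to(before, "lineno")[:limit]


for diff in top_growth(handle_request):
    print(diff)
```

The first entry should point at the line inside handle_request, which is the kind of signal to look for in a real service.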

Go:

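import _ "net/http/pprof" // registers /debug/pprof handlers on http.DefaultServeMux
// The app must also serve that mux, e.g. go http.ListenAndServe("localhost:6060", nil)
// Then: kubectl port-forward my-pod 6060:6060
// Visit: http://localhost:6060/debug/pprof/heap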

Common Go leaks: goroutines that never exit, growing slices that are never trimmed, sync.Pool misuse.

Fix 3: Configure JVM heap size (Java/Kotlin/Scala)

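The JVM sizes its heap from its own defaults or flags, not from your Kubernetes manifest. Older JVMs (before Java 10 and 8u191) read the host’s total memory rather than the cgroup limit, and even a container-aware JVM uses non-heap memory on top of the heap. If heap plus non-heap usage exceeds the container’s cgroup limit, the container is OOMKilled.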

containers:
- name: my-java-app
  resources:
    limits:
      memory: "1Gi"
  env:
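    # JAVA_OPTS only takes effect if your entrypoint passes it to java;
    # the JVM itself reads JAVA_TOOL_OPTIONS.
    - name: JAVA_OPTS
      value: "-Xmx768m -Xms512m -XX:+UseContainerSupport"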

Rules for JVM in containers:

Set -Xmx to ~75% of the container memory limit (the rest is for metaspace, thread stacks, native memory, and OS overhead)
Use -XX:+UseContainerSupport (enabled by default since Java 10, and backported to Java 8u191) so the JVM respects cgroup limits
For Java 8 builds from 8u131 up to (but not including) 8u191, use -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap

Container limit	Recommended -Xmx
512Mi	384m
1Gi	768m
2Gi	1536m
4Gi	3072m
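The table follows directly from the 75% rule. A minimal sketch that reproduces it (the function name is illustrative, and the 0.75 fraction is the rule of thumb from this article, not a JVM requirement):

```python
def heap_cap_mb(limit_mib, fraction=0.75):
    """Heap cap for -Xmx, as a fraction of the container memory limit in MiB."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(limit_mib * fraction)


for limit_mib in (512, 1024, 2048, 4096):
    print(f"{limit_mib}Mi -> -Xmx{heap_cap_mb(limit_mib)}m")
```

Applications with many threads or heavy off-heap usage may need a smaller fraction, so check non-heap usage before settling on a value.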

Fix 4: Configure Node.js memory limit

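Node.js has its own heap limit, set by V8. The default depends on the Node.js version and the memory it detects (roughly 1.5GB on older 64-bit releases), so it may not match your container limit. In a container with less memory, you must cap it.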

containers:
- name: my-node-app
  resources:
    limits:
      memory: "512Mi"
  env:
    - name: NODE_OPTIONS
      value: "--max-old-space-size=384"

Set --max-old-space-size to ~75% of the container limit (in MB). The remaining 25% covers the V8 new space, native addons, buffers, and OS overhead.
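If you generate manifests from a script, the flag can be derived from the limit string. This is a sketch under stated assumptions: quantity_to_bytes is a hypothetical helper that handles only whole-number quantities with the common suffixes, not the full Kubernetes quantity grammar.

```python
import re

# Subset of Kubernetes quantity suffixes; fractional and exponent forms are not handled.
_SUFFIXES = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def quantity_to_bytes(quantity):
    """Convert a quantity such as '512Mi' or '1Gi' to bytes."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", quantity)
    if not match or match.group(2) not in _SUFFIXES:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(match.group(1)) * _SUFFIXES[match.group(2)]


def node_options_for(limit, fraction=0.75):
    """Build a NODE_OPTIONS value that caps old space at `fraction` of the limit."""
    old_space_mb = int(quantity_to_bytes(limit) * fraction) // 2**20
    return f"--max-old-space-size={old_space_mb}"


print(node_options_for("512Mi"))
print(node_options_for("1Gi"))
```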

Fix 5: Use Guaranteed QoS class

Kubernetes has three Quality of Service classes. Setting requests equal to limits for both CPU and memory on every container gives your pod “Guaranteed” QoS, meaning it’s the last to be evicted under node memory pressure.

resources:
  requests:
    cpu: "500m"      # example value
    memory: "512Mi"
  limits:
    cpu: "500m"      # CPU must match too, or the pod is Burstable
    memory: "512Mi"  # Same as request = Guaranteed QoS

QoS classes (from most to least protected):

Guaranteed — every container sets CPU and memory requests equal to its limits
Burstable — at least one container sets a request or limit, but the pod doesn’t qualify as Guaranteed
BestEffort — no requests or limits set on any container (first to be killed)
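The classification rules can be sketched as a function over each container's resources block. This is a simplified model for illustration: it compares quantity strings literally (so "512Mi" and "0.5Gi" count as different) and ignores init containers and pod-level settings.

```python
def qos_class(containers):
    """Return the QoS class for a list of container `resources` dicts."""
    any_set = False
    guaranteed = True
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # Kubernetes defaults a missing request to the limit when a limit is set.
            request = requests.get(resource, limit)
            if request is not None or limit is not None:
                any_set = True
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


memory_only = [{"requests": {"memory": "512Mi"}, "limits": {"memory": "512Mi"}}]
cpu_and_memory = [{
    "requests": {"cpu": "500m", "memory": "512Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}]
print(qos_class(memory_only))
print(qos_class(cpu_and_memory))
print(qos_class([{}]))
```

Note that matching memory alone is not enough: without a matching CPU request and limit, the pod is classified as Burstable. You can confirm the class Kubernetes actually assigned by reading status.qosClass on the pod.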

Fix 6: Scale horizontally instead of vertically

If your app handles concurrent requests, running more pods with less memory each is often better than one pod with lots of memory.

# Scale out
kubectl scale deployment my-app --replicas=3

# Or use HPA (Horizontal Pod Autoscaler); this one scales on CPU utilization, so set CPU requests
kubectl autoscale deployment my-app --min=2 --max=10 --cpu-percent=70

This works well for stateless web servers and API services. It doesn’t help for batch jobs or single-process workloads.
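A rough way to see when scaling out helps is to model per-pod memory as a fixed baseline plus a per-request cost. The linear model and all numbers below are assumptions for illustration; measure your own service before relying on them.

```python
import math


def per_pod_peak_mib(baseline_mib, per_request_mib, concurrent_requests, replicas):
    """Estimated peak memory per pod under a simple linear model (an assumption)."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # Assume the load balancer spreads requests evenly across replicas.
    requests_per_pod = math.ceil(concurrent_requests / replicas)
    return baseline_mib + per_request_mib * requests_per_pod


# Hypothetical service: 150 MiB baseline, 2 MiB per in-flight request, 300 concurrent requests.
for replicas in (1, 3, 6):
    print(replicas, per_pod_peak_mib(150, 2, 300, replicas))
```

The baseline does not shrink as replicas are added, which is why scaling out does little for workloads whose memory is dominated by fixed state rather than in-flight requests.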

Fix 7: Optimize memory usage in your application

Before throwing more memory at the problem:

Stream large files instead of loading them entirely into memory
Use pagination for database queries instead of fetching all rows
Limit concurrency — fewer simultaneous requests = less memory
Clear caches — add TTL or LRU eviction to in-memory caches
Use connection pooling — reuse database connections instead of creating new ones
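As one example of the first item, here is a minimal sketch of streaming a file in fixed-size chunks instead of reading it whole. Hashing is just a stand-in for whatever per-chunk processing you need; the function name and chunk size are illustrative.

```python
import hashlib


def sha256_streaming(path, chunk_size=1024 * 1024):
    """Hash a file while holding at most chunk_size bytes of it in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Memory use stays near chunk_size regardless of file size, whereas open(path, "rb").read() would need the whole file in memory at once.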

Debugging: OOMKilled vs eviction

There are two ways a pod can be killed for memory:

	OOMKilled	Eviction
Trigger	Container exceeds its own limit	Node runs out of memory
Exit code	137	Varies
Visible in	`kubectl describe pod` → Reason: OOMKilled	`kubectl get events` → Evicted
Fix	Increase container limit or reduce usage	Add more nodes or reduce cluster load
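To tell the two cases apart programmatically, you can inspect the pod's status. The sketch below assumes a dict parsed from the output of kubectl get pod my-pod -o json; the function name is hypothetical, and the sample dicts are trimmed to the fields it reads.

```python
def memory_kill_reason(pod):
    """Return 'Evicted', 'OOMKilled', or None for a pod dict from `kubectl get pod -o json`."""
    status = pod.get("status", {})
    # Evicted pods carry the reason at the pod level.
    if status.get("reason") == "Evicted":
        return "Evicted"
    # OOM kills are recorded per container, in the current or previous state.
    for container in status.get("containerStatuses", []):
        for key in ("state", "lastState"):
            terminated = (container.get(key) or {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                return "OOMKilled"
    return None


oom_pod = {"status": {"containerStatuses": [
    {"name": "my-app",
     "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
]}}
evicted_pod = {"status": {"phase": "Failed", "reason": "Evicted"}}
print(memory_kill_reason(oom_pod))
print(memory_kill_reason(evicted_pod))
```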

FAQ

My pod keeps getting OOMKilled during startup. What’s wrong?

Some applications (especially JVMs) need significant memory during initialization: loading classes, warming caches, building indexes. Raise the limit to cover the startup peak. A startup probe does not prevent OOM kills, but a generous one keeps the liveness probe from restarting a slow-starting container before it’s ready.

Can I get a warning before OOMKill happens?

Not directly from Kubernetes. But you can set up Prometheus alerts on container_memory_working_set_bytes (or container_memory_usage_bytes, which also counts reclaimable page cache) approaching container_spec_memory_limit_bytes. Alert at 80% to give yourself time to react.

Does OOMKilled count against my restart policy?

Yes. Each OOMKill increments the restart count. After repeated restarts, the pod enters CrashLoopBackOff with exponential backoff delays. See CrashLoopBackOff fix.

Should I set memory limits on all containers?

Yes. Without limits, a single container can consume all node memory and cause other pods to be evicted. Always set limits, even if generous. The only exception is development/test clusters where you want maximum flexibility.

You might also like

Docker Container Exits Immediately — How to Fix It

kubectl: Connection Refused — The Connection to the Server Was Refused

Kubernetes: CrashLoopBackOff — How to Fix It

Kubernetes: ImagePullBackOff — How to Fix It