Some links in this article are affiliate links. We earn a commission at no extra cost to you when you purchase through them. Full disclosure.
Your AI app is deployed. Users are hitting it. Everything’s great — until it isn’t. Maybe your model server ran out of memory. Maybe the API you depend on is down. Maybe your SSL cert expired. And you find out when a user tweets about it.
Don’t be that developer. Set up uptime monitoring in 10 minutes and know about problems before your users do.
UptimeRobot is the go-to for this. Free tier gives you 50 monitors with 5-minute checks — more than enough for most AI apps. Let me show you how to set it up properly for AI-specific endpoints.
Why Monitoring Matters for AI Apps
AI applications have unique failure modes that traditional apps don’t:
- Model servers crash when they run out of VRAM
- Cold starts can make endpoints time out after idle periods
- API providers go down (yes, even OpenAI has outages)
- Latency spikes happen when models are loading or swapping
- Rate limits can silently degrade your app
You need monitoring that catches all of these — not just “is port 443 open.”
For patterns to handle these failures gracefully, see AI fallback patterns.
Getting Started
Create a free UptimeRobot account; no credit card is needed.
The free plan includes:
- 50 monitors
- 5-minute check intervals
- Email alerts
- 2 months of log history
- Public status pages
Step 1: Add a Basic HTTP Monitor
Start with a simple health check for your AI endpoint:
- Click Add New Monitor
- Select HTTP(S) as the monitor type
- Enter your endpoint URL (e.g., https://your-app.railway.app/health)
- Set Monitoring Interval to 5 minutes
- Click Create Monitor
UptimeRobot will ping your endpoint every 5 minutes and alert you if it returns anything other than a 2xx status code.
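Before adding the monitor, it can help to run the same kind of check by hand and see what your endpoint actually returns. The sketch below uses only the standard library; `probe` and `is_up` are illustrative names, and the 2xx rule simply mirrors the description above rather than UptimeRobot's exact internals.

```python
import time
import urllib.error
import urllib.request
from typing import Optional


def is_up(status: Optional[int]) -> bool:
    """Treat any 2xx answer as up; no status at all counts as down."""
    return status is not None and 200 <= status < 300


def probe(url: str, timeout: float = 30.0) -> dict:
    """Request `url` once and report status code, elapsed time, and up/down."""
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses arrive as exceptions but still carry a status code.
        status = exc.code
    except OSError:
        # DNS failure, refused connection, or timeout: there is no status.
        status = None
    elapsed_ms = round((time.monotonic() - started) * 1000)
    return {"up": is_up(status), "status": status, "elapsed_ms": elapsed_ms}
```

Calling `probe("https://your-app.railway.app/health")` from a machine outside your network gives a rough preview of what an external monitor will see, including how long the endpoint takes to answer.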
Step 2: Add a Keyword Monitor for AI Health
A basic HTTP check only tells you “the server responds.” For AI apps, you want to verify the model is actually loaded and working.
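A keyword check is easiest to reason about as a substring match on the raw response body, so exact spacing matters. The sketch below assumes that plain-substring behavior (verify it against your monitor's settings) and shows how the same JSON can be serialized with or without a space after the colon; `keyword_exists` is an illustrative helper, not part of any library.

```python
import json


def keyword_exists(body: str, keyword: str) -> bool:
    """Plain substring match on the response body."""
    return keyword in body


payload = {"status": "healthy", "model_loaded": True}
compact = json.dumps(payload, separators=(",", ":"))  # no spaces at all
spaced = json.dumps(payload)  # Python default: space after ':' and ','

print(keyword_exists(compact, '"model_loaded": true'))  # False
print(keyword_exists(compact, '"model_loaded":true'))   # True
print(keyword_exists(spaced, '"model_loaded": true'))   # True
```

Copying the keyword straight from a real response (for example from `curl` output) avoids guessing which form your framework produces.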
Create a keyword monitor:
- Add New Monitor → select HTTP(S)
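- URL: https://your-app.com/health (your health endpoint should return model status)
- Enable Keyword Monitoring
- Set keyword type to Keyword Exists
- Keyword: "model_loaded": true (or whatever your health endpoint returns; match the spacing exactly, because some frameworks, FastAPI included, emit compact JSON with no space after the colon)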
Your health endpoint should look something like:
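from fastapi import FastAPI
from fastapi.responses import JSONResponse

app = FastAPI()

@app.get("/health")
async def health():
    try:
        # Actually test the model; run_quick_inference is your own helper
        response = await run_quick_inference("test")
        return {"status": "healthy", "model_loaded": True, "latency_ms": response.time}
    except Exception as e:
        # A 503 makes the plain HTTP check fail as well as the keyword check
        return JSONResponse(
            status_code=503,
            content={"status": "unhealthy", "model_loaded": False, "error": str(e)},
        )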
This catches cases where your server is up but the model crashed.
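The health endpoint relies on a `run_quick_inference` helper that the article leaves to you. One possible shape is sketched below, assuming an async model client; `fake_model`, `InferenceResult`, and the 5-second timeout are placeholders I made up for illustration. The key idea is the timeout: a hung model should fail the health check instead of hanging it.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class InferenceResult:
    text: str
    time: float  # elapsed milliseconds, read as `response.time` by /health


async def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; replace with your own client."""
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


async def run_quick_inference(prompt: str, timeout_s: float = 5.0,
                              model=fake_model) -> InferenceResult:
    """Run one tiny prompt and time it; raise if the model hangs."""
    started = time.perf_counter()
    # wait_for raises asyncio.TimeoutError when the model is too slow,
    # which the /health handler's `except Exception` turns into a 503.
    text = await asyncio.wait_for(model(prompt), timeout=timeout_s)
    elapsed_ms = (time.perf_counter() - started) * 1000
    return InferenceResult(text=text, time=round(elapsed_ms, 1))
```

Keep the test prompt as short as possible so the health check costs little compute and stays well inside the monitor's timeout.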
Step 3: Monitor External AI APIs
If your app depends on external APIs (DeepSeek, OpenAI, Anthropic), monitor those too:
- Add New Monitor → HTTP(S)
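- URL: https://api.deepseek.com/models (or the provider’s status endpoint)
- Set expected status code to 200
Provider API routes usually require an API key and answer 401 without one, so confirm what an unauthenticated request returns before relying on the 200 check.
When the upstream API goes down, you’ll know within one check interval and can trigger fallback patterns or notify users.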
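If the provider route needs an API key, one option is to run the authenticated check inside your own app and expose the result on a route the monitor can reach. The sketch below is a minimal version under that assumption; `classify_upstream` and `check_upstream` are hypothetical helpers, and the status-to-label mapping is my own choice, not something any provider defines.

```python
import urllib.error
import urllib.request
from typing import Optional


def classify_upstream(status: Optional[int]) -> str:
    """Map an upstream HTTP status to a coarse health label."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "up"
    if status in (401, 403):
        return "auth_error"  # the key is wrong; the provider may be fine
    if status == 429:
        return "rate_limited"
    return "down"


def check_upstream(url: str, api_key: str, timeout: float = 10.0) -> str:
    """Call the provider with a bearer token and classify the answer."""
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return classify_upstream(resp.status)
    except urllib.error.HTTPError as exc:
        return classify_upstream(exc.code)
    except OSError:
        return "unreachable"
```

Separating `auth_error` and `rate_limited` from `down` keeps an expired key or a quota problem from looking like a provider outage.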
Step 4: Set Up Alert Contacts
Go to My Settings → Alert Contacts and add:
Email alerts (included free):
- Add your email address
- Set threshold: alert after 1 failed check (up to 5 minutes of downtime)
Slack integration (free):
- Create a Slack webhook URL in your workspace
- Add it as an alert contact in UptimeRobot
- Choose which monitors trigger Slack alerts
Webhook alerts (for custom automation):
- Add a webhook URL as an alert contact
- UptimeRobot POSTs JSON with monitor details
- Use this to trigger auto-restart scripts, page on-call, or update status
Example webhook payload handling:
from fastapi import Request

# Assumes the same FastAPI `app` as the /health example
@app.post("/webhook/uptime-alert")
async def handle_uptime_alert(request: Request):
    data = await request.json()
    monitor_name = data.get("monitorFriendlyName")
    # 1 = down, 2 = up; compare as text in case the value arrives as a string
    alert_type = str(data.get("alertType"))
    if alert_type == "1":
        # Service went down: trigger recovery (both helpers are your own code)
        await restart_model_server()
        await notify_team(f"⚠️ {monitor_name} went down, attempting recovery")
    return {"received": True}
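An automatic restart is worth guarding, since repeated or duplicate alerts could otherwise restart a server that is still loading its model. A minimal cooldown sketch follows; `RestartCooldown` and the 10-minute default are assumptions for illustration, not part of UptimeRobot or FastAPI.

```python
import time


class RestartCooldown:
    """Allow at most one recovery attempt per `cooldown_s` seconds."""

    def __init__(self, cooldown_s: float = 600.0, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock  # injectable so the logic can be tested
        self._last_attempt = None

    def allow(self) -> bool:
        now = self.clock()
        recently_tried = (
            self._last_attempt is not None
            and now - self._last_attempt < self.cooldown_s
        )
        if recently_tried:
            return False
        self._last_attempt = now
        return True
```

In the webhook handler, the restart call would then sit behind `if cooldown.allow():`, with `cooldown` created once at module level.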
Step 5: Create a Status Page
UptimeRobot lets you create a public status page — great for transparency with users:
- Go to Status Pages → Add Status Page
- Pick a custom subdomain (e.g., status.yourapp.com)
- Select which monitors to display
- Customize the look (logo, colors)
- Publish
Share this URL in your app’s footer or docs. Users check there instead of DMing you.
Step 6: Monitor Response Time
Enable response time tracking for your AI endpoints:
- Edit your monitor
- Note the Response Time graph that UptimeRobot auto-generates
- Set up an alert for slow responses: if response time > 10 seconds, alert
For AI endpoints, response time monitoring catches:
- Model swapping to CPU (sudden latency spike)
- Memory pressure (gradual slowdown)
- Cold starts after idle periods
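Your app can also track latency itself and surface a flag on the health endpoint, which gives a keyword monitor something to match. The sketch below keeps a small rolling window of recent latencies; `LatencyWindow`, the `latency_ok` field, and the window size are hypothetical, and the 10-second threshold just reuses the figure mentioned above.

```python
from collections import deque
from statistics import median


class LatencyWindow:
    """Rolling window of recent inference latencies in milliseconds."""

    def __init__(self, size: int = 20, threshold_ms: float = 10_000):
        self.samples = deque(maxlen=size)  # old samples drop off automatically
        self.threshold_ms = threshold_ms

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def report(self) -> dict:
        """Summarize the window; an empty window counts as healthy."""
        if not self.samples:
            return {"latency_ok": True, "median_ms": None}
        mid = median(self.samples)
        return {"latency_ok": mid <= self.threshold_ms, "median_ms": mid}
```

Using the median rather than the latest sample means a single cold start does not flip the flag, while a sustained slowdown (such as a model falling back to CPU) does.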
For dealing with latency in production, check handling AI latency in user-facing apps.
Recommended Monitor Setup for AI Apps
Here’s what I monitor for a typical AI application:
| Monitor | Type | Interval | Alert After |
|---|---|---|---|
| Main API health | HTTP(S) + keyword | 5 min | 1 failure |
| Model inference test | HTTP(S) + keyword | 5 min | 2 failures |
| External API (DeepSeek) | HTTP(S) | 5 min | 1 failure |
| Database | Port monitor | 5 min | 1 failure |
| SSL certificate | HTTP(S) | 24 hrs | 7 days before expiry |
That’s 5 monitors out of your 50 free — one app fully covered.
For a complete production readiness checklist, see our AI app deployment checklist.
Pro Tip: Monitor Alerting for LLMs
If you’re running your own LLM infrastructure, add these specific monitors:
# VRAM usage endpoint (add to your model server)
GET /metrics → {"vram_used_gb": 38.2, "vram_total_gb": 80.0}
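A keyword monitor matches text and cannot compare numbers, so do the comparison in the endpoint: add a boolean field (for example, a hypothetical "vram_ok" flag that turns false once vram_used_gb exceeds 90% of vram_total_gb) and point a keyword monitor at it. This warns you about memory pressure before an OOM crash happens.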
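One way to make that threshold visible to a keyword monitor is to compute it server-side. The sketch below is a minimal version, assuming a hypothetical `vram_ok` field and a 90% threshold; how you read the actual VRAM numbers depends on your stack and is left out.

```python
def vram_report(used_gb: float, total_gb: float, threshold: float = 0.90) -> dict:
    """Build the /metrics payload, including a flag a keyword check can match."""
    ratio = used_gb / total_gb
    return {
        "vram_used_gb": used_gb,
        "vram_total_gb": total_gb,
        "vram_ok": ratio < threshold,  # False once usage reaches the threshold
    }


print(vram_report(38.2, 80.0))
# {'vram_used_gb': 38.2, 'vram_total_gb': 80.0, 'vram_ok': True}
```

A keyword monitor can then require the text for a healthy flag in the response, copied exactly as your framework serializes it.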
For more on production LLM monitoring, read our LLM alerting in production guide.
Free vs Pro Plan
| Feature | Free | Pro ($7/mo) |
|---|---|---|
| Monitors | 50 | 100+ |
| Check interval | 5 min | 30 sec |
| Log history | 2 months | 12+ months |
| Status pages | 1 | Unlimited |
| SMS alerts | ❌ | ✅ |
The free plan is genuinely useful. Upgrade to Pro only when you need faster detection (30-second checks) or more monitors.
FAQ
Is 5-minute monitoring good enough for AI apps?
For most apps, yes. A 5-minute gap means worst-case, your users experience 5 minutes of downtime before you’re alerted. For mission-critical production apps, upgrade to Pro for 30-second checks. For side projects and demos, 5 minutes is fine.
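The trade-off is simple arithmetic: in the worst case an outage starts right after a successful check, so detection takes one interval per failed check required before alerting. A small sketch of that estimate (the function name is mine, and it ignores request timeouts and notification delivery time):

```python
def worst_case_detection_s(interval_s: int, failures_before_alert: int = 1) -> int:
    """Upper bound on seconds between an outage starting and the alert firing."""
    return interval_s * failures_before_alert


print(worst_case_detection_s(300))      # 300  (5-minute checks, alert on 1st failure)
print(worst_case_detection_s(300, 2))   # 600  (5-minute checks, alert after 2 failures)
print(worst_case_detection_s(30, 2))    # 60   (30-second checks, alert after 2 failures)
```

So 30-second checks with a two-failure threshold still detect an outage about five times faster than 5-minute checks that alert on the first failure.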
Can UptimeRobot monitor websocket connections?
Not directly. For websocket-based AI chat apps, monitor the HTTP health endpoint instead. If your server is healthy enough to respond to HTTP, the websocket server is likely fine too. For deep websocket monitoring, you’d need a custom solution.
How do I avoid false alerts from slow AI responses?
Set your timeout higher than your model’s typical response time. If your AI endpoint takes 5-8 seconds normally, set the monitor timeout to 30 seconds. Also use “alert after 2 consecutive failures” to avoid one-off timeout alerts.
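The "2 consecutive failures" rule can be sketched as a check over the most recent results. This is only an illustration of the idea, with `should_alert` as a made-up name; it is not how UptimeRobot implements it.

```python
def should_alert(results: list, required_failures: int = 2) -> bool:
    """True when the most recent `required_failures` checks all failed.

    `results` is ordered oldest to newest; True means the check passed.
    """
    if len(results) < required_failures:
        return False
    return not any(results[-required_failures:])


print(should_alert([True, True, False]))    # False: one slow response is ignored
print(should_alert([True, False, False]))   # True: two failures in a row
```

A single timeout from a cold start stays silent, while a real outage still alerts on the next check.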
Can I monitor self-hosted Ollama instances?
Yes, as long as the Ollama API is accessible from the internet (or you use UptimeRobot’s Pro plan with private monitoring). Point a monitor at http://your-server:11434/api/tags — it should return a JSON list of models. If it doesn’t respond, your Ollama server is down.
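To go one step further than "the server responds", you can confirm that a specific model is listed. The sketch below parses an `/api/tags` response body, which contains a `models` array whose entries have a `name` field; the sample body and model names are invented for illustration.

```python
import json


def ollama_model_names(body: str) -> list:
    """Extract model names from an Ollama /api/tags response body."""
    data = json.loads(body)
    return [model["name"] for model in data.get("models", [])]


# Invented sample; a real response carries more fields per model.
sample = '{"models": [{"name": "llama3:latest"}, {"name": "mistral:7b"}]}'
print(ollama_model_names(sample))  # ['llama3:latest', 'mistral:7b']
```

With a keyword monitor, the equivalent is to use the model name you depend on as the keyword, so the monitor fails if that model disappears from the list.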