Monitor Your AI API Uptime with UptimeRobot (Free Plan)


Some links in this article are affiliate links. We earn a commission at no extra cost to you when you purchase through them. Full disclosure.

Your AI app is deployed. Users are hitting it. Everything’s great — until it isn’t. Maybe your model server ran out of memory. Maybe the API you depend on is down. Maybe your SSL cert expired. And you find out when a user tweets about it.

Don’t be that developer. Set up uptime monitoring in 10 minutes and know about problems before your users do.

UptimeRobot is the go-to for this. Free tier gives you 50 monitors with 5-minute checks — more than enough for most AI apps. Let me show you how to set it up properly for AI-specific endpoints.

Why Monitoring Matters for AI Apps

AI applications have unique failure modes that traditional apps don’t:

  • Model servers crash when they run out of VRAM
  • Cold starts can make endpoints timeout after idle periods
  • API providers go down (yes, even OpenAI has outages)
  • Latency spikes happen when models are loading or swapping
  • Rate limits can silently degrade your app

You need monitoring that catches all of these — not just “is port 443 open.”

For patterns to handle these failures gracefully, see AI fallback patterns.

Getting Started

Create a free UptimeRobot account — no credit card needed:

Start free on UptimeRobot

The free plan includes:

  • 50 monitors
  • 5-minute check intervals
  • Email alerts
  • 2 months of log history
  • Public status pages

Step 1: Add a Basic HTTP Monitor

Start with a simple health check for your AI endpoint:

  1. Click Add New Monitor
  2. Select HTTP(S) as the monitor type
  3. Enter your endpoint URL (e.g., https://your-app.railway.app/health)
  4. Set Monitoring Interval to 5 minutes
  5. Click Create Monitor

UptimeRobot will ping your endpoint every 5 minutes and alert you if the request times out or comes back with an error status code.
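If you'd rather script this than click through the dashboard, the same monitor can probably be created through UptimeRobot's API. The sketch below only builds the request; it does not send it. The endpoint URL and the field names (`api_key`, `type`, `friendly_name`, `interval`) are my assumptions about the v2 API, and `build_new_monitor_request` is a made-up helper name, so check the current API docs before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for UptimeRobot's v2 API; verify against the official docs.
NEW_MONITOR_URL = "https://api.uptimerobot.com/v2/newMonitor"


def build_new_monitor_request(api_key: str, name: str, url: str,
                              interval_s: int = 300) -> Request:
    """Build (but do not send) a POST request that would create an HTTP(S) monitor."""
    fields = {
        "api_key": api_key,
        "format": "json",
        "type": 1,               # assumed: 1 = HTTP(S) monitor
        "friendly_name": name,
        "url": url,
        "interval": interval_s,  # seconds; 300 matches the 5-minute free-plan interval
    }
    body = urlencode(fields).encode("utf-8")
    return Request(
        NEW_MONITOR_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

To actually create the monitor you would pass the result to `urllib.request.urlopen(...)` and read the JSON reply. Keeping the build step separate makes it easy to inspect the payload first.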

Step 2: Add a Keyword Monitor for AI Health

A basic HTTP check only tells you “the server responds.” For AI apps, you want to verify the model is actually loaded and working.

Create a keyword monitor:

  1. Add New Monitor → select HTTP(S)
  2. URL: https://your-app.com/health (your health endpoint should return model status)
  3. Enable Keyword Monitoring
  4. Set keyword type to Keyword Exists
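  5. Keyword: "model_loaded":true (copy the exact text your health endpoint returns; FastAPI's default JSON output has no space after the colon, so "model_loaded": true with a space would never match)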

Your health endpoint should look something like:

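from fastapi import FastAPI
from fastapi.responses import JSONResponse

app = FastAPI()

@app.get("/health")
async def health():
    try:
        # Run a tiny real inference so the check fails if the model is not loaded.
        # run_quick_inference is your own helper; its result is assumed to expose .time (ms).
        response = await run_quick_inference("test")
        return {"status": "healthy", "model_loaded": True, "latency_ms": response.time}
    except Exception as e:
        # Returning 503 makes the plain HTTP check fail too, not just the keyword check.
        return JSONResponse(
            status_code=503,
            content={"status": "unhealthy", "model_loaded": False, "error": str(e)}
        )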

This catches cases where your server is up but the model crashed.
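One thing that bites people with keyword monitors is whitespace. A keyword check is essentially a substring match on the response body, so the keyword has to match the serialized JSON character for character. Here's a small sketch that mimics the compact separators Starlette's JSONResponse uses (to the best of my knowledge) and shows which keyword would match. `compact_json` and `keyword_matches` are illustrative helpers, not part of any library.

```python
import json


def compact_json(payload: dict) -> str:
    """Serialize without spaces after ':' or ',', like a compact JSON response."""
    return json.dumps(payload, separators=(",", ":"))


def keyword_matches(body: str, keyword: str) -> bool:
    """Model a keyword monitor as a plain substring test on the body."""
    return keyword in body


body = compact_json({"status": "healthy", "model_loaded": True, "latency_ms": 120})
print(body)                                           # {"status":"healthy","model_loaded":true,"latency_ms":120}
print(keyword_matches(body, '"model_loaded": true'))  # False (extra space)
print(keyword_matches(body, '"model_loaded":true'))   # True
```

The safest habit is to curl your health endpoint once and paste the keyword straight from the real output.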

Step 3: Monitor External AI APIs

If your app depends on external APIs (DeepSeek, OpenAI, Anthropic), monitor those too:

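  1. Add New Monitor → select HTTP(S)
  2. URL: https://api.deepseek.com/models (or the provider’s status endpoint; API routes like this usually require an API key, so an unauthenticated check may return 401 rather than 200)
  3. Set expected status code to 200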

When the upstream API goes down, you’ll know immediately and can trigger fallback patterns or notify users.
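What "trigger fallback patterns" looks like depends on your app, but the core decision is small. Here's a minimal sketch, assuming you keep a map of provider name to up/down state (fed by your alerts) and an ordered preference list. `pick_provider` and the provider names are hypothetical.

```python
from typing import Dict, List, Optional


def pick_provider(is_up: Dict[str, bool], preference: List[str]) -> Optional[str]:
    """Return the first preferred provider currently marked up, or None if all are down."""
    for name in preference:
        # Providers we have no status for are treated as down, to stay on the safe side.
        if is_up.get(name, False):
            return name
    return None


status = {"deepseek": False, "openai": True}
print(pick_provider(status, ["deepseek", "openai", "anthropic"]))  # openai
```

Returning None when everything is down lets the caller decide whether to queue the request or show users an error.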

Step 4: Set Up Alert Contacts

Go to My Settings → Alert Contacts and add:

Email alerts (included free):

  • Add your email address
  • Set threshold: alert after 1 failed check (so up to 5 minutes of downtime can pass before you hear about it)

Slack integration (free):

  1. Create a Slack webhook URL in your workspace
  2. Add it as an alert contact in UptimeRobot
  3. Choose which monitors trigger Slack alerts

Webhook alerts (for custom automation):

  1. Add a webhook URL as an alert contact
  2. UptimeRobot POSTs JSON with monitor details
  3. Use this to trigger auto-restart scripts, page on-call, or update status

Example webhook payload handling:

from fastapi import FastAPI, Request

app = FastAPI()

@app.post("/webhook/uptime-alert")
async def handle_uptime_alert(request: Request):
    data = await request.json()
    monitor_name = data.get("monitorFriendlyName")
    # alertType: 1 = down, 2 = up. Compare as a string, since it may arrive as "1" or 1.
    alert_type = str(data.get("alertType"))

    if alert_type == "1":
        # Service went down: trigger recovery.
        # restart_model_server and notify_team are your own helpers.
        await restart_model_server()
        await notify_team(f"⚠️ {monitor_name} went down, attempting recovery")

    return {"received": True}
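Depending on how the webhook is configured, alertType may show up as a number or as a string, so it helps to normalize it before branching. This is a small sketch of that idea. `alert_state` is a hypothetical helper, and the 1 = down, 2 = up mapping comes from the handler above.

```python
from typing import Any, Mapping, Optional


def alert_state(payload: Mapping[str, Any]) -> Optional[str]:
    """Map a webhook payload to "down", "up", or None when alertType is missing or unknown."""
    raw = payload.get("alertType")
    code = str(raw).strip() if raw is not None else ""
    return {"1": "down", "2": "up"}.get(code)


print(alert_state({"alertType": 1}))    # down
print(alert_state({"alertType": "2"}))  # up
print(alert_state({}))                  # None
```

Treating anything unexpected as None means a malformed payload never triggers a restart by accident.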

Step 5: Create a Status Page

UptimeRobot lets you create a public status page — great for transparency with users:

  1. Go to Status Pages → Add Status Page
  2. Pick a custom subdomain (e.g., status.yourapp.com)
  3. Select which monitors to display
  4. Customize the look (logo, colors)
  5. Publish

Share this URL in your app’s footer or docs. Users check there instead of DMing you.

Step 6: Monitor Response Time

Enable response time tracking for your AI endpoints:

  1. Edit your monitor
  2. Note the Response Time graph that UptimeRobot auto-generates
  3. Set up an alert for slow responses: if response time > 10 seconds, alert

For AI endpoints, response time monitoring catches:

  • Model swapping to CPU (sudden latency spike)
  • Memory pressure (gradual slowdown)
  • Cold starts after idle periods

For dealing with latency in production, check handling AI latency in user-facing apps.
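If you want to sanity-check latency yourself before trusting the graph, a rough probe is easy to write. The sketch below times a single GET and buckets the result using the 10-second "slow" threshold from this step and a 30-second timeout. Both function names are made up for illustration, and the thresholds are just defaults you should tune.

```python
import time
import urllib.error
import urllib.request


def classify_latency(elapsed_s: float, slow_after_s: float = 10.0,
                     timeout_s: float = 30.0) -> str:
    """Bucket a response time into "ok", "slow", or "timeout"."""
    if elapsed_s >= timeout_s:
        return "timeout"
    if elapsed_s > slow_after_s:
        return "slow"
    return "ok"


def timed_get(url: str, timeout_s: float = 30.0) -> float:
    """Return seconds taken by one GET; connection failures count as a full timeout."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            resp.read()
    except urllib.error.HTTPError:
        pass  # the server answered with an error status; the elapsed time is still meaningful
    except OSError:
        return timeout_s  # connection refused, DNS failure, or socket timeout
    return time.monotonic() - start
```

Something like `classify_latency(timed_get("https://your-app.com/health"))` run right after an idle period is a quick way to see how bad your cold starts are.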

Here’s what I monitor for a typical AI application:

| Monitor | Type | Interval | Alert After |
|---|---|---|---|
| Main API health | HTTP(S) + keyword | 5 min | 1 failure |
| Model inference test | HTTP(S) + keyword | 5 min | 2 failures |
| External API (DeepSeek) | HTTP(S) | 5 min | 1 failure |
| Database | Port monitor | 5 min | 1 failure |
| SSL certificate | HTTP(S) | 24 hrs | 7 days before expiry |

That’s 5 monitors out of your 50 free — one app fully covered.

For a complete production readiness checklist, see our AI app deployment checklist.

Pro Tip: Monitor Alerting for LLMs

If you’re running your own LLM infrastructure, add these specific monitors:

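# VRAM usage endpoint (add to your model server)
GET /metrics
{"vram_used_gb": 38.2, "vram_total_gb": 80.0}

A keyword monitor matches text and cannot compare numbers, so do the comparison on the server: have /metrics also return a flag (for example, a hypothetical "vram_ok":true field) that flips to false once vram_used_gb exceeds 90% of vram_total_gb, then alert when that keyword is missing. This warns you about memory pressure before an OOM crash happens.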
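Since a keyword monitor can only look for text, one way to approach the 90% rule is to compute it inside the model server and expose the result as a flag the monitor can match. Here's a minimal sketch. The `vram_ok` field and the `vram_report` helper are my own naming, not something UptimeRobot defines.

```python
import json


def vram_report(used_gb: float, total_gb: float, limit: float = 0.90) -> dict:
    """Build the /metrics payload, adding a flag that is False above the usage limit."""
    ratio = used_gb / total_gb
    return {
        "vram_used_gb": used_gb,
        "vram_total_gb": total_gb,
        "vram_ok": ratio <= limit,
    }


healthy = json.dumps(vram_report(38.2, 80.0), separators=(",", ":"))
pressured = json.dumps(vram_report(73.0, 80.0), separators=(",", ":"))
print('"vram_ok":true' in healthy)    # True
print('"vram_ok":true' in pressured)  # False
```

With this in place, the monitor's keyword would be "vram_ok":true, and a missing keyword means you're past the limit.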

For more on production LLM monitoring, read our LLM alerting in production guide.

Free vs Pro Plan

| Feature | Free | Pro ($7/mo) |
|---|---|---|
| Monitors | 50 | 100+ |
| Check interval | 5 min | 30 sec |
| Log history | 2 months | 12+ months |
| Status pages | 1 | Unlimited |
| SMS alerts | | |

The free plan is genuinely useful. Upgrade to Pro only when you need faster detection (30-second checks) or more monitors.

FAQ

Is 5-minute monitoring good enough for AI apps?

For most apps, yes. A 5-minute gap means worst-case, your users experience 5 minutes of downtime before you’re alerted. For mission-critical production apps, upgrade to Pro for 30-second checks. For side projects and demos, 5 minutes is fine.

Can UptimeRobot monitor websocket connections?

Not directly. For websocket-based AI chat apps, monitor the HTTP health endpoint instead. If your server is healthy enough to respond to HTTP, the websocket server is likely fine too. For deep websocket monitoring, you’d need a custom solution.

How do I avoid false alerts from slow AI responses?

Set your timeout higher than your model’s typical response time. If your AI endpoint takes 5-8 seconds normally, set the monitor timeout to 30 seconds. Also use “alert after 2 consecutive failures” to avoid one-off timeout alerts.
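The "2 consecutive failures" rule is simple enough to model, which is handy if you ever reimplement it in your own alerting. This is only a sketch of the idea, not how UptimeRobot implements it internally. `should_alert` is a hypothetical helper.

```python
from typing import List


def should_alert(results: List[bool], required_failures: int = 2) -> bool:
    """results is ordered oldest to newest, True = check passed.

    Alert only when the most recent `required_failures` checks all failed.
    """
    if len(results) < required_failures:
        return False
    return not any(results[-required_failures:])


print(should_alert([True, False]))         # False (one-off failure)
print(should_alert([True, False, False]))  # True (two in a row)
print(should_alert([False, True, False]))  # False (failures not consecutive)
```

The trade-off is detection time: at a 5-minute interval, waiting for a second failure roughly doubles how long an outage can go unnoticed.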

Can I monitor self-hosted Ollama instances?

Yes, as long as the Ollama API is accessible from the internet (or you use UptimeRobot’s Pro plan with private monitoring). Point a monitor at http://your-server:11434/api/tags — it should return a JSON list of models. If it doesn’t respond, your Ollama server is down.
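If you want to go one step beyond "does it respond", you can check that the model you expect is actually in that list. The sketch below parses a /api/tags response body, assuming it has a top-level "models" array whose entries carry a "name" field. The model names in the sample are only illustrative.

```python
import json
from typing import List


def ollama_model_names(body: str) -> List[str]:
    """Extract model names from an Ollama /api/tags JSON response body."""
    data = json.loads(body)
    return [m["name"] for m in data.get("models", []) if "name" in m]


sample = '{"models":[{"name":"llama3:latest"},{"name":"mistral:latest"}]}'
names = ollama_model_names(sample)
print(names)                     # ['llama3:latest', 'mistral:latest']
print("llama3:latest" in names)  # True
```

For a keyword monitor, the equivalent is to use the model name itself as the keyword, so the check fails if the model is ever removed.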