
AI App Deployment Checklist — From Localhost to Production


Your AI app works on localhost. Now you need to ship it. AI deployments have more failure modes than traditional software: costs scale with usage, model behavior changes unpredictably, and quality degrades silently.

This checklist catches the issues before your users do.

Pre-deployment

API & Model Configuration

  • API keys stored in environment variables, not in code
  • API keys scoped to minimum required permissions
  • Model version pinned (not “latest”) to prevent surprise behavior changes
  • Fallback model configured for when primary is down (e.g., DeepSeek as fallback for Claude)
  • max_tokens set on every request to prevent runaway generation
  • Temperature set explicitly (don’t rely on defaults)
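As a sketch of what "every knob set explicitly" looks like in practice, here is a hypothetical request builder. The model names, env-var name, and defaults are placeholders, not a specific provider's API:

```python
import os

# Hypothetical config: model IDs and the env-var name are placeholders --
# adapt them to your provider and SDK.
PRIMARY_MODEL = "claude-sonnet-4-20250514"   # pinned version, never "latest"
FALLBACK_MODEL = "deepseek-chat"             # used when the primary is down

def build_request(prompt: str, use_fallback: bool = False) -> dict:
    """Assemble request parameters with every setting made explicit."""
    api_key = os.environ["LLM_API_KEY"]      # from the environment, never hardcoded
    return {
        "model": FALLBACK_MODEL if use_fallback else PRIMARY_MODEL,
        "max_tokens": 1024,                  # cap generation on every request
        "temperature": 0.2,                  # explicit, not the provider default
        "messages": [{"role": "user", "content": prompt}],
        "api_key": api_key,
    }
```

The point is that nothing is left to a default that the provider can change out from under you.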

Cost Protection

  • Monthly spending limit set with provider
  • Per-request cost estimate logged
  • Alert at 50%, 75%, 90% of budget
  • Rate limiting on user-facing endpoints (prevent abuse)
  • Token budget per user/session
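Per-request cost logging and budget alerts can be as simple as this sketch. The per-token prices and monthly budget below are assumed values; substitute your model's actual rates:

```python
# Sketch of per-request cost accounting. Prices are placeholder values --
# look up your model's real per-token rates.
PRICE_PER_1M_INPUT = 3.00    # USD per 1M input tokens (assumed)
PRICE_PER_1M_OUTPUT = 15.00  # USD per 1M output tokens (assumed)
MONTHLY_BUDGET = 500.00      # USD (assumed)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single request, to log alongside it."""
    return (input_tokens * PRICE_PER_1M_INPUT
            + output_tokens * PRICE_PER_1M_OUTPUT) / 1_000_000

def budget_alerts(spend_so_far: float) -> list[float]:
    """Return the alert thresholds (50/75/90% of budget) already crossed."""
    return [t for t in (0.50, 0.75, 0.90) if spend_so_far >= t * MONTHLY_BUDGET]
```

Logging the estimate on every call is what lets you verify later that cost projections match actual spend.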

Security

  • Prompt injection defenses in place
  • System prompt not leakable via user input
  • User input sanitized before passing to model
  • MCP tools have least-privilege access
  • No PII in logs (or redacted before logging)
  • Red team testing completed
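Two of the items above can be illustrated with a minimal sketch. Real prompt-injection filtering and PII redaction need far more than a few regexes and string matches, but this shows the shape of the checks:

```python
import re

# Illustrative defenses only: the patterns and markers are examples,
# not a complete filter.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
INJECTION_MARKERS = ("ignore previous instructions", "reveal your system prompt")

def redact_pii(text: str) -> str:
    """Redact email addresses before the text reaches logs."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

def looks_like_injection(user_input: str) -> bool:
    """Cheap first-pass screen for obvious injection phrasing."""
    lowered = user_input.lower()
    return any(marker in lowered for marker in INJECTION_MARKERS)
```

Treat a positive screen as a signal to log and review, not necessarily to block outright.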

Quality

  • Eval dataset of 20+ test cases created
  • Baseline scores recorded for current prompt version
  • Structured output validation on model responses
  • Error handling for malformed model responses
  • Timeout handling (what happens when the API is slow?)

Deployment

Infrastructure

  • Hosting configured (Railway, Vercel, or self-hosted)
  • Environment variables set in production
  • HTTPS enabled
  • CORS configured correctly
  • Health check endpoint responding

Monitoring

  • Every LLM call logged (inputs, outputs, tokens, latency, cost)
  • Error alerting configured
  • Cost dashboard live before launch
  • Latency tracked (P50, P95, P99)

Rollback Plan

  • Previous version tagged in git
  • One-command rollback procedure documented
  • Database migrations are reversible (if applicable)
  • Feature flags for AI features (can disable without redeploy)
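The feature-flag item can be as small as an environment check, sketched here (assuming your platform lets you change env vars without a redeploy, or that the flag is read from a config service instead):

```python
import os

# Minimal kill switch: flip AI_FEATURES_ENABLED (a made-up variable name)
# to "false" to disable AI features without shipping new code.
def ai_features_enabled() -> bool:
    return os.environ.get("AI_FEATURES_ENABLED", "true").lower() == "true"
```

Guard every AI code path with this check so the non-AI parts of the app keep working when the switch is off.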

Post-deployment

First 24 hours

  • Monitor error rates (should be <1%)
  • Check cost dashboard (is spend within expected range?)
  • Review sample of real user interactions
  • Verify logging is capturing all fields
  • Check latency (P50, P95, P99)
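If your observability stack doesn't compute percentiles for you, the nearest-rank method over logged latencies is enough for a first pass. The sample values are made up:

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # nearest-rank, 1-indexed
    return ordered[rank - 1]

latencies = [120.0, 130.0, 250.0, 900.0, 140.0]  # example request latencies, ms
p50, p95, p99 = (percentile(latencies, p) for p in (50, 95, 99))
```

Watch P95/P99 rather than the average: slow-tail LLM calls are exactly what averages hide.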

First week

  • Run regression tests against production
  • Review user feedback
  • Check for prompt injection attempts in logs
  • Verify cost projections match actual spend
  • Document any issues for next deployment

The minimum viable checklist

If you’re moving fast and can only do 5 things:

  1. Pin your model version — prevents surprise behavior changes
  2. Set a spending limit — prevents bill shock
  3. Log every LLM call — you’ll need this for debugging
  4. Add rate limiting — prevents abuse
  5. Have a rollback plan — one command to go back

Everything else can be added incrementally. See our governance guide for the full production framework.

Related: LLM Observability · What to Log in AI Systems · AI Security Checklist · How to Reduce LLM API Costs · Evaluate AI Vendors (Enterprise)