🤖 AI Tools
· 3 min read

How to Deploy an AI App on Railway — Step-by-Step Guide


Railway is one of the simplest platforms for deploying AI applications. Push your code, set environment variables, get a URL. No Dockerfiles, no Kubernetes, no infrastructure management.

Here’s how to deploy an AI-powered FastAPI app from zero to production.

What you’ll deploy

A FastAPI app that calls an LLM API (Claude, GPT, or DeepSeek) and returns responses. This pattern covers chatbots, summarizers, code reviewers, and most AI features.

Step 1: Create the app

# main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import httpx
import os

app = FastAPI()

class Query(BaseModel):
    prompt: str
    max_tokens: int = 500

@app.post("/chat")
async def chat(query: Query):
    api_key = os.environ.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise HTTPException(500, "API key not configured")
    
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.anthropic.com/v1/messages",
            headers={
                "x-api-key": api_key,
                "anthropic-version": "2023-06-01",
                "content-type": "application/json",
            },
            json={
                "model": "claude-sonnet-4-5-20250514",
                "max_tokens": query.max_tokens,
                "messages": [{"role": "user", "content": query.prompt}],
            },
            timeout=30.0,
        )
    
    if response.status_code != 200:
        raise HTTPException(response.status_code, "LLM API error")
    
    data = response.json()
    return {"response": data["content"][0]["text"]}

@app.get("/health")
async def health():
    return {"status": "ok"}

# requirements.txt
fastapi
uvicorn[standard]
httpx
pydantic
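
Before pushing to GitHub, it's worth a quick local smoke test (commands assume a Unix shell and your API key exported in the environment):

```shell
# Install dependencies and start the dev server
pip install -r requirements.txt
export ANTHROPIC_API_KEY=sk-ant-...   # your real key
uvicorn main:app --reload --port 8000

# In another terminal, check the health endpoint
curl http://localhost:8000/health
# → {"status":"ok"}
```

If `/health` responds, the app is wired up correctly and the only thing left to verify on Railway is the environment variables.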

Step 2: Deploy on Railway

  1. Push your code to GitHub
  2. Go to railway.app and sign in with GitHub
  3. Click “New Project” > “Deploy from GitHub repo”
  4. Select your repository
  5. Railway auto-detects Python and deploys
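
If you prefer the terminal, the Railway CLI covers the same flow (this assumes you've installed the CLI, e.g. via npm or Homebrew; exact flags may differ between CLI versions):

```shell
railway login   # opens a browser for authentication
railway init    # create/link a project for this directory
railway up      # build and deploy the current directory
```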

Step 3: Set environment variables

In the Railway dashboard, go to your service > Variables:

ANTHROPIC_API_KEY=sk-ant-...
PORT=8000

Railway sets PORT automatically, but you can override it. Add your LLM API key here, never in code.
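
In the app itself, read both values from the environment and fail fast at startup if the key is missing, rather than on the first request. A minimal sketch:

```python
import os

def load_config(env=os.environ):
    """Read deployment config; fail fast if the LLM key is missing."""
    port = int(env.get("PORT", "8000"))  # Railway injects PORT
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return {"port": port, "api_key": api_key}
```

A missing key then surfaces as a crashed deploy in the Railway logs, which is much easier to diagnose than intermittent 500s.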

Step 4: Configure the start command

Railway usually auto-detects this, but if needed, set it in a Procfile:

web: uvicorn main:app --host 0.0.0.0 --port $PORT

Step 5: Add a custom domain

In Railway dashboard > Settings > Networking > Custom Domain. Point your DNS CNAME to the Railway-provided domain.

Step 6: Test it

curl -X POST https://your-app.railway.app/chat \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Explain Docker in one sentence"}'

Cost

Railway’s Pro plan is $5/month + usage. A typical AI app serving 1,000 requests/day costs $5-15/month on Railway (excluding LLM API costs). See our cost optimization guide for managing the LLM spend.
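
As a sanity check on the LLM side of the bill, here's a back-of-envelope estimate. The token counts and per-token prices below are illustrative assumptions, not current rates:

```python
requests_per_day = 1000
avg_input_tokens = 200          # assumed per request
avg_output_tokens = 300         # assumed per request
price_in = 3.00 / 1_000_000     # $ per input token (illustrative)
price_out = 15.00 / 1_000_000   # $ per output token (illustrative)

cost_per_request = avg_input_tokens * price_in + avg_output_tokens * price_out
monthly_llm_cost = cost_per_request * requests_per_day * 30
print(f"${monthly_llm_cost:.2f}/month")  # ≈ $153/month at these assumptions
```

Note how the LLM spend dwarfs the $5-15 hosting bill — which is why the caching and cost-optimization sections below matter.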

Common issues and fixes

“No start command found”

Railway can’t detect how to start your app. Add a Procfile:

web: uvicorn main:app --host 0.0.0.0 --port $PORT

“Port already in use”

Always use $PORT from the environment, never hardcode:

import os
port = int(os.environ.get("PORT", 8000))

“Build failed: pip install error”

Pin your Python version with a runtime.txt:

python-3.11.9

“Request timeout”

LLM API calls can be slow. Increase your timeout and add streaming:

@app.post("/chat/stream")
async def chat_stream(query: Query):
    async def generate():
        async with httpx.AsyncClient() as client:
            async with client.stream("POST", 
                "https://api.anthropic.com/v1/messages",
                headers=headers,
                json={**payload, "stream": True},
                timeout=60.0,
            ) as response:
                async for chunk in response.aiter_text():
                    yield chunk
    
    return StreamingResponse(generate(), media_type="text/event-stream")

Adding a database

Railway makes adding Postgres trivial:

  1. In your project, click “New” > “Database” > “PostgreSQL”
  2. Railway auto-creates DATABASE_URL environment variable
  3. Use it in your app:

import os
DATABASE_URL = os.environ.get("DATABASE_URL")

This is useful for storing conversation history, user preferences, or caching LLM responses to reduce API costs.
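
Caching in particular pays for itself quickly. Before reaching for Postgres, the core idea fits in a few lines with an in-memory dict (a sketch; a real app would add a TTL and persist entries to the database):

```python
import hashlib
import json

_cache: dict[str, str] = {}

def cache_key(model: str, prompt: str, max_tokens: int) -> str:
    """Stable key over everything that affects the completion."""
    raw = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens},
        sort_keys=True,
    )
    return hashlib.sha256(raw.encode()).hexdigest()

def cached_completion(model, prompt, max_tokens, call_llm):
    """call_llm is whatever function actually hits the LLM API."""
    key = cache_key(model, prompt, max_tokens)
    if key not in _cache:
        _cache[key] = call_llm(prompt)
    return _cache[key]
```

Identical prompts now hit the API once; every repeat is served from memory for free.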

Scaling

Railway auto-scales based on traffic. For AI apps, the bottleneck is usually the LLM API, not your server. But if you need more control:

  • Horizontal scaling — Railway supports multiple replicas
  • Region selection — deploy closer to your users or your LLM API provider
  • Resource limits — set memory and CPU limits to control costs

Adding production features

Once deployed, add these incrementally:

  • Rate limiting — use slowapi middleware to prevent abuse
  • Logging — log every LLM call with tokens, latency, cost
  • Caching — cache identical prompts to reduce API calls
  • Authentication — add API key auth for your endpoints
  • Monitoring — connect Helicone as a proxy for automatic LLM observability

Alternatives

| Platform    | Best for                           | Pricing             |
| ----------- | ---------------------------------- | ------------------- |
| Railway     | Simplest deploy, good for AI apps  | $5/mo + usage       |
| Vercel      | Frontend + serverless functions    | Free tier available |
| Render      | Similar to Railway, free tier      | Free tier available |
| Cloudways   | Managed cloud hosting (AWS/GCP/DO) | From $14/mo         |
| Self-hosted | Full control, cheapest at scale    | VPS cost only       |

For the full deployment checklist, see our AI app deployment checklist.

Related: AI App Deployment Checklist · How to Reduce LLM API Costs · Self-Hosted AI for Enterprise · LLM Observability