Railway is one of the simplest platforms for deploying AI applications. Push your code, set environment variables, get a URL. No Dockerfiles, no Kubernetes, no infrastructure management.
Here’s how to deploy an AI-powered FastAPI app from zero to production.
## What you’ll deploy
A FastAPI app that calls an LLM API (Claude, GPT, or DeepSeek) and returns responses. This pattern covers chatbots, summarizers, code reviewers, and most AI features.
## Step 1: Create the app
```python
# main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import httpx
import os

app = FastAPI()

class Query(BaseModel):
    prompt: str
    max_tokens: int = 500

@app.post("/chat")
async def chat(query: Query):
    api_key = os.environ.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise HTTPException(500, "API key not configured")

    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.anthropic.com/v1/messages",
            headers={
                "x-api-key": api_key,
                "anthropic-version": "2023-06-01",
                "content-type": "application/json",
            },
            json={
                "model": "claude-sonnet-4-5-20250514",
                "max_tokens": query.max_tokens,
                "messages": [{"role": "user", "content": query.prompt}],
            },
            timeout=30.0,
        )

    if response.status_code != 200:
        raise HTTPException(response.status_code, "LLM API error")

    data = response.json()
    return {"response": data["content"][0]["text"]}

@app.get("/health")
async def health():
    return {"status": "ok"}
```
```
# requirements.txt
fastapi
uvicorn[standard]
httpx
pydantic
```
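Before deploying, it’s worth verifying the app runs locally. Assuming the two files above sit in the same directory, a quick smoke test looks like this (use your real key in place of the placeholder):

```shell
pip install -r requirements.txt
export ANTHROPIC_API_KEY=sk-ant-...   # placeholder, use your real key
uvicorn main:app --reload --port 8000
# then in another terminal: curl http://localhost:8000/health
```

If `/health` returns `{"status": "ok"}`, you’re ready to deploy.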
## Step 2: Deploy on Railway
- Push your code to GitHub
- Go to railway.app and sign in with GitHub
- Click “New Project” > “Deploy from GitHub repo”
- Select your repository
- Railway auto-detects Python and deploys
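If you prefer the terminal, the Railway CLI covers the same flow. This sketch assumes the CLI is installed (e.g. via `npm i -g @railway/cli`):

```shell
railway login     # authenticate in the browser
railway init      # link this directory to a new Railway project
railway up        # build and deploy from the current directory
```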
## Step 3: Set environment variables
In the Railway dashboard, go to your service > Variables:
```
ANTHROPIC_API_KEY=sk-ant-...
PORT=8000
```

Railway sets `PORT` automatically, so you only need to add it if you want to override the default. Your LLM API key goes here, never in code.
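One refinement worth making: fail fast at startup when a required variable is missing, instead of returning 500s at request time. A minimal sketch (the `require_env` helper name is ours, not part of FastAPI or Railway):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail loudly at startup."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# In main.py, at import time, so a misconfigured deploy crashes immediately:
# ANTHROPIC_API_KEY = require_env("ANTHROPIC_API_KEY")
```

A crash at boot shows up in Railway’s deploy logs right away, which is far easier to diagnose than intermittent 500s.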
## Step 4: Configure the start command
Railway usually auto-detects this, but if needed, set it in a Procfile:
```
web: uvicorn main:app --host 0.0.0.0 --port $PORT
```
## Step 5: Add a custom domain
In Railway dashboard > Settings > Networking > Custom Domain. Point your DNS CNAME to the Railway-provided domain.
## Step 6: Test it
```shell
curl -X POST https://your-app.railway.app/chat \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Explain Docker in one sentence"}'
```
## Cost
Railway’s Pro plan is $5/month + usage. A typical AI app serving 1,000 requests/day costs $5-15/month on Railway (excluding LLM API costs). See our cost optimization guide for managing the LLM spend.
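The LLM side of the bill is easy to estimate up front. The helper below is ours, and the per-token rates in the example are purely illustrative — substitute your provider’s current pricing:

```python
def monthly_llm_cost(requests_per_day: int,
                     input_tokens: int, output_tokens: int,
                     price_in_per_m: float, price_out_per_m: float) -> float:
    """Estimate monthly LLM API spend in dollars.

    Prices are per million tokens; plug in your provider's current rates.
    """
    per_request = (input_tokens * price_in_per_m
                   + output_tokens * price_out_per_m) / 1_000_000
    return per_request * requests_per_day * 30

# Illustrative rates of $3/M input and $15/M output tokens,
# 1,000 requests/day at 200 input + 500 output tokens each:
# monthly_llm_cost(1000, 200, 500, 3.0, 15.0) → 243.0
```

At that volume the LLM spend dwarfs the hosting bill, which is why caching and prompt trimming usually pay off first.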
## Common issues and fixes
### “No start command found”

Railway can’t detect how to start your app. Add a Procfile:

```
web: uvicorn main:app --host 0.0.0.0 --port $PORT
```
### “Port already in use”

Always use `$PORT` from the environment, never hardcode:

```python
import os

port = int(os.environ.get("PORT", 8000))
```
### “Build failed: pip install error”

Pin your Python version with a runtime.txt:

```
python-3.11.9
```
### “Request timeout”

LLM API calls can be slow. Increase your timeout and stream the response instead of waiting for the full completion:

```python
from fastapi.responses import StreamingResponse

@app.post("/chat/stream")
async def chat_stream(query: Query):
    # Same headers and payload as the /chat endpoint above
    headers = {
        "x-api-key": os.environ.get("ANTHROPIC_API_KEY"),
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    payload = {
        "model": "claude-sonnet-4-5-20250514",
        "max_tokens": query.max_tokens,
        "messages": [{"role": "user", "content": query.prompt}],
    }

    async def generate():
        async with httpx.AsyncClient() as client:
            async with client.stream(
                "POST",
                "https://api.anthropic.com/v1/messages",
                headers=headers,
                json={**payload, "stream": True},
                timeout=60.0,
            ) as response:
                async for chunk in response.aiter_text():
                    yield chunk

    return StreamingResponse(generate(), media_type="text/event-stream")
```
## Adding a database
Railway makes adding Postgres trivial:
- In your project, click “New” > “Database” > “PostgreSQL”
- Railway auto-creates a `DATABASE_URL` environment variable
- Use it in your app:

```python
import os

DATABASE_URL = os.environ.get("DATABASE_URL")
```
This is useful for storing conversation history, user preferences, or caching LLM responses to reduce API costs.
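Response caching is the simplest of those to sketch. The example below uses sqlite3 as a local stand-in so it stays self-contained; on Railway you’d point a Postgres driver such as `asyncpg` at `DATABASE_URL` instead. The table and function names here are our own:

```python
import hashlib
import sqlite3

# In-memory sqlite as a stand-in for Railway's Postgres
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS llm_cache "
    "(prompt_hash TEXT PRIMARY KEY, response TEXT)"
)

def cache_key(prompt: str) -> str:
    """Hash the prompt so arbitrary-length text keys a fixed-size column."""
    return hashlib.sha256(prompt.encode()).hexdigest()

def get_cached(prompt: str):
    row = conn.execute(
        "SELECT response FROM llm_cache WHERE prompt_hash = ?",
        (cache_key(prompt),),
    ).fetchone()
    return row[0] if row else None

def put_cached(prompt: str, response: str) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO llm_cache VALUES (?, ?)",
        (cache_key(prompt), response),
    )
    conn.commit()
```

In the `/chat` handler you’d check `get_cached(query.prompt)` before calling the LLM API and `put_cached` after, so identical prompts cost one API call instead of many.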
## Scaling
Railway auto-scales based on traffic. For AI apps, the bottleneck is usually the LLM API, not your server. But if you need more control:
- Horizontal scaling — Railway supports multiple replicas
- Region selection — deploy closer to your users or your LLM API provider
- Resource limits — set memory and CPU limits to control costs
## Adding production features
Once deployed, add these incrementally:
- Rate limiting — use `slowapi` middleware to prevent abuse
- Logging — log every LLM call with tokens, latency, and cost
- Caching — cache identical prompts to reduce API calls
- Authentication — add API key auth for your endpoints
- Monitoring — connect Helicone as a proxy for automatic LLM observability
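Rate limiting is worth a closer look. `slowapi` handles it as middleware, but the underlying idea is a token bucket per client. A dependency-free sketch (class name and parameters are ours) looks like this:

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests, refilling at `rate` tokens/sec."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.buckets = {}  # key -> (tokens_remaining, last_timestamp)

    def allow(self, key: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(key, (self.capacity, now))
        # Refill proportionally to elapsed time, capped at capacity
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[key] = (tokens - 1.0, now)
            return True
        self.buckets[key] = (tokens, now)
        return False
```

In FastAPI you’d call `allow()` with the client IP inside a dependency and raise `HTTPException(429)` when it returns `False`; in production, `slowapi` gives you the same behavior without maintaining this code yourself.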
## Alternatives
| Platform | Best for | Pricing |
|---|---|---|
| Railway | Simplest deploy, good for AI apps | $5/mo + usage |
| Vercel | Frontend + serverless functions | Free tier available |
| Render | Similar to Railway, free tier | Free tier available |
| Cloudways | Managed cloud hosting (AWS/GCP/DO) | From $14/mo |
| Self-hosted | Full control, cheapest at scale | VPS cost only |
For the full deployment checklist, see our AI app deployment checklist.
Related: AI App Deployment Checklist · How to Reduce LLM API Costs · Self-Hosted AI for Enterprise · LLM Observability