
How to Monitor and Control AI API Spending: Stop the Surprise Bills


AI API costs can spiral from $50 to $5,000 in a weekend if an agent loops or traffic spikes. Here's how to prevent surprise bills.

1. Set hard spending limits

Most providers offer a spending cap, a budget alert, or a prepaid balance. Set one up BEFORE you start:

| Provider | How to set limit |
|---|---|
| OpenAI | Settings → Billing → Usage limits → Set hard cap |
| Anthropic | Console → Plans → Set monthly spend limit |
| OpenRouter | Settings → Credit limit (auto-stops at $0) |
| Google AI | Cloud Console → Budgets & alerts |
| DeepSeek | Top up fixed amount, no auto-recharge |

Rule: Set your hard cap at 2x your expected monthly spend. If you expect $50/month, cap at $100.
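
The rule is simple arithmetic, and it can be worth writing down next to your alert levels. A minimal sketch, assuming a hypothetical helper named `budget_plan` and alert levels at 50%, 75%, and 90% of the expected spend (both the name and the defaults are illustrative, not a provider API):

```python
def budget_plan(expected_monthly_spend, cap_multiplier=2.0, alert_fractions=(0.5, 0.75, 0.9)):
    """Return the hard cap and alert levels (in dollars) for a monthly budget."""
    hard_cap = expected_monthly_spend * cap_multiplier
    # Alert levels are fractions of the expected spend, not of the hard cap
    alerts = [round(expected_monthly_spend * fraction, 2) for fraction in alert_fractions]
    return {"hard_cap": hard_cap, "alerts": alerts}

plan = budget_plan(50)
print(plan)
```

For an expected $50/month this gives a $100 hard cap, with alerts well below it, so you hear about a problem long before the cap stops your traffic.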

2. Set up alerts

Don't wait for the bill. Get notified at 50%, 75%, and 90% of your budget:

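# Simple spending tracker
import json
from datetime import datetime
from pathlib import Path

DAILY_BUDGET = 10.00  # dollars
LEDGER_PATH = Path("spend_ledger.json")
ALERT_THRESHOLDS = (0.50, 0.75, 0.90)  # fractions of the daily budget

def send_alert(message):
    # Placeholder: swap in a Discord/Slack webhook call
    print(message)

def track_spend(tokens_used, model, cost_per_1m):
    cost = (tokens_used / 1_000_000) * cost_per_1m

    today = datetime.now().strftime("%Y-%m-%d")
    # Start with an empty ledger if the file does not exist yet
    ledger = json.loads(LEDGER_PATH.read_text()) if LEDGER_PATH.exists() else {}
    before = ledger.get(today, 0.0)
    ledger[today] = before + cost
    LEDGER_PATH.write_text(json.dumps(ledger))

    # Alert once per threshold, at the moment spend crosses it
    for fraction in ALERT_THRESHOLDS:
        limit = DAILY_BUDGET * fraction
        if before < limit <= ledger[today]:
            send_alert(f"⚠️ AI spend at ${ledger[today]:.2f} today ({fraction:.0%} of budget, last model: {model})")
    if ledger[today] > DAILY_BUDGET:
        send_alert(f"🚨 AI spend OVER BUDGET: ${ledger[today]:.2f}")
        raise RuntimeError("Daily budget exceeded")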

For our AI race, we added OpenRouter budget detection to the orchestrator: it sends a Discord alert when agents run out of credits.
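
One possible way to wire up a `send_alert` function is a Discord webhook, which accepts a JSON body with a `content` field. This is a sketch using only the standard library, not the orchestrator's actual code; `DISCORD_WEBHOOK_URL` is a placeholder you would replace with your own webhook URL, and `build_alert_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Placeholder: paste the webhook URL Discord generates for your channel
DISCORD_WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"

def build_alert_request(message, webhook_url=DISCORD_WEBHOOK_URL):
    """Build (but do not send) the POST request for a Discord webhook."""
    payload = json.dumps({"content": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            # Assumption: some services reject urllib's default user agent
            "User-Agent": "spend-alert/1.0",
        },
        method="POST",
    )

def send_alert(message, webhook_url=DISCORD_WEBHOOK_URL):
    """Send the alert and return the HTTP status; network errors propagate."""
    request = build_alert_request(message, webhook_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping the request construction separate from the network call makes the payload easy to check without posting to a real channel.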

3. Use prepaid credits (not auto-billing)

The safest approach: buy a fixed amount of credits and don't enable auto-recharge.

  • OpenRouter: Buy $25 credits. When they're gone, the API stops. No surprise bills.
  • DeepSeek: Same approach. Top up a fixed amount.
  • OpenAI/Anthropic: These auto-charge by default. Disable auto-recharge and set hard caps.
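
If a provider only offers auto-billing, you can imitate prepaid behavior inside your own application. The sketch below is an assumption about how that might look: `PrepaidBalance` is a hypothetical class, and it only protects calls that go through it, so it complements the provider-side cap rather than replacing it.

```python
class PrepaidBalance:
    """Local stand-in for prepaid credits: refuse work once the balance is spent."""

    def __init__(self, credits):
        self.remaining = float(credits)

    def charge(self, cost):
        # Refuse before deducting, so the balance never goes negative
        if cost > self.remaining:
            raise RuntimeError(
                f"Prepaid balance exhausted: ${self.remaining:.2f} left, need ${cost:.2f}"
            )
        self.remaining -= cost
        return self.remaining

balance = PrepaidBalance(25.00)
balance.charge(10.00)
balance.charge(10.00)
try:
    balance.charge(10.00)  # only $5 left, so this is refused
    blocked = False
except RuntimeError:
    blocked = True
```

Call `charge` with an estimated cost before each request, or with the actual cost afterwards if you can tolerate overshooting by one request.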

4. Log everything

Track per-request costs so you know where money goes:

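import logging

# Without this, INFO records are dropped by the default WARNING level
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ai_costs")

def log_request(model, input_tokens, output_tokens, cost):
    # key=value pairs keep the log easy to grep and aggregate later
    logger.info(f"model={model} in={input_tokens} out={output_tokens} cost=${cost:.4f}")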

After a week, you'll know exactly which features/endpoints consume the most tokens.
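
To turn a week of log lines into that answer, you can total them per model. A rough sketch, assuming the `model=... in=... out=... cost=$...` line format used above; the `summarize` helper and the sample model names are made up for illustration. To break costs down by feature or endpoint, you would need to add such a field to each log line first.

```python
import re
from collections import defaultdict

LINE_PATTERN = re.compile(r"model=(\S+) in=(\d+) out=(\d+) cost=\$([\d.]+)")

def summarize(log_lines):
    """Total tokens and cost per model, most expensive first."""
    totals = defaultdict(lambda: {"tokens": 0, "cost": 0.0})
    for line in log_lines:
        match = LINE_PATTERN.search(line)
        if not match:
            continue  # skip lines that are not cost records
        model, tokens_in, tokens_out, cost = match.groups()
        totals[model]["tokens"] += int(tokens_in) + int(tokens_out)
        totals[model]["cost"] += float(cost)
    return sorted(totals.items(), key=lambda item: item[1]["cost"], reverse=True)

# Made-up sample records in the format written by the logger
sample = [
    "INFO:ai_costs:model=model-a in=1200 out=300 cost=$0.0105",
    "INFO:ai_costs:model=model-b in=800 out=200 cost=$0.0010",
    "INFO:ai_costs:model=model-a in=2000 out=500 cost=$0.0175",
]
for model, stats in summarize(sample):
    print(f"{model}: {stats['tokens']} tokens, ${stats['cost']:.4f}")
```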

5. Circuit breakers for agents

AI agents can loop, retrying the same failed task hundreds of times. Add circuit breakers:

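MAX_RETRIES = 3
MAX_TOKENS_PER_SESSION = 500_000

def run_agent(task, attempt):
    # attempt(task) is a placeholder for your agent call; it returns (succeeded, tokens_used)
    retry_count = 0
    session_tokens = 0
    while True:
        succeeded, tokens_used = attempt(task)
        session_tokens += tokens_used
        if succeeded:
            return True
        retry_count += 1
        if retry_count > MAX_RETRIES:
            print("Agent stuck in loop, stopping")
            break
        if session_tokens > MAX_TOKENS_PER_SESSION:
            print("Token budget exceeded for this session")
            break
    return False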

Our race orchestrator has a 3-consecutive-failure guard that stops agents from burning budget on repeated failures.
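
A consecutive-failure guard of this kind can be very small. The following is a sketch of the idea, not the orchestrator's actual code; the class name `ConsecutiveFailureGuard` and its interface are assumptions made for illustration.

```python
class ConsecutiveFailureGuard:
    """Trip after `limit` failures in a row; any success resets the count."""

    def __init__(self, limit=3):
        self.limit = limit
        self.failures = 0

    def record(self, succeeded):
        """Record one outcome and return True if the agent should stop."""
        self.failures = 0 if succeeded else self.failures + 1
        return self.failures >= self.limit

guard = ConsecutiveFailureGuard(limit=3)
outcomes = [False, False, True, False, False, False, True]
stopped_at = None
for step, ok in enumerate(outcomes):
    if guard.record(ok):
        stopped_at = step  # third failure in a row
        break
```

Counting consecutive failures rather than total failures lets an agent recover from occasional errors while still stopping a genuine loop quickly.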

The monitoring stack

| Need | Free option | Paid option |
|---|---|---|
| Spending alerts | Custom script + Discord/Slack | Helicone, LangSmith |
| Usage dashboard | Provider dashboards | Helicone ($20/mo) |
| Per-request logging | Custom middleware | LangSmith, Portkey |
| Budget enforcement | Hard caps + prepaid | Portkey, custom proxy |

For most teams, provider dashboards, hard caps, and a simple logging script are enough. You don't need a $200/month observability platform to track $50/month in API costs.
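
The "custom middleware" option can be as light as a decorator around your API call. This is a sketch under stated assumptions: the prices in `PRICES_PER_1M` are invented, `example-model` is not a real model name, and `fake_completion` stands in for a real client call that returns the text plus input and output token counts.

```python
import functools
import logging

logger = logging.getLogger("ai_costs")

# Hypothetical prices in dollars per 1M tokens: (input, output). Check your provider's price list.
PRICES_PER_1M = {"example-model": (1.00, 3.00)}

def request_cost(model, input_tokens, output_tokens):
    """Dollar cost of one request, pricing input and output tokens separately."""
    price_in, price_out = PRICES_PER_1M[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

def with_cost_logging(call):
    """Wrap a function that returns (text, input_tokens, output_tokens)."""
    @functools.wraps(call)
    def wrapper(model, prompt):
        text, input_tokens, output_tokens = call(model, prompt)
        cost = request_cost(model, input_tokens, output_tokens)
        logger.info(f"model={model} in={input_tokens} out={output_tokens} cost=${cost:.4f}")
        return text, cost
    return wrapper

@with_cost_logging
def fake_completion(model, prompt):
    # Stand-in for a real API call so the sketch runs offline
    return "ok", 1000, 500

text, cost = fake_completion("example-model", "hello")
```

Because every call passes through the wrapper, the cost log stays complete without touching the code that builds prompts.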
