GPT-5 Complete Guide: Models, Pricing, Benchmarks, and API Setup (2026)
GPT-5 launched in August 2025 as OpenAI's biggest leap since GPT-4. GPT-5.4, released March 2026, is the current flagship and the first AI model to exceed human performance on desktop computer tasks (75% on OSWorld vs a 72.4% human baseline). It handles up to 1 million tokens of context and can natively control software.
Here's everything you need to know about the GPT-5 family.
The GPT-5 model family
| Model | Context | Input price | Output price | Best for |
|---|---|---|---|---|
| GPT-5.4 | 1M tokens | $2.50/1M | $15.00/1M | Complex reasoning, coding, computer use |
| GPT-5.4 Mini | 400K tokens | $0.75/1M | $4.50/1M | Fast tasks, high throughput, cost-sensitive |
| GPT-5.4 Pro | 1M tokens | $30.00/1M | $180.00/1M | Maximum quality, research, hard problems |
| GPT-5 Mini | 128K tokens | $0.25/1M | $2.00/1M | Budget option, simple tasks |
Pricing note: Input tokens above 272K are charged at 2x the standard rate. Cached input tokens get a 75-90% discount.
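To see how the surcharge and the cache discount interact, the sketch below estimates the cost of a single GPT-5.4 request from the table prices. It rests on assumptions the pricing note does not confirm: that the 2x rate applies only to the input tokens beyond 272K (not to the whole prompt), and that cached tokens sit in the standard-rate portion. The cache discount is a parameter because the note gives only a 75-90% range. The function name and parameters are hypothetical.

```python
# Prices from the table above, in USD per 1M tokens (GPT-5.4).
INPUT_PRICE = 2.50
OUTPUT_PRICE = 15.00
LONG_CONTEXT_THRESHOLD = 272_000  # input tokens above this are billed at 2x


def estimate_cost(input_tokens, output_tokens, cached_tokens=0, cache_discount=0.75):
    """Estimate the USD cost of one request.

    Assumptions (not confirmed by the pricing note):
    - only the tokens beyond the threshold are billed at 2x;
    - cached tokens are discounted only within the standard-rate portion.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    standard = min(input_tokens, LONG_CONTEXT_THRESHOLD)
    surcharged = input_tokens - standard
    cached = min(cached_tokens, standard)

    per_token = INPUT_PRICE / 1_000_000
    input_cost = (
        (standard - cached) * per_token                # uncached, standard rate
        + cached * per_token * (1 - cache_discount)    # cached, discounted
        + surcharged * per_token * 2                   # beyond 272K, double rate
    )
    output_cost = output_tokens * OUTPUT_PRICE / 1_000_000
    return input_cost + output_cost
```

Under these assumptions, `estimate_cost(500_000, 0)` bills 272K tokens at $2.50/1M and the remaining 228K at $5.00/1M, which comes to about $1.82.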
GPT-5.4 dynamically routes between internal models: a lightweight engine for routine tasks, a deeper reasoning engine for complex queries, and fallbacks to Mini/Nano variants when usage limits are reached.
Benchmarks
| Benchmark | GPT-5.4 | Claude Opus 4.6 | Gemini 3.1 Pro | What it tests |
|---|---|---|---|---|
| OSWorld | 75% (beats humans) | 68% | 71% | Desktop computer use |
| SWE-bench Verified | 69.2% | 72.1% | 65.8% | Real bug fixing |
| GPQA Diamond | 78.4% | 81.2% | 94.3% | PhD-level science |
| ARC-AGI-2 | 52.1% | 48.7% | 77.1% | Novel reasoning |
| GDPval | 83% | 79% | 76% | Professional knowledge |
| AIME 2025 | 86.7% | 82.3% | 88.1% | Competition math |
The takeaway: GPT-5.4 leads on computer use (OSWorld) and professional knowledge (GDPval). Claude leads on coding (SWE-bench). Gemini leads on reasoning (ARC-AGI-2, GPQA). No single model wins everything.
Key features
Computer use
GPT-5.4 can natively control desktop software: clicking buttons, filling forms, navigating menus. This is what the 75% OSWorld score means: given a task like "create a pivot table in this spreadsheet," GPT-5.4 can actually operate the software to do it.
This powers Codex CLI and the OpenAI Agents SDK sandbox execution.
1M token context
The full 1M context window means you can feed entire codebases, long documents, or extensive conversation histories in a single request. Output is capped at 128K tokens.
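One way to use a large window in practice is to pack files into as few requests as a token budget allows. The sketch below is a minimal greedy packer. The 4-characters-per-token estimate is a rough rule of thumb, not the model's real tokenizer, and `estimate_tokens` and `pack_files` are hypothetical names.

```python
def estimate_tokens(text):
    """Very rough token estimate: about 4 characters per token (assumption)."""
    return max(1, len(text) // 4)


def pack_files(files, budget_tokens):
    """Greedily group (name, text) pairs into batches that fit budget_tokens.

    A file larger than the budget on its own still gets a batch to itself,
    so the caller can decide whether to split or skip it.
    """
    batches, current, used = [], [], 0
    for name, text in files:
        size = estimate_tokens(text)
        # Start a new batch when adding this file would overflow the current one.
        if current and used + size > budget_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        batches.append(current)
    return batches
```

Setting `budget_tokens` to something below 272,000 keeps each request under the threshold where the pricing note says input tokens start costing double.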
For comparison: Claude also offers 1M context, Gemini offers 1M+, and GLM-5.1 offers 128K.
Dynamic routing
GPT-5.4 internally routes between model sizes based on query complexity. Simple questions get fast, cheap processing. Complex questions get full reasoning. You pay per token regardless, but response times are faster for simple queries.
API setup
from openai import OpenAI

client = OpenAI(api_key="sk-...")

# GPT-5.4 (standard)
response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {"role": "system", "content": "You are a senior software engineer."},
        {"role": "user", "content": "Review this authentication middleware for security issues."},
    ],
    # max_tokens is deprecated in favor of max_completion_tokens
    max_completion_tokens=4096,
)
print(response.choices[0].message.content)
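API calls can fail transiently (rate limits, network errors), so production code often wraps them in a retry. The following is a generic, standard-library-only sketch, not anything specific to the OpenAI SDK, which also has its own `max_retries` client setting. `with_retries` and its parameters are hypothetical, and in real code you would likely narrow `retry_on` to the SDK's rate-limit and connection exceptions.

```python
import random
import time


def with_retries(call, max_attempts=4, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Run call() and retry with exponential backoff plus jitter on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait 1x, 2x, 4x ... the base delay, plus up to 10% random jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay * (1 + random.uniform(0, 0.1)))


# Usage with the client from the example above:
# response = with_retries(
#     lambda: client.chat.completions.create(model="gpt-5.4", messages=messages)
# )
```

The `sleep` parameter exists only so the waiting can be replaced in tests; the default behaves like a normal blocking retry.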
Using GPT-5.4 Mini (cheaper, faster)
# 3x cheaper, 2x faster: good for most tasks
response = client.chat.completions.create(
    model="gpt-5.4-mini",
    messages=[{"role": "user", "content": "Explain dependency injection in 3 sentences."}],
)
Using with OpenRouter (multi-provider fallback)
from openai import OpenAI

# Same SDK, pointed at OpenRouter's OpenAI-compatible endpoint
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="or-...",
)
response = client.chat.completions.create(
    model="openai/gpt-5.4",
    messages=[{"role": "user", "content": "..."}],
)
OpenRouter gives you automatic fallback to other providers if OpenAI is down.
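If you would rather control the fallback order yourself than rely on a gateway, the same idea can be sketched client-side: try each provider in turn and return the first success. This is an illustrative pattern, not a description of how OpenRouter implements its fallback; `first_successful` and the provider callables are hypothetical.

```python
def first_successful(providers):
    """Try (name, call) pairs in order; return (name, result) from the first that works.

    Raises RuntimeError listing every failure if all providers fail.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call()
        except Exception as exc:  # broad on purpose for the sketch; narrow in real code
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Usage, assuming two already-configured clients (hypothetical names):
# name, response = first_successful([
#     ("openai", lambda: openai_client.chat.completions.create(model="gpt-5.4", messages=messages)),
#     ("openrouter", lambda: router_client.chat.completions.create(model="openai/gpt-5.4", messages=messages)),
# ])
```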
ChatGPT subscription plans
| Plan | Price | Models included | Best for |
|---|---|---|---|
| Free | $0 | GPT-4o (limited) | Casual use |
| Plus | $20/mo | GPT-5.4, GPT-5.4 Mini | Individual developers |
| Pro | $200/mo | GPT-5.4 Pro, Deep Research | Power users, researchers |
| Team | $25/user/mo | GPT-5.4, admin controls | Small teams |
| Enterprise | ~$60/user/mo | All models, SSO, compliance | Organizations |
GPT-5.4 vs GPT-4o: what changed
| Feature | GPT-4o | GPT-5.4 |
|---|---|---|
| Context window | 128K | 1M |
| Max output | 16K | 128K |
| Computer use | No | Yes (native) |
| OSWorld | ~35% | 75% |
| SWE-bench | ~48% | 69.2% |
| Dynamic routing | No | Yes |
| Price (input) | $2.50/1M | $2.50/1M |
| Price (output) | $10.00/1M | $15.00/1M |
Output is 50% more expensive, but the capability jump is massive. For most developers, GPT-5.4 replaces GPT-4o entirely.
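How much more a real workload costs depends on its input/output mix, since only the output price changed. The small calculation below uses the table prices and ignores caching and the long-context surcharge; the function names are hypothetical.

```python
# USD per 1M tokens, from the comparison table above.
GPT_4O = {"input": 2.50, "output": 10.00}
GPT_5_4 = {"input": 2.50, "output": 15.00}


def cost(prices, input_tokens, output_tokens):
    """Cost in USD, ignoring caching and the long-context surcharge."""
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000


def increase_pct(input_tokens, output_tokens):
    """Percentage cost increase when moving the same workload from GPT-4o to GPT-5.4."""
    old = cost(GPT_4O, input_tokens, output_tokens)
    new = cost(GPT_5_4, input_tokens, output_tokens)
    return (new - old) / old * 100
```

With a 4:1 input-to-output ratio the bill rises by 25%, with equal input and output it rises by 40%, and only an output-only workload sees the full 50%.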
When to use GPT-5.4 vs alternatives
| Use case | Best model | Why |
|---|---|---|
| General coding | Claude Sonnet | Higher SWE-bench, better at multi-file edits |
| Computer use / automation | GPT-5.4 | Best OSWorld score, native computer control |
| Complex reasoning | Gemini 3.1 Pro | 77.1% ARC-AGI-2 |
| Budget coding | GPT-5.4 Mini | $0.75/1M input, good quality |
| Free coding | GLM-5.1 or Qwen 3.6 | MIT licensed, free API tiers |
| Local/private | DeepSeek R1 or Qwen 3.5 | Runs on your hardware |
| Agent infrastructure | OpenAI Agents SDK + GPT-5.4 | Best sandbox execution |
Cost optimization
GPT-5.4 at $2.50/$15 per 1M tokens adds up fast. Strategies to control costs:
- Use Mini for simple tasks: 3x cheaper, handles 80% of queries well
- Cache system prompts: repeated prefixes get a 75-90% discount
- Route by complexity: see our model routing guide
- Use OpenRouter: compare prices across providers
- Set per-user budgets: prevent runaway costs
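Two of the strategies above, routing by complexity and per-user budgets, can be combined in a few lines. The sketch below is illustrative only: the length-and-keyword heuristic, the thresholds, and the names `pick_model` and `BudgetGuard` are all assumptions, and a real router would more likely rely on a classifier or on measured quality data.

```python
# Hypothetical keywords suggesting a prompt needs the larger model.
HARD_HINTS = ("refactor", "prove", "debug", "architecture", "security")


def pick_model(prompt, long_prompt_chars=2000):
    """Send short, simple prompts to Mini and everything else to the flagship."""
    text = prompt.lower()
    if len(prompt) > long_prompt_chars or any(hint in text for hint in HARD_HINTS):
        return "gpt-5.4"
    return "gpt-5.4-mini"


class BudgetGuard:
    """Track spend per user and refuse requests that would exceed a cap."""

    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent = {}

    def charge(self, user_id, cost_usd):
        """Record cost_usd for user_id, or raise if it would break the budget."""
        total = self.spent.get(user_id, 0.0) + cost_usd
        if total > self.limit_usd:
            raise RuntimeError(f"budget exceeded for {user_id}")
        self.spent[user_id] = total
        return self.limit_usd - total  # remaining budget in USD
```

Calling `charge` with an estimated cost before sending a request, rather than after, means a user who is over budget never triggers the API call at all.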
See our AI API spending guide and FinOps for AI for detailed cost management.
FAQ
How much does GPT-5 cost?
GPT-5.4 costs $2.50 per million input tokens and $15.00 per million output tokens via the API. GPT-5.4 Mini is significantly cheaper at $0.75/$4.50 per million tokens. Through ChatGPT, the Plus plan at $20/month includes GPT-5.4 access.
Is GPT-5 better than Claude?
It depends on the task. GPT-5.4 leads on computer use (75% OSWorld vs Claude's 68%) and professional knowledge benchmarks, while Claude Opus 4.6 leads on coding tasks like SWE-bench (72.1% vs 69.2%). Neither model dominates across all categories.
Can I use GPT-5 for free?
The free ChatGPT tier only includes GPT-4o, not GPT-5.4. To access GPT-5.4, you need at least a ChatGPT Plus subscription ($20/month) or use the API with pay-per-token pricing. There is no free API tier for GPT-5 models.
What's the context window of GPT-5?
GPT-5.4 supports up to 1 million tokens of input context with a maximum output of 128K tokens. GPT-5.4 Mini has a 400K token context window, and the older GPT-5 Mini supports 128K tokens. Note that input tokens beyond 272K are charged at double the standard rate.
Related: Claude Code vs Codex CLI vs Gemini CLI · OpenAI Agents SDK Guide · GPT-5 vs Gemini 2.5 Pro · DeepSeek V3 vs GPT-5 · AI Coding Tools Pricing · OpenRouter Complete Guide · Best AI Coding Tools