Grok Build is xAI’s first dedicated coding agent for the terminal. Launched May 14, 2026 in early beta, it brings multi-agent architecture, a Skills/Plugins marketplace, and native CLAUDE.md support to a CLI that feels immediately familiar if you’ve used Claude Code or Codex CLI.
The pitch: a coding agent that can route to any model, run parallel subagents for complex tasks, and integrate with third-party tools through the Agent Client Protocol (ACP). It’s opinionated about architecture but flexible about which LLM does the work.
Here’s everything you need to know about Grok Build: what it does, how to set it up, and whether it’s worth switching to.
What is Grok Build?
Grok Build is a terminal-based AI coding agent. You describe what you want, and it reads your codebase, plans changes, edits files, and runs commands. It’s similar in concept to Claude Code, Codex CLI, and Gemini CLI, but with a few differentiators:
- Multi-agent architecture: Complex tasks get split across parallel subagents
- Custom model routing: Use Grok models, or route through OpenRouter to any model
- Skills/Plugins marketplace: Extend the agent with community-built capabilities
- CLAUDE.md native support: Drop-in migration from Claude Code projects
- 256K context window: Large enough for most codebases without aggressive compaction
It’s currently in early beta, so expect rough edges, but the core experience is solid.
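To get a feel for what a 256K window means for your own project, a rough size check can be sketched as below. This is only an estimate under stated assumptions: the four-characters-per-token ratio is a common rule of thumb and not Grok's tokenizer, 256K is taken as 256,000 tokens, and `read_sources` is a hypothetical helper.

```python
from pathlib import Path

CONTEXT_WINDOW_TOKENS = 256_000  # figure quoted in the article, taken as 256,000
CHARS_PER_TOKEN = 4              # assumed rule of thumb, not Grok's real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by CHARS_PER_TOKEN, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(texts: list, budget: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """True if the combined estimate for all texts stays within the budget."""
    return sum(estimate_tokens(t) for t in texts) <= budget


def read_sources(root: str, suffixes: tuple = (".py", ".ts", ".js")) -> list:
    """Hypothetical helper: collect the text of source files under root."""
    return [
        p.read_text(errors="ignore")
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in suffixes
    ]
```

Calling `fits_in_context(read_sources("."))` gives a quick yes/no for the current directory. The agent's own prompts and tool output also consume tokens, so treat a narrow pass as a no.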
Install
curl -fsSL https://x.ai/cli/install.sh | bash
This installs the grok binary to your PATH. Verify with:
grok --version
Requirements: macOS or Linux. Windows support is not available yet.
Authentication
Two options:
Browser OAuth (recommended for personal use)
grok auth login
This opens your browser and authenticates through your xAI/X account. Requires an active SuperGrok subscription ($99/month).
API Key (for CI/CD and automation)
export XAI_API_KEY=xai-...
You can get an API key from the xAI console. Pricing is usage-based: $1 per 1M input tokens through OpenRouter, or xAI’s native pricing for Grok models.
Modes
Grok Build has three operating modes that control how much autonomy the agent has:
| Mode | Behavior | Best for |
|---|---|---|
| code (default) | Reads, edits, and runs commands automatically | Daily development |
| plan | Shows diffs before applying, requires approval | Reviewing changes, learning |
| ask | Read-only, no file modifications | Questions, exploration |
Switch modes during a session:
# Start in plan mode
grok --mode plan
# Or switch mid-session
/plan # Switch to plan mode
/model # Change the active model
Plan Mode is the standout here. It generates a full diff of proposed changes and waits for your approval before touching anything. This is similar to Codex CLI’s --suggest mode but with better diff visualization.
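One way to picture the three modes is as a permission table. The sketch below is illustrative and not Grok Build's implementation. The action names (`read`, `edit`, `run`) are hypothetical, and two details are assumptions: that plan mode also gates shell commands, and that ask mode blocks them.

```python
from enum import Enum


class Mode(Enum):
    CODE = "code"
    PLAN = "plan"
    ASK = "ask"


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


# Hypothetical policy derived from the mode table; command handling in
# plan and ask mode is an assumption, not documented behavior.
POLICY = {
    Mode.CODE: {"read": Decision.ALLOW, "edit": Decision.ALLOW, "run": Decision.ALLOW},
    Mode.PLAN: {
        "read": Decision.ALLOW,
        "edit": Decision.NEEDS_APPROVAL,
        "run": Decision.NEEDS_APPROVAL,
    },
    Mode.ASK: {"read": Decision.ALLOW, "edit": Decision.DENY, "run": Decision.DENY},
}


def decide(mode: Mode, action: str) -> Decision:
    """Look up what a mode permits; unknown actions are denied by default."""
    return POLICY[mode].get(action, Decision.DENY)


print(decide(Mode.PLAN, "edit").value)  # prints "needs_approval"
```

Denying unknown actions by default keeps the table safe to extend: a new action does nothing until a mode explicitly allows it.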
Commands
Inside a Grok Build session:
| Command | Action |
|---|---|
| /plan | Switch to Plan Mode |
| /model | Change the active model |
| /compact | Compress context to free up token space |
| /clear | Reset the conversation |
| /cost | Show token usage and cost for the session |
From outside a session:
# Inspect project structure and agent config
grok inspect
# Run in headless mode (for scripts/CI)
grok -p "Fix all TypeScript errors" --output-format streaming-json
Multi-Agent Architecture
This is Grok Build’s main differentiator. When you give it a complex task, it doesn’t process everything sequentially. Instead:
- A coordinator agent analyzes the task and breaks it into subtasks
- Parallel subagents spin up to handle each subtask simultaneously
- Results are merged and verified by the coordinator
In practice, this means a task like “add authentication to this Express app” might spawn subagents for:
- Writing the auth middleware
- Creating the user model
- Adding login/register routes
- Writing tests
Each subagent works independently, then the coordinator reconciles conflicts and ensures everything integrates correctly.
You can observe this in action. The CLI shows which subagents are active and what they’re working on. It’s not always faster than sequential processing (coordination has overhead), but for large multi-file changes it can cut completion time significantly.
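The coordinator/subagent flow described above can be sketched with a thread pool. This is a conceptual illustration of the fan-out and merge pattern, not xAI's code: `plan_subtasks` and `run_subagent` are hypothetical stand-ins for steps that would call a model and edit files.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_subtasks(task: str) -> list:
    """Stand-in for the coordinator's planning step (hard-coded here)."""
    return [
        f"{task}: write the auth middleware",
        f"{task}: create the user model",
        f"{task}: add login/register routes",
        f"{task}: write tests",
    ]


def run_subagent(subtask: str) -> dict:
    """Stand-in for a subagent; a real one would call a model and edit files."""
    return {"subtask": subtask, "status": "done"}


def coordinate(task: str, max_workers: int = 4) -> list:
    """Fan subtasks out to parallel workers, then verify the merged results."""
    subtasks = plan_subtasks(task)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the subtasks
        results = list(pool.map(run_subagent, subtasks))
    failed = [r["subtask"] for r in results if r["status"] != "done"]
    if failed:
        raise RuntimeError(f"subagents failed: {failed}")
    return results


for result in coordinate("add authentication"):
    print(result["status"], "-", result["subtask"])
```

The verification step here only checks status flags. Reconciling conflicting edits to the same file is the hard part in practice, and it is where the coordination overhead comes from.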
Skills, Plugins, and the Marketplace
Grok Build has an extensibility system called Skills. Skills are packaged capabilities that extend what the agent can do:
- Built-in skills: File editing, terminal commands, git operations
- Community skills: Available through the Skills Marketplace
- Custom skills: Write your own using the Skills SDK
Think of it like MCP servers but with a distribution layer. You can browse and install skills directly from the CLI:
grok skills search "docker"
grok skills install @xai/docker-compose
The marketplace is still sparse (it’s early beta), but the architecture is promising. Skills can hook into the agent lifecycle through the Hooks system.
Hooks (Lifecycle Events)
Hooks let you run custom logic at specific points in the agent’s workflow:
- pre-edit: Before any file modification
- post-edit: After file changes are applied
- pre-command: Before running a shell command
- post-command: After command execution
- on-error: When something fails
Configure hooks in your project’s .grok/hooks.json:
{
  "post-edit": "npm run lint -- --fix",
  "pre-command": "echo 'Running: ${command}'"
}
This is useful for enforcing project standards automatically: run linting after every edit, format code, or trigger tests.
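To make the `${command}` placeholder concrete, here is a minimal sketch of how a hook string could be expanded before it runs. It assumes placeholders use the `${name}` form shown in the example config, which happens to match Python's `string.Template` syntax. `render_hook` is a hypothetical helper and not part of Grok Build.

```python
import json
from string import Template
from typing import Optional

# Sample config; the pre-command entry mirrors the article's example.
HOOKS_JSON = """
{
  "pre-command": "echo 'Running: ${command}'"
}
"""


def render_hook(hooks: dict, event: str, **context: str) -> Optional[str]:
    """Return the shell string for an event, or None if no hook is configured."""
    template = hooks.get(event)
    if template is None:
        return None
    # safe_substitute leaves unknown placeholders untouched instead of raising
    return Template(template).safe_substitute(context)


hooks = json.loads(HOOKS_JSON)
print(render_hook(hooks, "pre-command", command="npm test"))
# prints: echo 'Running: npm test'
```

A real runner would also need to escape substituted values before passing them to a shell, since the command text comes from the agent.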
CLAUDE.md Compatibility
If you’re migrating from Claude Code, Grok Build reads your existing CLAUDE.md file natively. No changes needed. It parses the same project context, conventions, and instructions you’ve already defined.
This is a smart move by xAI. The CLAUDE.md format has become a de facto standard for project-level AI instructions (similar to how AGENTS.md works for Codex). By supporting it out of the box, they eliminate the biggest friction point for switching.
Custom Model Routing
Grok Build isn’t locked to Grok models. You can route requests through OpenRouter to use any supported model:
# Use Claude Sonnet through Grok Build
/model claude-sonnet-4
# Use GPT-5
/model gpt-5
# Use a local model via OpenRouter
/model local/llama-4
This makes Grok Build more of a universal coding agent interface than a Grok-specific tool. You get the multi-agent architecture, Skills system, and tooling regardless of which model does the actual reasoning.
MCP Server Support
Grok Build can connect to MCP (Model Context Protocol) servers, giving it access to external tools and data sources:
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-github"]
    }
  }
}
This works the same way MCP servers work in Claude Code or other MCP-compatible tools. If you already have MCP servers configured, they’ll work with Grok Build.
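The config shape above is simple enough to inspect with a few lines of Python. The sketch below shows how each entry maps to a launch command. It assumes only the `command` and `args` keys from the example, and `server_commands` is a hypothetical helper.

```python
import json

CONFIG = """
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-github"]
    }
  }
}
"""


def server_commands(config_text: str) -> dict:
    """Map each configured server name to the argv list that would launch it."""
    config = json.loads(config_text)
    commands = {}
    for name, spec in config.get("mcpServers", {}).items():
        # "args" is treated as optional; "command" is required
        commands[name] = [spec["command"], *spec.get("args", [])]
    return commands


print(server_commands(CONFIG))
# prints: {'github': ['npx', '@modelcontextprotocol/server-github']}
```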
Headless Mode
For CI/CD pipelines and automation, Grok Build supports headless execution:
# Single prompt, no interactive session
grok -p "Add error handling to all API routes" --output-format streaming-json
The streaming-json output format gives you structured data about what the agent did: files modified, commands run, and results. This makes it easy to integrate into scripts, GitHub Actions, or custom tooling.
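A script consuming this output might look like the sketch below. The event schema is an assumption, since the article does not document it. The code assumes one JSON object per line, with hypothetical `type` values `file_modified` and `command_run` and hypothetical `path` and `command` fields. Check the real output before relying on any of these names.

```python
import json
from typing import Iterable


def summarize(lines: Iterable[str]) -> dict:
    """Collect modified files and executed commands from a stream of JSON lines."""
    summary = {"files_modified": [], "commands_run": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        if event.get("type") == "file_modified":    # hypothetical event type
            summary["files_modified"].append(event["path"])
        elif event.get("type") == "command_run":    # hypothetical event type
            summary["commands_run"].append(event["command"])
    return summary


# Made-up sample events, for illustration only
SAMPLE = [
    '{"type": "file_modified", "path": "src/routes/users.ts"}',
    '{"type": "command_run", "command": "npm test"}',
    '{"type": "message", "text": "Done"}',
]

print(summarize(SAMPLE))
```

In a pipeline, the same function could read `sys.stdin` line by line, so results arrive while the agent is still running.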
ACP (Agent Client Protocol)
ACP is xAI’s protocol for third-party integration. It lets external tools communicate with Grok Build as a service:
- IDE extensions can use Grok Build as their backend
- Custom UIs can drive the agent programmatically
- Other agents can delegate tasks to Grok Build
This is still early, but it positions Grok Build as infrastructure rather than just a CLI tool.
Pricing
Two paths:
| Option | Cost | Best for |
|---|---|---|
| SuperGrok subscription | $99/month | Individual developers, unlimited use |
| API key (usage-based) | ~$1/1M input tokens | CI/CD, occasional use, teams |
The $99/month SuperGrok subscription includes unlimited Grok Build usage plus access to all xAI products (Grok chat, image generation, etc.). If you’re already paying for SuperGrok, Grok Build is included at no extra cost.
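A quick break-even calculation helps when choosing between the two paths. The sketch below uses only the two prices quoted in the table. It ignores output-token charges because the article gives only an input price, so real API bills would be higher than this estimate.

```python
SUBSCRIPTION_USD = 99.0        # SuperGrok monthly price from the table
INPUT_USD_PER_MILLION = 1.0    # approximate API input price from the table


def api_cost_usd(input_tokens: int) -> float:
    """Input-only API cost; output tokens are not priced in the article."""
    return input_tokens / 1_000_000 * INPUT_USD_PER_MILLION


def break_even_input_tokens() -> int:
    """Monthly input tokens at which API spend equals the subscription price."""
    return int(SUBSCRIPTION_USD / INPUT_USD_PER_MILLION * 1_000_000)


print(api_cost_usd(25_000_000))    # prints 25.0
print(break_even_input_tokens())   # prints 99000000
```

Under these assumptions, the API path is cheaper below roughly 99 million input tokens a month and the subscription is cheaper above that.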
For comparison: Claude Code requires a $100/month Max plan for heavy use, Codex CLI uses OpenAI API credits, and Antigravity 2.0 starts at $20/month with compute-based limits.
Arena Mode (Coming Soon)
xAI has announced Arena Mode but hasn’t shipped it yet. The concept: multiple agents compete on the same task, and you pick the best result. Think of it like LMArena but for code generation: you see two or more solutions side by side and choose the winner.
This could be interesting for complex architectural decisions where you want to see different approaches. No timeline on when it ships.
Multimodal Input
Grok Build accepts both text and image input. You can paste screenshots, diagrams, or mockups directly into the CLI and ask the agent to implement what it sees. This works for:
- Implementing UI from design mockups
- Debugging from error screenshots
- Understanding architecture diagrams
How It Compares
A quick positioning against the other terminal coding agents:
| Feature | Grok Build | Claude Code | Codex CLI | Antigravity CLI |
|---|---|---|---|---|
| Multi-agent | Yes (parallel) | No | No | Yes (parallel) |
| Model flexibility | Any via OpenRouter | Claude only | OpenAI only | Gemini only |
| Context window | 256K | 200K | 200K | 1M+ |
| Sandbox | No | No | OS-level | Cloud sandbox |
| Skills/Plugins | Marketplace | MCP only | MCP only | SDK |
| Price | $99/mo or usage | $100/mo or usage | Usage-based | $20-200/mo |
Grok Build’s main advantages are model flexibility and the Skills marketplace. Its main disadvantage is maturity: it’s a week-old early beta, while Claude Code and Codex CLI have months of production hardening.
FAQ
Is Grok Build free?
No. You need either a SuperGrok subscription ($99/month) or an xAI API key with credits. There’s no free tier, though the API pricing ($1/1M input tokens) is competitive for light usage.
Can I use Grok Build with models other than Grok?
Yes. Grok Build supports custom model routing through OpenRouter. You can use Claude, GPT-5, Gemini, or any other model available on OpenRouter while still getting Grok Build’s multi-agent architecture and tooling.
Does Grok Build work with my existing CLAUDE.md file?
Yes. Grok Build reads CLAUDE.md files natively. If you’re migrating from Claude Code, your project instructions carry over without any changes.
How does Grok Build compare to Claude Code?
Both are terminal coding agents, but Grok Build offers multi-agent parallel execution, model flexibility, and a Skills marketplace. Claude Code has better maturity, a larger community, and tighter integration with Anthropic’s models. See our detailed comparison for the full breakdown.
Is Grok Build stable enough for production use?
It’s early beta (launched May 14, 2026). Core functionality works well, but expect occasional bugs, missing features, and breaking changes. Use Plan Mode for important work so you can review changes before they’re applied.
What’s the difference between Skills and MCP servers?
Skills are Grok Build’s native extension system with a marketplace for discovery and installation. MCP servers are a cross-tool protocol that Grok Build also supports. Skills can do everything MCP servers can, plus hook into Grok Build’s lifecycle events. Use MCP if you need cross-tool compatibility; use Skills for deeper Grok Build integration.
Does Grok Build support Windows?
Not yet. It currently runs on macOS and Linux only. Windows support hasn’t been announced.