Grok Build is the first terminal coding agent with a true multi-agent architecture. Instead of processing your request with a single model call, it decomposes complex tasks into subtasks and runs parallel subagents that work simultaneously. This is the feature that separates it from Claude Code, Codex CLI, and every other CLI agent on the market.
Here’s how it works, when it helps, and how to configure it for your projects.
For the full tool overview, see our Grok Build complete guide. For a comparison of how this architecture stacks up against other approaches, see our Antigravity SDK custom agents guide.
How Multi-Agent Execution Works
When you give Grok Build a complex task, the orchestrator agent follows this process:
1. Task analysis. The orchestrator reads your prompt and the relevant codebase context.
2. Decomposition. It breaks the task into independent subtasks that can run in parallel.
3. Subagent spawning. Each subtask gets its own subagent with a focused context window.
4. Parallel execution. Subagents work simultaneously, each producing file changes.
5. Merge and conflict resolution. The orchestrator collects results, merges changes, and resolves conflicts.
6. Application. Changes are applied (Code Mode) or presented for review (Plan Mode).
The key insight: subagents don’t share context with each other during execution. Each one gets a slice of the codebase relevant to its subtask, plus the overall project instructions from your CLAUDE.md or configuration files.
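The pipeline above can be pictured with a minimal sketch. This is not Grok Build's implementation: `Subtask`, `run_subagent`, and `orchestrate` are hypothetical names, the model call is replaced by a stub, and the merge step ignores conflicts. It only illustrates the shape of the process, with isolated workers running in parallel followed by a single merge.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str
    files: list[str]  # the slice of the codebase this subagent is allowed to see


def run_subagent(subtask: Subtask, project_instructions: str) -> dict[str, str]:
    """Hypothetical stand-in for a model call; returns {path: new_content}."""
    # Each subagent receives only its own files plus the shared instructions.
    return {path: f"// generated by {subtask.name}" for path in subtask.files}


def orchestrate(subtasks: list[Subtask], project_instructions: str) -> dict[str, str]:
    # Parallel execution: one worker per subtask, no state shared between them.
    with ThreadPoolExecutor(max_workers=max(1, len(subtasks))) as pool:
        results = list(
            pool.map(lambda s: run_subagent(s, project_instructions), subtasks)
        )
    # Merge: combine per-subagent file changes (no conflict handling here).
    merged: dict[str, str] = {}
    for changes in results:
        merged.update(changes)
    return merged
```

The subagents never read each other's results; they only meet in the `merged` dictionary, which mirrors the isolation described above.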
Seeing Subagents in Action
grok build "Add user authentication with JWT, including login/register endpoints, middleware, and tests"
Terminal output:
🔄 Analyzing task...
📋 Decomposed into 4 subtasks:
[1/4] 🔧 Creating auth utilities (src/utils/jwt.ts, src/utils/password.ts)
[2/4] 🔧 Building endpoints (src/routes/auth.ts)
[3/4] 🔧 Adding middleware (src/middleware/authenticate.ts)
[4/4] 🔧 Writing tests (tests/auth.test.ts)
⚡ Running 4 subagents in parallel...
✅ [1/4] Auth utilities complete (2.1s)
✅ [3/4] Middleware complete (2.4s)
✅ [2/4] Endpoints complete (3.1s)
✅ [4/4] Tests complete (3.8s)
🔀 Merging results...
✅ All changes applied. 4 files created, 1 file modified.
Total time: 4.2s (vs ~12s sequential estimate)
The wall-clock time is set by the slowest subagent plus orchestration overhead, not by the sum of all subtasks. In this example the slowest subagent finishes in 3.8 seconds and the whole task in 4.2 seconds, against roughly 12 seconds if the subtasks ran one after another.
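The arithmetic behind that comparison can be checked directly. The sketch below assumes the per-subagent figures in the sample output are individual durations, which the output does not state explicitly; the variable names are illustrative.

```python
# Per-subagent times from the sample output, in seconds (assumed to be durations).
subagent_seconds = [2.1, 2.4, 3.1, 3.8]
reported_total = 4.2

sequential = round(sum(subagent_seconds), 1)               # one after another
parallel_floor = max(subagent_seconds)                     # slowest subagent
overhead = round(reported_total - parallel_floor, 1)       # analysis + merge

print(f"sequential sum: {sequential}s")
print(f"slowest subagent: {parallel_floor}s")
print(f"orchestration overhead: {overhead}s")
```

The sum comes to 11.4 seconds, in line with the "~12s sequential estimate", and the gap between the slowest subagent and the reported total is 0.4 seconds.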
When Subagents Help (and When They Don’t)
High-value scenarios
Multi-file feature implementation. Adding a feature that spans routes, services, models, and tests. Each layer can be generated independently.
grok build "Add a /comments endpoint with CRUD operations, validation, pagination, and integration tests"
Parallel refactoring. Applying the same pattern change across multiple independent modules.
grok build "Convert all callback-based error handling to async/await in src/services/"
Documentation and code together. Generating implementation alongside docs, tests, and examples.
grok build "Add a webhook system: implementation, API docs, usage examples, and unit tests"
Codebase-wide updates. Updating imports, renaming patterns, or applying linting fixes across many files.
grok build "Update all files to use the new logger import path from @app/logger instead of ../utils/logger"
Low-value scenarios
Single-file debugging. If the task is “fix the bug on line 42 of auth.ts,” there’s nothing to parallelize. The orchestrator recognizes this and runs a single agent.
Sequential logic. Tasks where step B depends on the output of step A can’t be parallelized. “First create the database schema, then build the API that uses it” will run sequentially regardless.
Small tasks. The orchestrator overhead (analyzing, decomposing, merging) adds latency. For tasks that a single agent finishes in about 3 seconds or less, using subagents is slower.
Configuring Subagent Behavior
Max parallel agents
Control how many subagents can run simultaneously:
# Set max parallel subagents (default: 4)
grok config set max-subagents 6
# Or per-session
grok build --max-subagents 2 "Refactor the API layer"
Higher values speed up large tasks but consume more API tokens in parallel. If you’re on a rate-limited API plan, keep this low.
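To see why a lower cap helps with rate limits, here is a minimal sketch of bounded concurrency using `asyncio.Semaphore`. It is an illustration of the general technique, not Grok Build's scheduler; `run_bounded` and the sleeping stub are hypothetical.

```python
import asyncio


async def run_bounded(subtasks: list[str], max_subagents: int = 4):
    """Run stub subagents with at most max_subagents in flight at once."""
    slots = asyncio.Semaphore(max_subagents)
    running = 0
    peak = 0

    async def run_one(name: str) -> str:
        nonlocal running, peak
        async with slots:  # waits here when all slots are taken
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for a model API call
            running -= 1
            return f"{name}: done"

    results = await asyncio.gather(*(run_one(name) for name in subtasks))
    return list(results), peak


results, peak = asyncio.run(
    run_bounded(["utils", "endpoints", "middleware", "tests"], max_subagents=2)
)
```

With a cap of 2, the four subtasks run in two waves, so no more than two API calls are ever in flight at the same time.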
Subagent context allocation
Each subagent gets a portion of the 256K context window. The orchestrator decides how to allocate based on subtask complexity:
# View how context was allocated after a task
/cost
# Output includes:
# Orchestrator: 12K tokens
# Subagent 1 (auth utils): 34K tokens
# Subagent 2 (endpoints): 48K tokens
# Subagent 3 (middleware): 28K tokens
# Subagent 4 (tests): 41K tokens
# Total: 163K / 256K available
Model routing per subagent
This is where Grok Build’s model flexibility combines with multi-agent architecture. You can route different subagents to different models:
# In your project's .grok/config.yaml
subagents:
  default_model: grok-3
  overrides:
    tests:
      model: grok-3-mini # Cheaper model for test generation
    documentation:
      model: grok-3-mini # Docs don't need the frontier model
    refactoring:
      model: grok-3 # Use the best model for complex refactors
This lets you optimize cost without sacrificing quality where it matters. Test generation and documentation are typically simpler tasks that work fine with smaller models, while core logic changes benefit from the most capable model.
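The lookup this config implies is a simple override-with-fallback. The sketch below mirrors the YAML as a Python dictionary; `model_for` is a hypothetical helper, and how Grok Build assigns a category such as `tests` to a given subagent is not specified here.

```python
# Mirrors the config above as plain data (hypothetical in-memory form).
ROUTING = {
    "default_model": "grok-3",
    "overrides": {
        "tests": {"model": "grok-3-mini"},
        "documentation": {"model": "grok-3-mini"},
        "refactoring": {"model": "grok-3"},
    },
}


def model_for(category: str, routing: dict = ROUTING) -> str:
    """Return the override model for a subtask category, else the default."""
    override = routing.get("overrides", {}).get(category)
    return override["model"] if override else routing["default_model"]


print(model_for("tests"))      # category with an override
print(model_for("endpoints"))  # no override, falls back to default_model
```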
Conflict Resolution
When multiple subagents modify the same file, the orchestrator handles merging. Three scenarios:
1. Non-overlapping changes (auto-merged)
Subagent A adds a function at line 10. Subagent B adds a function at line 50. These merge cleanly without intervention.
2. Adjacent changes (smart merge)
Subagent A modifies the imports section. Subagent B also adds an import. The orchestrator combines both import additions intelligently.
3. Conflicting changes (requires resolution)
Subagent A rewrites a function one way. Subagent B rewrites the same function differently. The orchestrator flags this:
⚠️ Conflict in src/services/user.ts (lines 24-38)
Subagent 1 (validation) and Subagent 3 (error handling) both modified getUserById()
Option A (Subagent 1):
[diff showing validation approach]
Option B (Subagent 3):
[diff showing error handling approach]
Option C (merged):
[diff showing orchestrator's best merge attempt]
Choose: [a] [b] [c] [e]dit manually
In Plan Mode, conflicts are always shown for your review. In Code Mode, the orchestrator picks Option C (its best merge) automatically. If you want to catch conflicts, use Plan Mode for complex multi-agent tasks.
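One way to think about the three scenarios is as a comparison of the line ranges two subagents touched in the same file. The sketch below is an assumption-laden illustration: the `Edit` type, the `classify` function, and the 3-line adjacency threshold are all hypothetical, and the real orchestrator may rely on semantic analysis rather than line numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    agent: str
    start: int  # first modified line, 1-indexed, inclusive
    end: int    # last modified line, inclusive


def classify(a: Edit, b: Edit, adjacency: int = 3) -> str:
    """Classify two edits to the same file as clean, adjacent, or conflict."""
    if a.start > b.start:
        a, b = b, a  # make `a` the edit that starts first
    gap = b.start - a.end
    if gap <= 0:
        return "conflict"   # the ranges share at least one line
    if gap <= adjacency:
        return "adjacent"   # close together, e.g. two additions to the imports
    return "clean"          # far apart, merges without intervention


print(classify(Edit("A", 10, 12), Edit("B", 50, 55)))
print(classify(Edit("A", 1, 4), Edit("B", 5, 5)))
print(classify(Edit("1", 24, 38), Edit("3", 30, 35)))
```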
Hooks and Subagent Lifecycle
Grok Build’s hooks system fires events at each stage of the multi-agent pipeline:
# .grok/hooks.yaml
hooks:
  on_decompose:
    - script: ./scripts/log-subtasks.sh
      # Fires after task decomposition, before subagents start
  on_subagent_start:
    - script: ./scripts/notify-start.sh
      # Fires when each subagent begins work
  on_subagent_complete:
    - script: ./scripts/validate-output.sh
      # Fires when each subagent finishes
      # Can reject subagent output and trigger retry
  on_merge:
    - script: ./scripts/run-linter.sh
      # Fires after all subagents complete and changes are merged
      # Runs before changes are applied to disk
  on_conflict:
    - script: ./scripts/alert-team.sh
      # Fires when subagent outputs conflict
The on_subagent_complete hook is particularly powerful. You can run validation (linting, type checking) on each subagent’s output before it gets merged. If validation fails, the hook can reject the output and the orchestrator will retry that subagent.
ACP Integration with Subagents
Grok Build’s Agent Client Protocol (ACP) support means external tools can participate in the multi-agent pipeline as subagents:
# .grok/config.yaml
acp_agents:
  - name: security-scanner
    endpoint: http://localhost:8080/acp
    trigger: on_merge
    # Runs a security scan on merged output before applying
  - name: style-enforcer
    endpoint: https://my-team-tools.internal/style-check
    trigger: on_subagent_complete
    # Checks each subagent's output against team style guide
This lets you integrate custom tooling (security scanners, style checkers, compliance validators) directly into the agent pipeline without modifying Grok Build itself.
Inspecting Subagent Decisions
Use grok inspect to understand how the orchestrator decomposed a previous task:
# View the last task's decomposition
grok inspect --last
# Output:
# Task: "Add user authentication with JWT..."
# Decomposition strategy: feature-layer
# Subtasks: 4
# 1. Auth utilities (independent, no dependencies)
# 2. Endpoints (depends on: utilities)
# 3. Middleware (depends on: utilities)
# 4. Tests (depends on: endpoints, middleware)
#
# Execution order:
# Phase 1 (parallel): [1]
# Phase 2 (parallel): [2, 3]
# Phase 3 (parallel): [4]
#
# Note: Subtask 4 waited for 2 and 3 because tests import from both.
This reveals that the orchestrator doesn’t always run everything in parallel. It builds a dependency graph and parallelizes within each dependency level. Tasks with dependencies run in phases.
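Grouping a dependency graph into phases is a standard topological-sort exercise. The sketch below reproduces the phases from the inspect output using Python's standard-library `graphlib`; the `deps` mapping and the `phases` helper are illustrative, and nothing here is taken from Grok Build's internals.

```python
from graphlib import TopologicalSorter

# Each subtask maps to the set of subtasks it depends on (from the output above).
deps = {
    "utilities": set(),
    "endpoints": {"utilities"},
    "middleware": {"utilities"},
    "tests": {"endpoints", "middleware"},
}


def phases(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group subtasks into phases; everything within a phase can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all subtasks whose deps are done
        result.append(ready)
        sorter.done(*ready)                 # unlocks the next phase
    return result


for number, names in enumerate(phases(deps), start=1):
    print(f"Phase {number}: {names}")
```

For this graph the helper yields three phases: utilities first, then endpoints and middleware together, then tests.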
Comparison with Other Multi-Agent Approaches
| Approach | Tool | How it works |
|---|---|---|
| Parallel subagents | Grok Build | Orchestrator decomposes, subagents run in parallel |
| SDK-based agents | Antigravity SDK | You define agents in code, orchestrate manually |
| Sequential routines | Claude Code | Predefined steps run one after another |
| Single agent | Codex CLI | One model call handles everything |
Grok Build’s approach is the most automated. You don’t define the decomposition strategy; the orchestrator figures it out. This is convenient but means you have less control over how work is split.
Antigravity’s SDK approach gives you full control over agent definitions and orchestration, but requires writing code. See our Antigravity SDK custom agents guide for that workflow.
Performance Benchmarks
Based on our testing with a medium-sized TypeScript project (150 files, 25K lines):
| Task | Single agent | Grok Build (4 subagents) | Speedup |
|---|---|---|---|
| Add CRUD endpoint + tests | 11.2s | 4.8s | 2.3x |
| Refactor 8 service files | 24.1s | 8.3s | 2.9x |
| Add validation across 12 routes | 31.5s | 9.1s | 3.5x |
| Fix single bug | 3.2s | 3.8s | 0.8x (slower) |
| Update README | 2.1s | 2.6s | 0.8x (slower) |
The pattern is clear: multi-agent shines for tasks that naturally decompose into 3+ independent subtasks. For focused, single-file work, the orchestration overhead makes it slightly slower.
Best Practices
- Let the orchestrator decide. Don’t try to manually decompose tasks in your prompt. “Add auth with login, register, middleware, and tests” works better than four separate prompts.
- Use Plan Mode for multi-agent tasks. The merge step can produce unexpected results. Review before applying.
- Set model routing for cost control. Not every subagent needs the frontier model. Route simple tasks to cheaper models.
- Monitor with /cost. Multi-agent tasks consume more total tokens (each subagent gets its own context). Track spending.
- Keep max-subagents reasonable. More than 6 parallel agents rarely helps and can hit rate limits. The default of 4 is a good balance.
- Use hooks for validation. Run linters and type checkers via on_subagent_complete hooks to catch issues before merge.
FAQ
How many subagents can run in parallel?
The default is 4. You can raise it, for example to 8, with grok config set max-subagents 8. Higher values are possible but may hit API rate limits depending on your plan.
Do subagents share context with each other?
No. Each subagent gets an independent context window with only the files relevant to its subtask, plus shared project configuration (CLAUDE.md, etc.). They don’t see each other’s work until the merge phase.
Can I force a task to run without subagents?
Yes. Use --max-subagents 1 to disable decomposition and run everything as a single agent. Useful for tasks where you know parallelism won’t help.
How does the orchestrator decide what to parallelize?
It analyzes file dependencies, import graphs, and the semantic structure of your request. Tasks that touch independent files or modules get parallelized. Tasks with sequential dependencies run in phases.
Do subagents cost more than a single agent?
Yes, in total tokens. Each subagent gets its own context (including shared project files), so there’s duplication. A 4-subagent task typically uses 1.5x to 2.5x the tokens of a single-agent approach. The tradeoff is wall-clock time: you pay more tokens but finish faster.
What happens if a subagent fails?
The orchestrator retries the failed subagent up to 2 times. If it still fails, the orchestrator completes the task with the successful subagents’ output and reports which subtask failed. You can then address the failed portion manually or re-run it.
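The retry-then-report behavior can be sketched in a few lines. This is a hypothetical illustration, not Grok Build's code: `run_with_retries`, `collect`, and the `attempt` callback are invented names, and the only detail taken from the text is the limit of 2 retries.

```python
from typing import Callable


def run_with_retries(subtask: str, attempt: Callable[[str], str], max_retries: int = 2):
    """Try once, then retry up to max_retries times.

    Returns (output, None) on success or (None, last_error) if every attempt fails.
    """
    last_error = None
    for _ in range(1 + max_retries):
        try:
            return attempt(subtask), None
        except Exception as exc:  # a real orchestrator would catch narrower errors
            last_error = exc
    return None, last_error


def collect(subtasks: list[str], attempt: Callable[[str], str]):
    """Keep successful outputs and report which subtasks failed."""
    outputs, failed = {}, []
    for name in subtasks:
        output, error = run_with_retries(name, attempt)
        if error is None:
            outputs[name] = output
        else:
            failed.append(name)
    return outputs, failed
```

A subtask that fails on all three attempts ends up in `failed`, while the outputs of the others are still returned.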
Can I define custom decomposition strategies?
Not in the current beta. The orchestrator handles decomposition automatically. xAI has indicated that custom decomposition rules are on the roadmap for a future release.
Does multi-agent work in headless mode?
Yes. Headless mode (-p flag) with subagents works the same way. The streaming-json output includes per-subagent progress events, making it suitable for CI/CD dashboards that want to show parallel task progress.