Kimi CLI Complete Guide: Moonshot's Terminal AI Coding Agent
Kimi CLI is an open-source terminal coding agent from Moonshot AI, powered by the Kimi K2.5 model. It competes directly with Claude Code and Codex CLI, and its distinguishing feature is Agent Swarm, which coordinates up to 100 parallel sub-agents to speed up execution.
Update (April 21, 2026): Kimi CLI now supports Kimi K2.6, which scales Agent Swarm to 300 sub-agents and scores 80.2% on SWE-Bench Verified. Moonshot recommends Kimi Code CLI as the primary agent framework for K2.6.
Installation
uv tool install --python 3.13 kimi-cli
Or with Homebrew:
brew install kimi-cli
Authentication
Kimi CLI supports multiple auth methods:
Device auth (recommended)
kimi login --device-auth
This opens a browser window where you log in to your Moonshot account. Credentials are stored locally.
API key
export KIMI_API_KEY="your-api-key"
kimi
Using other providers
Kimi CLI isn't locked to Kimi models. You can configure it to use Claude, GPT, or any OpenAI-compatible API:
# Use with OpenRouter
export OPENROUTER_API_KEY="your-key"
kimi --provider openrouter --model anthropic/claude-opus-4.6
See the OpenRouter guide for model options.
Core features
Chat mode
kimi
> Fix the authentication bug in src/auth.ts
Kimi reads your codebase, understands the context, and makes edits directly. Every change is shown as a diff before applying.
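The diff-before-apply step can be illustrated with Python's standard difflib module. This is a minimal sketch of the general idea, not Kimi CLI's actual implementation; the function name preview_edit and the sample file contents are hypothetical.

```python
import difflib


def preview_edit(path: str, old_text: str, new_text: str) -> str:
    """Return a unified diff describing the proposed change to `path`."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff_lines)


# Hypothetical before/after contents for src/auth.ts.
old = "if (token = null) {\n  return false;\n}\n"
new = "if (token === null) {\n  return false;\n}\n"

print(preview_edit("src/auth.ts", old, new))
```

An empty string means the proposed text is identical to the current file, so there is nothing to approve.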
Plan mode
Plan mode is read-only: Kimi analyzes your codebase without making changes.
kimi --plan
> How should I refactor the database layer to support PostgreSQL?
It uses read-only tools (Glob, Grep, ReadFile) to explore the code, then writes a detailed plan for your approval. You can approve, reject, or request revisions before any code is touched.
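One way to picture the read-only guarantee is an allowlist that every tool call must pass while plan mode is active. The sketch below assumes that design; it is not taken from Kimi CLI's source. ToolCall, is_allowed, and the WriteFile tool name are hypothetical, while Glob, Grep, and ReadFile are the tools named above.

```python
from dataclasses import dataclass

# Read-only tools named in the text above.
READ_ONLY_TOOLS = frozenset({"Glob", "Grep", "ReadFile"})


@dataclass(frozen=True)
class ToolCall:
    name: str
    argument: str


def is_allowed(call: ToolCall, plan_mode: bool) -> bool:
    """In plan mode only read-only tools may run; otherwise any tool may."""
    if plan_mode:
        return call.name in READ_ONLY_TOOLS
    return True


# WriteFile is a hypothetical tool name, used only for contrast.
calls = [ToolCall("Grep", "db.connect"), ToolCall("WriteFile", "src/db.ts")]
for call in calls:
    verdict = "allowed" if is_allowed(call, plan_mode=True) else "blocked"
    print(f"{call.name}: {verdict}")
```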
Agent Swarm
The killer feature. For parallelizable tasks:
kimi
> Refactor all 30 API route handlers to use the new middleware pattern
Kimi spawns multiple sub-agents that work in parallel, each handling a subset of files. The coordinator merges results and resolves conflicts. On large tasks this can be up to 4.5x faster than sequential processing.
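The fan-out and merge pattern described here can be sketched with a thread pool. This is an illustration of the general technique under simplifying assumptions, not Agent Swarm's real coordinator: refactor_batch is a stand-in for a sub-agent, the file names are made up, and conflict resolution is not modeled because each file is assigned to exactly one worker.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(items: list[str], num_workers: int) -> list[list[str]]:
    """Distribute items round-robin across at most num_workers batches."""
    batches = [items[i::num_workers] for i in range(num_workers)]
    return [batch for batch in batches if batch]


def refactor_batch(files: list[str]) -> dict[str, str]:
    """Stand-in for one sub-agent; a real one would edit each file."""
    return {path: "refactored" for path in files}


def run_swarm(files: list[str], num_workers: int) -> dict[str, str]:
    """Fan work out to parallel workers, then merge their results."""
    batches = split_into_batches(files, num_workers)
    merged: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for result in pool.map(refactor_batch, batches):
            merged.update(result)
    return merged


# Hypothetical handler files matching the 30-route example above.
handlers = [f"routes/handler_{i:02d}.ts" for i in range(30)]
results = run_swarm(handlers, num_workers=5)
print(f"{len(results)} files processed")
```

Giving each worker a disjoint set of files keeps the merge step trivial; overlapping edits are where a real coordinator has to do conflict resolution.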
Shell commands
Kimi can execute shell commands directly:
kimi
> Run the test suite and fix any failing tests
It runs npm test, reads the output, identifies failures, fixes the code, and re-runs until tests pass.
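The run, read, fix, re-run cycle amounts to a bounded retry loop around the test command. The sketch below assumes that structure; run_until_green and its max_fixes cap are hypothetical names, and the fixer is passed in as a callback because the actual code repair is done by the model.

```python
import subprocess
import sys
from typing import Callable


def run_until_green(
    test_cmd: list[str],
    attempt_fix: Callable[[str], None],
    max_fixes: int = 3,
) -> bool:
    """Run test_cmd, calling attempt_fix with the output after each failure.

    Returns True as soon as the command exits with status 0, or False if it
    still fails after max_fixes fix attempts.
    """
    for attempt in range(max_fixes + 1):
        proc = subprocess.run(test_cmd, capture_output=True, text=True)
        if proc.returncode == 0:
            return True
        if attempt < max_fixes:
            attempt_fix(proc.stdout + proc.stderr)
    return False


# Stand-in test command that always passes; in a JavaScript project this
# would be ["npm", "test"]. The fixer here only records the failure output.
failures: list[str] = []
passed = run_until_green([sys.executable, "-c", "print('ok')"], failures.append)
print("tests pass" if passed else "still failing")
```

Capping the number of fix attempts matters in practice: without it, a test that cannot be fixed automatically would keep consuming API quota.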
Web search
kimi
> Search for the latest React 19 migration guide and apply the changes to our project
Kimi can fetch web pages and documentation to inform its code changes, which is useful when working with new APIs or frameworks.
Configuration
Kimi CLI stores config in ~/.kimi/:
~/.kimi/
├── sessions/ # Auth sessions
├── config.json # Global settings
└── rules/ # Custom rules
Custom rules
Create project-specific rules in .kimi/rules/:
# .kimi/rules/style.md
- Use TypeScript strict mode
- Prefer functional components
- Always add JSDoc comments to exported functions
- Use Tailwind CSS for styling
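To check what a project's rules directory adds up to, the files can be gathered into a single block of text. The sketch below assumes rules are plain Markdown files under .kimi/rules/ as shown above; load_rules is a hypothetical helper, and how Kimi CLI itself reads and orders these files is not specified here.

```python
import tempfile
from pathlib import Path


def load_rules(project_root: Path) -> str:
    """Concatenate every Markdown file under .kimi/rules/, sorted by name."""
    rules_dir = project_root / ".kimi" / "rules"
    if not rules_dir.is_dir():
        return ""
    sections = []
    for path in sorted(rules_dir.glob("*.md")):
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"# {path.name}\n{body}")
    return "\n\n".join(sections)


# Demo against a throwaway project directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".kimi" / "rules").mkdir(parents=True)
    (root / ".kimi" / "rules" / "style.md").write_text(
        "- Use TypeScript strict mode\n- Prefer functional components\n",
        encoding="utf-8",
    )
    print(load_rules(root))
```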
Pricing
Kimi CLI itself is free and open-source. You pay for the model API:
| Plan | Price | Includes |
|---|---|---|
| Free tier | $0 | Limited requests/day |
| Kimi Code Lite | ~$3/month | Basic quota |
| Kimi Code Pro | ~$19/month | Higher quota + priority |
| Pay-as-you-go | $0.60/$2.50 per 1M tokens | No subscription needed |
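The pay-as-you-go row can be turned into a quick cost estimate. The sketch assumes $0.60 is the input rate and $2.50 the output rate per 1M tokens, which matches the input price quoted in the comparison table; the token counts in the example are made up.

```python
INPUT_USD_PER_MTOK = 0.60   # assumed to be the input-token rate
OUTPUT_USD_PER_MTOK = 2.50  # assumed to be the output-token rate


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated spend in USD for one session at pay-as-you-go rates."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical session: 2M tokens read, 300K tokens generated.
cost = estimate_cost(2_000_000, 300_000)
print(f"${cost:.2f}")  # prints $1.95
```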
Quota is allocated in 5-hour windows: each window allows 300 to 1,200 API calls, with a maximum concurrency of 30.
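A little arithmetic shows what these limits mean for pacing. The sketch uses only the figures quoted above (5-hour window, 300 to 1,200 calls, concurrency of 30); the function names are illustrative.

```python
WINDOW_HOURS = 5
MAX_CONCURRENCY = 30


def sustained_calls_per_minute(calls_per_window: int) -> float:
    """Average call rate that exactly uses up one window's quota."""
    return calls_per_window / (WINDOW_HOURS * 60)


def full_concurrency_rounds(calls_per_window: int) -> int:
    """Number of rounds of MAX_CONCURRENCY simultaneous calls a window allows."""
    return calls_per_window // MAX_CONCURRENCY


for quota in (300, 1200):
    rate = sustained_calls_per_minute(quota)
    rounds = full_concurrency_rounds(quota)
    print(f"{quota} calls/window: {rate:.0f} per minute sustained, {rounds} rounds at full concurrency")
```

At the low end of the range, ten rounds of 30 concurrent calls use up the whole window, so heavily parallel sessions can exhaust the quota long before the 5 hours are over.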
Kimi CLI vs Claude Code vs Codex CLI
| Feature | Kimi CLI | Claude Code | Codex CLI |
|---|---|---|---|
| Agent Swarm | ✅ Up to 100 parallel | ❌ Sequential | ❌ Sequential |
| Plan mode | ✅ Read-only planning | ✅ | ✅ |
| Web search | ✅ Built-in | ✅ | ✅ |
| Model flexibility | ✅ Any provider | ❌ Claude only | ❌ GPT only |
| Price | $0.60/1M input | $20/mo (Pro sub) | $20/mo (Plus sub) |
| Best model | Kimi K2.5 (1T MoE) | Claude Opus 4.6 | GPT-5.4 |
| Code quality | Very good | Best | Very good |
| Open source | ✅ | ❌ | ✅ |
Kimi CLI wins on flexibility and parallelism. Claude Code wins on raw code quality. Codex CLI wins on speed.
Tips for best results
- Use plan mode first for complex tasks: review the plan before letting Kimi execute
- Enable Agent Swarm for batch operations, where it's dramatically faster
- Set custom rules for your project's coding standards
- Use the --read flag for context-only files that shouldn't be edited
- Combine with Aider: use Kimi for planning, Aider for execution with different models
Troubleshooting
"Rate limit exceeded": The 5-hour quota system can be restrictive. Upgrade to Pro or use pay-as-you-go for heavy sessions.
"Model not available": Check your subscription tier. Some models require Pro membership.
Slow responses: Kimi's servers are primarily in Asia. Use OpenRouter for potentially lower latency from Western locations.
FAQ
Is Kimi CLI free?
Yes. The CLI itself is free and open-source, and Moonshot AI offers a free tier with daily request limits. For heavier usage, paid plans are available through Moonshot AI's platform.
Does Kimi CLI work offline?
No, Kimi CLI requires an internet connection to communicate with Moonshot AI's servers. All inference happens in the cloud, so you need connectivity to use it.
How does Kimi compare to Claude Code?
Kimi CLI offers comparable coding capabilities at a significantly lower price point, especially for routine tasks. Claude Code tends to perform better on complex architectural decisions, but Kimi excels at speed and cost efficiency.
Can I use Kimi with local models?
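Kimi CLI is designed around Moonshot AI's cloud models and has no dedicated local-model backend. Because it can be pointed at any OpenAI-compatible API, a local inference server that exposes such an endpoint may work. The Kimi K2.5 weights are released under a modified MIT license, so the model itself can be self-hosted.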
Related: Kimi K2.5 Complete Guide · Claude Code vs Codex CLI vs Gemini CLI · Aider Complete Guide