Apr 14, 2026

Last updated on Apr 21, 2026

Kimi CLI Complete Guide — Moonshot's Terminal AI Coding Agent

Kimi CLI is an open-source terminal coding agent from Moonshot AI, powered by the Kimi K2.5 model. It competes directly with Claude Code and Codex CLI, with one distinguishing feature: Agent Swarm, which coordinates up to 100 parallel sub-agents for faster execution.

Update (April 21, 2026): Kimi CLI now supports Kimi K2.6, which scales Agent Swarm to 300 sub-agents and scores 80.2% on SWE-Bench Verified. Moonshot recommends Kimi Code CLI as the primary agent framework for K2.6.

Installation

# Kimi CLI is published as a Python package; requires uv (https://docs.astral.sh/uv/)
uv tool install --python 3.13 kimi-cli

Or with Homebrew:

brew install kimi-cli

Authentication

Kimi CLI supports multiple auth methods:

Device auth (recommended)

kimi login --device-auth

This opens a browser window where you log in to your Moonshot account. Credentials are stored locally.

API key

export KIMI_API_KEY="your-api-key"
kimi

Using other providers

Kimi CLI isn’t locked to Kimi models. You can configure it to use Claude, GPT, or any OpenAI-compatible API:

# Use with OpenRouter
export OPENROUTER_API_KEY="your-key"
kimi --provider openrouter --model anthropic/claude-opus-4.6

See the OpenRouter guide for model options.

Core features

Chat mode

kimi
> Fix the authentication bug in src/auth.ts

Kimi reads your codebase, understands the context, and makes edits directly. Every change is shown as a diff before it is applied.

Plan mode

Plan mode is read-only — Kimi analyzes your codebase without making changes:

kimi --plan
> How should I refactor the database layer to support PostgreSQL?

It uses read-only tools (Glob, Grep, ReadFile) to explore the code, then writes a detailed plan for your approval. You can approve, reject, or request revisions before any code is touched.
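
To make the read-only exploration concrete, here is a minimal Python sketch of what Glob-, Grep-, and ReadFile-style tools do. The function names and signatures are hypothetical illustrations, not Kimi CLI's actual tool API; the point is only that none of them write to disk.

```python
import re
from pathlib import Path


def glob_files(root: str, pattern: str) -> list[str]:
    """Glob-style tool: list files under root matching pattern (read-only)."""
    return sorted(str(p) for p in Path(root).rglob(pattern) if p.is_file())


def read_file(path: str) -> str:
    """ReadFile-style tool: return the file's text without modifying it."""
    return Path(path).read_text(encoding="utf-8")


def grep_files(paths: list[str], regex: str) -> list[tuple[str, int, str]]:
    """Grep-style tool: return (path, line number, line) for each match."""
    compiled = re.compile(regex)
    hits = []
    for path in paths:
        for number, line in enumerate(read_file(path).splitlines(), start=1):
            if compiled.search(line):
                hits.append((path, number, line.strip()))
    return hits
```

For the database question above, a call such as grep_files(glob_files("src", "*.ts"), r"mysql") would list every line that mentions the current driver, which is the kind of evidence a plan can be built on.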

Agent Swarm

The killer feature. For parallelizable tasks:

kimi
> Refactor all 30 API route handlers to use the new middleware pattern

Kimi spawns multiple sub-agents that work in parallel, each handling a subset of files. The coordinator merges results and resolves conflicts. On large, parallelizable tasks this can be up to 4.5x faster than sequential processing.
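
The fan-out and merge pattern described here can be sketched in a few lines of Python. This is an illustration of the general coordinator/worker idea, not Kimi's internal implementation: sub_agent is a stand-in that only labels each file, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(items: list[str], batch_count: int) -> list[list[str]]:
    """Deal items round-robin into at most batch_count non-empty batches."""
    batches = [items[i::batch_count] for i in range(batch_count)]
    return [batch for batch in batches if batch]


def sub_agent(batch: list[str]) -> dict[str, str]:
    """Stand-in for one sub-agent: pretend to refactor each file in its batch."""
    return {path: f"refactored {path}" for path in batch}


def coordinator(files: list[str], max_agents: int = 4) -> dict[str, str]:
    """Fan work out to parallel workers, then merge their results."""
    batches = split_into_batches(files, max_agents)
    merged: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        for result in pool.map(sub_agent, batches):
            # Batches are disjoint here, so merging never conflicts; a real
            # coordinator would need a strategy for overlapping edits.
            merged.update(result)
    return merged


if __name__ == "__main__":
    handlers = [f"routes/handler_{n}.ts" for n in range(30)]
    results = coordinator(handlers, max_agents=5)
    print(len(results))
```

Splitting by file keeps each worker's edits disjoint, which is why batch operations over many independent files are the best fit for this style of parallelism.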

Shell commands

Kimi can execute shell commands directly:

kimi
> Run the test suite and fix any failing tests

It runs npm test, reads the output, identifies failures, fixes the code, and re-runs until tests pass.
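
The run, fix, re-run cycle is a simple loop. The Python sketch below shows its shape under stated assumptions: run_tests and apply_fix are hypothetical callables supplied by the caller, and the max_rounds cap is an addition of this sketch so that the loop cannot run forever.

```python
from typing import Callable


def fix_until_green(
    run_tests: Callable[[], list[str]],
    apply_fix: Callable[[str], None],
    max_rounds: int = 5,
) -> bool:
    """Run tests, fix each reported failure, and repeat until green or out of rounds.

    run_tests returns the names of failing tests (an empty list means all pass).
    apply_fix attempts a fix for one failing test.
    """
    for _ in range(max_rounds):
        failures = run_tests()
        if not failures:
            return True
        for name in failures:
            apply_fix(name)
    # One last check after the final round of fixes.
    return not run_tests()
```

A bounded loop is worth keeping in mind when delegating this kind of task: if a failure cannot be fixed automatically, you want the agent to stop and report rather than keep spending quota.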

Web search

kimi
> Search for the latest React 19 migration guide and apply the changes to our project

Kimi can fetch web pages and documentation to inform its code changes — useful when working with new APIs or frameworks.

Configuration

Kimi CLI stores config in ~/.kimi/:

~/.kimi/
├── sessions/       # Auth sessions
├── config.json     # Global settings
└── rules/          # Custom rules

Custom rules

Create project-specific rules in .kimi/rules/:

# .kimi/rules/style.md
- Use TypeScript strict mode
- Prefer functional components
- Always add JSDoc comments to exported functions
- Use Tailwind CSS for styling
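
If you bootstrap many projects, the rules file can be generated by a script. The following is a minimal Python sketch that writes the rules listed above to .kimi/rules/style.md; the helper name write_style_rules is hypothetical, and the file layout is taken from this guide.

```python
from pathlib import Path

RULES = """\
- Use TypeScript strict mode
- Prefer functional components
- Always add JSDoc comments to exported functions
- Use Tailwind CSS for styling
"""


def write_style_rules(project_root: str) -> Path:
    """Create .kimi/rules/style.md under project_root and return its path."""
    rules_dir = Path(project_root) / ".kimi" / "rules"
    rules_dir.mkdir(parents=True, exist_ok=True)
    rules_file = rules_dir / "style.md"
    rules_file.write_text(RULES, encoding="utf-8")
    return rules_file
```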

Pricing

Kimi CLI itself is free and open-source. You pay for the model API:

Plan	Price	Includes
Free tier	$0	Limited requests/day
Kimi Code Lite	~$3/month	Basic quota
Kimi Code Pro	~$19/month	Higher quota + priority
Pay-as-you-go	$0.60/$2.50 per 1M tokens	No subscription needed
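
To compare pay-as-you-go against a subscription, a rough cost estimate helps. The Python sketch below assumes the two pay-as-you-go figures are the input and output rates per 1M tokens, in that order (the comparison table in this guide labels $0.60 as the input price; the output rate is an inference).

```python
INPUT_USD_PER_MILLION = 0.60   # from the pricing table (input tokens)
OUTPUT_USD_PER_MILLION = 2.50  # assumed to be the output-token rate


def pay_as_you_go_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate pay-as-you-go cost in USD for a given token volume."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    )


if __name__ == "__main__":
    monthly = pay_as_you_go_cost(input_tokens=20_000_000, output_tokens=4_000_000)
    print(f"${monthly:.2f}")
```

Under these assumptions, 20M input tokens and 4M output tokens in a month cost about $22, slightly more than the ~$19 Pro plan. Subscription quotas are counted differently, so treat this as a rough guide only.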

Quotas are counted in 5-hour windows: each window allows 300-1,200 API calls, with a maximum concurrency of 30.
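
A little arithmetic shows what these limits mean in practice. This Python sketch uses only the figures quoted above; the function names are illustrative.

```python
WINDOW_MINUTES = 5 * 60  # one 5-hour quota window


def sustainable_calls_per_minute(calls_per_window: int) -> float:
    """Average call rate that spreads the quota evenly across the window."""
    return calls_per_window / WINDOW_MINUTES


def minutes_until_exhausted(calls_per_window: int, calls_per_minute: float) -> float:
    """How long the quota lasts at a steady call rate."""
    return calls_per_window / calls_per_minute
```

At the low end, 300 calls per window averages out to one call per minute; at the high end, 1,200 calls allows four per minute. A burst of 30 calls per minute would drain a 300-call window in 10 minutes.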

Kimi CLI vs Claude Code vs Codex CLI

Feature	Kimi CLI	Claude Code	Codex CLI
Agent Swarm	✅ Up to 100 parallel	❌ Sequential	❌ Sequential
Plan mode	✅ Read-only planning	✅	❌
Web search	✅ Built-in	✅ Built-in	❌
Model flexibility	✅ Any provider	❌ Claude only	❌ GPT only
Price	$0.60/1M input	$20/mo (Pro sub)	$20/mo (Plus sub)
Best model	Kimi K2.5 (1T MoE)	Claude Opus 4.6	GPT-5.4
Code quality	Very good	Best	Very good
Open source	✅	❌	✅

Kimi CLI wins on flexibility and parallelism. Claude Code wins on raw code quality. Codex CLI wins on speed.

Tips for best results

- Use plan mode first for complex tasks: review the plan before letting Kimi execute.
- Enable Agent Swarm for batch operations; it is dramatically faster.
- Set custom rules for your project’s coding standards.
- Use the --read flag for context-only files that shouldn’t be edited.
- Combine with Aider: use Kimi for planning and Aider for execution with different models.

Troubleshooting

“Rate limit exceeded”: The 5-hour quota system can be restrictive. Upgrade to Pro or use pay-as-you-go for heavy sessions.
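
For scripts that call a model API directly, short-lived throttling (for example, hitting the concurrency cap) is usually handled with retries and exponential backoff. The Python sketch below shows the pattern; RateLimitError is a hypothetical exception standing in for whatever error your client raises, and backoff will not help once the 5-hour window itself is exhausted.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error type standing in for a rate-limit response."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry call with exponentially growing delays on RateLimitError."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("max_attempts must be at least 1")
```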

“Model not available”: Check your subscription tier. Some models require Pro membership.

Slow responses: Kimi’s servers are primarily in Asia. Use OpenRouter for potentially lower latency from Western locations.

FAQ

Is Kimi CLI free?

Yes. The CLI itself is free and open-source, and Moonshot AI offers a free tier with a limited number of requests per day. For heavier usage, paid plans and pay-as-you-go pricing are available through Moonshot AI’s platform.

Does Kimi CLI work offline?

No, Kimi CLI requires an internet connection to communicate with Moonshot AI’s servers. All inference happens in the cloud, so you need connectivity to use it.

How does Kimi compare to Claude Code?

Kimi CLI offers comparable coding capabilities at a significantly lower price point, especially for routine tasks. Claude Code tends to perform better on complex architectural decisions, while Kimi’s strengths are cost efficiency and parallel execution through Agent Swarm.

Can I use Kimi with local models?

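Kimi CLI is built around Moonshot AI’s cloud models and does not ship a local inference backend. Because it can be pointed at any OpenAI-compatible API (see “Using other providers”), a local server that exposes such an endpoint should work in principle. The Kimi K2.5 weights are openly available under a modified MIT license, but a 1T-parameter MoE model needs far more hardware than a typical workstation provides.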