Yi API Setup Guide: Yi-Lightning, Yi-Coder, and Yi-34B (2026)
01.AI offers API access to its Yi model family. The flagship Yi-Lightning ranked 6th on Chatbot Arena. Here's how to set it up.
Available via API
| Model | Access | Best for |
|---|---|---|
| Yi-Lightning | 01.AI platform | Best quality, flagship |
| Yi-Large | 01.AI platform | Large context tasks |
| Yi-Coder 9B | Local or HuggingFace | Coding (free locally) |
| Yi-34B | Local or HuggingFace | General purpose (free locally) |
Setup via 01.AI platform
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.01.ai/v1",
    api_key="your-01ai-key",
)

response = client.chat.completions.create(
    model="yi-lightning",
    messages=[{"role": "user", "content": "Write a Python web scraper with error handling"}],
)
print(response.choices[0].message.content)
```
Sign up at platform.01.ai for an API key.
Setup via OpenRouter
Some Yi models are available on OpenRouter:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your-openrouter-key",
)

response = client.chat.completions.create(
    model="01-ai/yi-large",
    messages=[{"role": "user", "content": "Explain Kubernetes networking"}],
)
print(response.choices[0].message.content)
```
With coding tools
Aider
```bash
# Via 01.AI API
export OPENAI_API_BASE=https://api.01.ai/v1
export OPENAI_API_KEY=your-01ai-key
aider --model yi-lightning

# Via local (free)
aider --model ollama/yi-coder:9b
```
Continue.dev
```json
{
  "models": [{
    "title": "Yi-Lightning",
    "provider": "openai",
    "model": "yi-lightning",
    "apiBase": "https://api.01.ai/v1",
    "apiKey": "your-key"
  }]
}
```
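For the free local route, Continue.dev can point at Ollama instead. A sketch of the equivalent config, assuming you have already pulled `yi-coder:9b` with Ollama:

```json
{
  "models": [{
    "title": "Yi-Coder (local)",
    "provider": "ollama",
    "model": "yi-coder:9b"
  }]
}
```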
Pricing
| Model | Input | Output |
|---|---|---|
| Yi-Lightning | ~$0.99/M tokens | ~$0.99/M tokens |
| Yi-Large | ~$3/M tokens | ~$3/M tokens |
| Yi-Coder 9B (local) | Free | Free |
| Yi-34B (local) | Free | Free |
For most developers, running Yi-Coder locally is the best value. Use Yi-Lightning via API only when you need the flagship quality.
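To sanity-check a budget, the per-million rates above can be turned into a quick per-request estimate. A minimal sketch using the table's approximate prices (adjust the numbers to current pricing):

```python
# Approximate per-million-token rates from the pricing table above.
PRICES = {  # model: (input $/M tokens, output $/M tokens)
    "yi-lightning": (0.99, 0.99),
    "yi-large": (3.00, 3.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the approximate USD cost of one request."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A 2,000-token prompt with a 1,000-token reply on Yi-Lightning:
print(f"${estimate_cost('yi-lightning', 2000, 1000):.4f}")  # prints $0.0030
```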
Yi API vs alternatives
| Provider | Best model | Input | Output | Unique feature |
|---|---|---|---|---|
| 01.AI | Yi-Lightning | ~$0.99/M | ~$0.99/M | Strong Chinese + coding |
| Z.ai | GLM-5.1 | $18/mo flat | Included | Claude Code compatible |
| OpenRouter | Qwen 3.6 | Free (preview) | Free | 300+ models, one key |
| DeepSeek | DeepSeek V3 | $0.27/M | $1.10/M | Cheapest paid |
| Anthropic | Claude Sonnet | $3/M | $15/M | Best coding quality |
Yi-Lightning is competitive on price, but the 01.AI platform is less mature than OpenRouter or Anthropic. For most developers, running Yi-Coder locally and using Qwen 3.6 free on OpenRouter covers most needs at zero cost.
Streaming responses
For real-time output in chat interfaces:
```python
stream = client.chat.completions.create(
    model="yi-lightning",
    messages=[{"role": "user", "content": "Write a Python web scraper"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
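In a chat UI you usually want both the live output and the final string. A small helper can assemble the chunks as they print; the fake chunks below mimic the SDK's object shape so this sketch runs without an API key:

```python
from types import SimpleNamespace

def collect_stream(stream) -> str:
    """Print streamed chunks as they arrive and return the assembled reply."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk's content is typically None
            print(delta, end="")
            parts.append(delta)
    return "".join(parts)

# Offline demo: objects shaped like the SDK's streaming chunks.
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Hello", ", ", "world", None]
]
full_reply = collect_stream(fake_chunks)  # prints "Hello, world"
```

With a real call, pass the `stream` object from the snippet above directly to `collect_stream`.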
Error handling
```python
import time

from openai import OpenAI, APIError, RateLimitError

client = OpenAI(base_url="https://api.01.ai/v1", api_key="your-key")

def call_yi(prompt, retries=3):
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="yi-lightning",
                messages=[{"role": "user", "content": prompt}],
                timeout=30,
            )
            return response.choices[0].message.content
        except RateLimitError:
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
        except APIError:
            if attempt == retries - 1:
                raise
            time.sleep(1)
    return None
```
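The retry loop above doubles its wait on each rate-limit hit (1s, 2s, 4s). If many workers are rate-limited together, they will also retry together; adding random jitter spreads them out. A small sketch of a jittered delay schedule (the 30s cap is an arbitrary choice):

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2^attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

# Drop-in replacement for time.sleep(2 ** attempt) in the retry loop:
for attempt in range(4):
    print(f"attempt {attempt}: wait up to {min(30.0, 2.0 ** attempt):.0f}s")
```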
When to use API vs local
| Scenario | Use API (Yi-Lightning) | Use local (Yi-Coder) |
|---|---|---|
| Need best quality | ✓ | |
| Need privacy | | ✓ |
| Budget-conscious | | ✓ (free) |
| No GPU/weak hardware | ✓ | |
| Offline work | | ✓ |
| Chinese language tasks | ✓ (best quality) | Good |
| Code autocomplete | | ✓ (1.5B, instant) |
The practical setup: Yi-Coder 9B locally for daily coding (free, private), Yi-Lightning API for complex tasks that need flagship quality.
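That split can be captured in a tiny routing helper. The Ollama URL, model names, and the "needs flagship" flag are illustrative; plug the returned values into the OpenAI-compatible client shown earlier:

```python
# Hypothetical routing table: local Ollama for daily coding,
# 01.AI's API for flagship-quality tasks.
BACKENDS = {
    "local":    {"base_url": "http://localhost:11434/v1", "model": "yi-coder:9b"},
    "flagship": {"base_url": "https://api.01.ai/v1",      "model": "yi-lightning"},
}

def pick_backend(needs_flagship: bool) -> dict:
    """Route flagship-quality work to the API, everything else to local Yi-Coder."""
    return BACKENDS["flagship" if needs_flagship else "local"]

print(pick_backend(False)["model"])  # prints yi-coder:9b
```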
FAQ
Is the Yi API free?
The Yi-Lightning and Yi-Large APIs are paid (around $0.99–$3 per million tokens), but Yi-Coder 9B and Yi-34B can be run locally for free using Ollama or other local inference tools. For most developers, the local option covers daily needs at zero cost.
How do I get a Yi API key?
Sign up at platform.01.ai, create an account, and generate an API key from the dashboard. The key works with their OpenAI-compatible endpoint at https://api.01.ai/v1.
Which Yi model is best?
Yi-Lightning is the flagship with the best quality (ranked 6th on Chatbot Arena), while Yi-Coder 9B is the best for coding tasks and runs free locally. Use Yi-Lightning via API for complex reasoning and Yi-Coder locally for daily development work.
Can I use Yi with OpenAI-compatible clients?
Yes, the Yi API uses an OpenAI-compatible interface. You can use the standard OpenAI Python SDK by pointing base_url to https://api.01.ai/v1; any tool that supports custom OpenAI endpoints (Aider, Continue.dev, etc.) works out of the box.
Related: What is Yi? · How to Run Yi Locally · Yi-Coder Guide · OpenRouter Guide · Z.ai API Guide