Qwen 3.6 Plus is currently available through two channels: OpenRouter (free preview) and Aliyun BaiLian API (production). Here’s how to set up both, plus integration with popular coding tools.
Option 1: OpenRouter (free, 30 seconds)
The fastest way to start. No credit card needed.
- Sign up at openrouter.ai
- Go to Keys > Create Key
- Use this code:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-your-key-here",
)

response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Write a Python function to parse CSV files"}],
    max_tokens=4096,
)
print(response.choices[0].message.content)
```
Limits: Free tier has rate limits. For heavy usage, add credits to your OpenRouter account.
Option 2: Aliyun BaiLian API (production)
For production use with SLAs:
- Create an Aliyun account at aliyun.com
- Enable the BaiLian (DashScope) service
- Generate an API key in the console
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="sk-your-aliyun-key",
)

response = client.chat.completions.create(
    model="qwen-plus-2026-0330",
    messages=[{"role": "user", "content": "Review this code for bugs"}],
)
```
Using with coding tools
Aider
```bash
# Via OpenRouter (free)
export OPENROUTER_API_KEY=sk-or-v1-your-key
aider --model openrouter/qwen/qwen3.6-plus:free

# Via Aliyun (the openai/ prefix tells Aider to use the OpenAI-compatible endpoint)
export OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
export OPENAI_API_KEY=sk-your-aliyun-key
aider --model openai/qwen-plus-2026-0330
```
See our Aider Complete Guide for advanced configuration.
Continue.dev (VS Code)
Add to your `.continue/config.json`:

```json
{
  "models": [{
    "title": "Qwen 3.6 Plus",
    "provider": "openai",
    "model": "qwen/qwen3.6-plus:free",
    "apiBase": "https://openrouter.ai/api/v1",
    "apiKey": "sk-or-v1-your-key"
  }]
}
```
See our Continue.dev Guide for full setup.
OpenCode
```bash
export OPENAI_API_BASE=https://openrouter.ai/api/v1
export OPENAI_API_KEY=sk-or-v1-your-key
opencode --provider openai --model qwen/qwen3.6-plus:free
```
See our OpenCode Guide for more options.
Using the 1M context window
The 1M context window is Qwen 3.6 Plus’s killer feature. Here’s how to use it effectively:
````python
# Feed an entire codebase for review
import os

def collect_code(directory, extensions=('.py', '.js', '.ts')):
    code = []
    for root, dirs, files in os.walk(directory):
        for f in files:
            if any(f.endswith(ext) for ext in extensions):
                path = os.path.join(root, f)
                # errors="ignore" skips undecodable bytes instead of crashing
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    code.append(f"### {path}\n```\n{fh.read()}\n```")
    return "\n\n".join(code)

codebase = collect_code("./src")
response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[
        {"role": "system", "content": "You are a senior code reviewer."},
        {"role": "user", "content": f"Review this codebase for security issues:\n\n{codebase}"}
    ],
    max_tokens=65536,
)
````
Tip: Even with 1M context, be selective. Sending only relevant files is faster and produces better results than dumping everything. See our context engineering guide for best practices.
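That selectivity can be automated with a simple pre-flight check: estimate the prompt's token count and trim files until it fits your budget. The 4-characters-per-token heuristic below is a rough approximation for English text and code, not the model's real tokenizer, and the helper names are illustrative:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text and code.
    # For exact counts, use the provider's tokenizer.
    return len(text) // 4

def fit_to_budget(files: dict[str, str], budget_tokens: int = 900_000) -> dict[str, str]:
    """Keep the smallest files first until the token budget is reached."""
    selected, used = {}, 0
    for path, content in sorted(files.items(), key=lambda kv: len(kv[1])):
        cost = estimate_tokens(content)
        if used + cost > budget_tokens:
            break
        selected[path] = content
        used += cost
    return selected

# A generated build artifact blows the budget and gets dropped:
files = {"a.py": "x" * 400, "bundle.js": "y" * 4_000_000, "c.py": "z" * 40}
print(sorted(fit_to_budget(files, budget_tokens=1_000)))  # → ['a.py', 'c.py']
```

Smarter selection (by relevance to the question rather than size) works the same way; only the sort key changes.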
Using preserve_thinking for debugging
Qwen 3.6 Plus has a preserve_thinking parameter that shows the model’s reasoning:
```python
response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Why is this function returning None?"}],
    extra_body={"preserve_thinking": True},
)
# Response includes the model's step-by-step reasoning
```
This is invaluable for debugging agent workflows — you can see exactly why the model chose a specific approach.
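Depending on the gateway, the reasoning may come back as a separate field on the message rather than inline in the content. The field names `reasoning` and `reasoning_content` below are assumptions (check your provider's response schema); a defensive accessor avoids crashes when the field is absent:

```python
def extract_reasoning(message):
    """Return the model's reasoning text if the response carries one, else None.

    The field names checked here ('reasoning', 'reasoning_content') are
    assumptions -- gateways differ, so verify against your provider's schema.
    Accepts both SDK message objects and plain dicts.
    """
    for field in ("reasoning", "reasoning_content"):
        value = getattr(message, field, None) or (
            message.get(field) if isinstance(message, dict) else None
        )
        if value:
            return value
    return None

# Works with dict-shaped responses too:
print(extract_reasoning({"reasoning": "step 1: check the early-return path"}))
```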
Streaming
For real-time output in chat interfaces:
```python
stream = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Explain async/await in Python"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Rate limits and pricing
| Provider | Rate limit | Price | Best for |
|---|---|---|---|
| OpenRouter (free) | ~20 req/min | $0 | Testing, personal projects |
| OpenRouter (paid) | Higher | Pay per token | Production with flexibility |
| Aliyun BaiLian | Configurable | Standard Qwen Plus rates | Production with SLA |
The free OpenRouter tier is generous enough for personal coding use. For team or production use, either add credits to OpenRouter or use the Aliyun API directly.
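When weighing the paid options, a per-request cost estimate helps. The arithmetic is just tokens divided by a million times the rate; the rates in the example are hypothetical placeholders, not Qwen Plus's actual pricing (check the provider's pricing page):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  in_per_m: float, out_per_m: float) -> float:
    """Cost in USD given per-million-token rates (pass your provider's real rates)."""
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# Hypothetical rates for illustration only: a 200k-token code review
# with a 4k-token response.
print(round(estimate_cost(200_000, 4_000, in_per_m=0.40, out_per_m=1.20), 4))  # → 0.0848
```

Note how large-context requests are dominated by input cost, which is another argument for the selective-context tip above.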
Can I run it locally?
Not yet. Qwen 3.6 Plus is API-only. For local inference, use Qwen 3.5 (available in sizes from 0.6B to 32B via Ollama).
When open weights are released, we’ll update this guide with local setup instructions.
Error handling
API calls fail. Handle it gracefully:
```python
import time
from openai import OpenAI, APIError, RateLimitError

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-your-key",
)

def call_qwen(prompt, retries=3):
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="qwen/qwen3.6-plus:free",
                messages=[{"role": "user", "content": prompt}],
                max_tokens=4096,
                timeout=60,
            )
            return response.choices[0].message.content
        except RateLimitError:
            wait = 2 ** attempt
            print(f"Rate limited, waiting {wait}s...")
            time.sleep(wait)
        except APIError:
            if attempt == retries - 1:
                raise
            time.sleep(1)
    return None
```
Common errors on the free tier:
- 429 Rate Limited — wait and retry, or add credits
- 503 Service Unavailable — model is overloaded, try again in a few minutes
- Timeout — long context requests can take 30-60s, increase your timeout
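That list folds naturally into a small retry policy: 429 and 503 are worth retrying with backoff, other client errors are not. A sketch (the backoff parameters are arbitrary defaults, not provider recommendations):

```python
import random

def retry_delay(status: int, attempt: int, base: float = 2.0, cap: float = 60.0):
    """Return seconds to wait before retrying, or None to give up immediately.

    Per the error list above, only 429 (rate limited) and 503 (overloaded)
    are retryable; other statuses indicate a request that won't succeed
    on retry. Uses exponential backoff with full jitter, capped at `cap`.
    """
    if status not in (429, 503):
        return None
    return random.uniform(0, min(cap, base * 2 ** attempt))

print(retry_delay(404, 0))  # → None: a bad request won't fix itself
```

Jitter matters on shared free tiers: if every client backs off by exactly 2, 4, 8 seconds, their retries collide in synchronized waves.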
Comparing API options
| Feature | OpenRouter Free | OpenRouter Paid | Aliyun BaiLian |
|---|---|---|---|
| Price | $0 | Per token | Per token |
| Rate limit | ~20 req/min | Higher | Configurable |
| SLA | None | None | Yes |
| Regions | US/EU | US/EU | China (+ global) |
| Best for | Testing, personal | Production (flexible) | Production (SLA) |
| Other models | 200+ models | 200+ models | Qwen family only |
If you’re already using OpenRouter for other models, stick with it. You get one API key for Qwen 3.6, Claude, GPT, DeepSeek, and everything else.
Building a simple coding assistant
Here’s a minimal but functional coding assistant using Qwen 3.6 Plus:
````python
import sys
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-your-key",
)

def code_review(filepath):
    with open(filepath) as f:
        code = f.read()
    stream = client.chat.completions.create(
        model="qwen/qwen3.6-plus:free",
        messages=[
            {"role": "system", "content": "Review this code. Focus on bugs, security issues, and performance. Be concise."},
            {"role": "user", "content": f"```\n{code}\n```"}
        ],
        max_tokens=4096,
        stream=True,
    )
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
    print()  # final newline

if __name__ == "__main__":
    code_review(sys.argv[1])
````
Usage: `python review.py src/main.py`
Free, streaming, and able to use the 1M context window if your file is large. Before sending proprietary code, check OpenRouter's data policy for the free endpoint — free routes may log or train on prompts depending on your privacy settings.
Related: Qwen 3.6 Complete Guide · Qwen 3.6 vs 3.5 · How to Run Qwen 3.5 Locally · OpenRouter Complete Guide · Aider Complete Guide