
How to Use Qwen 3.6 Plus API — OpenRouter, Aliyun, and Coding Tools Setup


Qwen 3.6 Plus is currently available through two channels: OpenRouter (free preview) and Aliyun BaiLian API (production). Here’s how to set up both, plus integration with popular coding tools.

Option 1: OpenRouter (free, 30 seconds)

The fastest way to start. No credit card needed.

  1. Sign up at openrouter.ai
  2. Go to Keys > Create Key
  3. Use this code:
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-your-key-here",
)

response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Write a Python function to parse CSV files"}],
    max_tokens=4096,
)
print(response.choices[0].message.content)

Limits: Free tier has rate limits. For heavy usage, add credits to your OpenRouter account.
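If you are scripting against the free tier, it helps to throttle on the client side rather than waiting for 429s. The exact limit is not published; the ~20 requests/minute figure used here is approximate. A minimal sliding-window throttle sketch (the clock and sleep functions are injectable so it can be tested without real waiting):

```python
import time
from collections import deque

class RequestThrottle:
    """Client-side sliding-window throttle: at most max_calls
    requests per window seconds (here matching the free tier's
    approximate ~20 req/min limit)."""

    def __init__(self, max_calls=20, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock  # injectable for testing
        self.sleep = sleep
        self.calls = deque()

    def wait(self):
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest call ages out of the window.
            self.sleep(self.window - (now - self.calls[0]))
            now = self.clock()
        self.calls.append(now)
```

Call `throttle.wait()` immediately before each `client.chat.completions.create(...)`; the throttle blocks only when you would otherwise exceed the limit.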

Option 2: Aliyun BaiLian API (production)

For production use with SLAs:

  1. Create an Aliyun account at aliyun.com
  2. Enable the BaiLian (DashScope) service
  3. Generate an API key in the console
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="sk-your-aliyun-key",
)

response = client.chat.completions.create(
    model="qwen-plus-2026-0330",
    messages=[{"role": "user", "content": "Review this code for bugs"}],
)
print(response.choices[0].message.content)

Using with coding tools

Aider

# Via OpenRouter (free)
export OPENROUTER_API_KEY=sk-or-v1-your-key
aider --model openrouter/qwen/qwen3.6-plus:free

# Via Aliyun
export OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
export OPENAI_API_KEY=sk-your-aliyun-key
aider --model qwen-plus-2026-0330

See our Aider Complete Guide for advanced configuration.

Continue.dev (VS Code)

Add to your .continue/config.json:

{
  "models": [{
    "title": "Qwen 3.6 Plus",
    "provider": "openai",
    "model": "qwen/qwen3.6-plus:free",
    "apiBase": "https://openrouter.ai/api/v1",
    "apiKey": "sk-or-v1-your-key"
  }]
}

See our Continue.dev Guide for full setup.

OpenCode

export OPENAI_API_BASE=https://openrouter.ai/api/v1
export OPENAI_API_KEY=sk-or-v1-your-key
opencode --provider openai --model qwen/qwen3.6-plus:free

See our OpenCode Guide for more options.

Using the 1M context window

The 1M context window is Qwen 3.6 Plus’s killer feature. Here’s how to use it effectively:

# Feed an entire codebase for review
import os

def collect_code(directory, extensions=(".py", ".js", ".ts")):
    # Tuple default avoids the mutable-default-argument pitfall,
    # and str.endswith accepts a tuple directly.
    code = []
    for root, _dirs, files in os.walk(directory):
        for f in files:
            if f.endswith(extensions):
                path = os.path.join(root, f)
                # Some files may not be valid UTF-8; replace rather than crash.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    code.append(f"### {path}\n```\n{fh.read()}\n```")
    return "\n\n".join(code)

codebase = collect_code("./src")

response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[
        {"role": "system", "content": "You are a senior code reviewer."},
        {"role": "user", "content": f"Review this codebase for security issues:\n\n{codebase}"}
    ],
    max_tokens=65536,
)

Tip: Even with 1M context, be selective. Sending only relevant files is faster and produces better results than dumping everything. See our context engineering guide for best practices.
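A cheap way to stay selective is a pre-flight size check plus a character budget. The ~4 characters-per-token figure below is a crude heuristic for English text and code, not the model's real tokenizer, so treat the numbers as rough:

```python
def fits_context(text, max_tokens=1_000_000, chars_per_token=4):
    """Rough pre-flight check: ~4 chars/token is a heuristic only;
    the actual tokenizer may count differently."""
    return len(text) <= max_tokens * chars_per_token

def select_files(files, budget_chars=400_000):
    """Greedily keep (path, content) pairs until the character
    budget is spent. Order the input by relevance first so the
    most important files survive the cut."""
    kept, used = [], 0
    for path, content in files:
        if used + len(content) > budget_chars:
            continue
        kept.append((path, content))
        used += len(content)
    return kept
```

Sort `files` by relevance (e.g. the modules your question actually touches) before calling `select_files`, then join the survivors into the prompt as in `collect_code` above.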

Using preserve_thinking for debugging

Qwen 3.6 Plus has a preserve_thinking parameter that shows the model’s reasoning:

response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Why is this function returning None?"}],
    extra_body={"preserve_thinking": True},
)
# Response includes the model's step-by-step reasoning

This is invaluable for debugging agent workflows — you can see exactly why the model chose a specific approach.
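Where the reasoning text lands in the response object is not documented here, and OpenAI-compatible providers differ: some expose a `reasoning` attribute on the message, others `reasoning_content`. A defensive accessor (the field names are assumptions, not confirmed for this model) avoids hard-coding the wrong one:

```python
def get_reasoning(message):
    """Return the reasoning text from a chat-completion message,
    trying the field names OpenAI-compatible providers commonly
    use. These names are assumptions; inspect a real response to
    confirm which one your provider sets."""
    for attr in ("reasoning", "reasoning_content"):
        value = getattr(message, attr, None)
        if value:
            return value
    return None
```

Usage: `print(get_reasoning(response.choices[0].message))` after a call made with `extra_body={"preserve_thinking": True}`.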

Streaming

For real-time output in chat interfaces:

stream = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Explain async/await in Python"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
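Often you want both live output and the assembled text (to save, post-process, or feed into a follow-up message). A small helper that does both, written against the chunk shape the loop above consumes:

```python
def collect_stream(stream):
    """Print chunks as they arrive and return the full assembled
    text. Works with any iterable of chat-completion chunks whose
    shape matches chunk.choices[0].delta.content."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)
```

Usage: `full_text = collect_stream(stream)` in place of the bare `for` loop.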

Rate limits and pricing

Provider           Rate limit    Price                     Best for
OpenRouter (free)  ~20 req/min   $0                        Testing, personal projects
OpenRouter (paid)  Higher        Pay per token             Production with flexibility
Aliyun BaiLian     Configurable  Standard Qwen Plus rates  Production with SLA

The free OpenRouter tier is generous enough for personal coding use. For team or production use, either add credits to OpenRouter or use the Aliyun API directly.

Can I run it locally?

Not yet. Qwen 3.6 Plus is API-only. For local inference, use Qwen 3.5 (available in sizes from 0.6B to 32B via Ollama).

When open weights are released, we’ll update this guide with local setup instructions.

Error handling

API calls fail. Handle it gracefully:

import time
from openai import OpenAI, APIError, RateLimitError

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-your-key",
)

def call_qwen(prompt, retries=3):
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="qwen/qwen3.6-plus:free",
                messages=[{"role": "user", "content": prompt}],
                max_tokens=4096,
                timeout=60,
            )
            return response.choices[0].message.content
        except RateLimitError:
            wait = 2 ** attempt
            print(f"Rate limited, waiting {wait}s...")
            time.sleep(wait)
        except APIError:
            if attempt == retries - 1:
                raise
            time.sleep(1)
    return None

Common errors on the free tier:

  • 429 Rate Limited — wait and retry, or add credits
  • 503 Service Unavailable — model is overloaded, try again in a few minutes
  • Timeout — long context requests can take 30-60s, increase your timeout
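The retry loop above uses plain exponential backoff; if many clients retry in lockstep after a 503, they all hammer the API at the same instants. Adding full jitter (a standard refinement, not something this API requires) spreads retries out:

```python
import random

def backoff_delay(attempt, base=2.0, cap=60.0):
    """Exponential backoff with full jitter: a random wait in
    [0, min(cap, base**attempt)]. The cap keeps late retries
    from waiting absurdly long."""
    return random.uniform(0, min(cap, base ** attempt))
```

Replace `wait = 2 ** attempt` in `call_qwen` with `wait = backoff_delay(attempt)`.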

Comparing API options

Feature       OpenRouter Free    OpenRouter Paid        Aliyun BaiLian
Price         $0                 Per token              Per token
Rate limit    ~20 req/min        Higher                 Configurable
SLA           None               None                   Yes
Regions       US/EU              US/EU                  China (+ global)
Best for      Testing, personal  Production (flexible)  Production (SLA)
Other models  200+ models        200+ models            Qwen family only

If you’re already using OpenRouter for other models, stick with it. You get one API key for Qwen 3.6, Claude, GPT, DeepSeek, and everything else.

Building a simple coding assistant

Here’s a minimal but functional coding assistant using Qwen 3.6 Plus:

import sys
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-your-key",
)

def code_review(filepath):
    with open(filepath) as f:
        code = f.read()
    
    stream = client.chat.completions.create(
        model="qwen/qwen3.6-plus:free",
        messages=[
            {"role": "system", "content": "Review this code. Focus on bugs, security issues, and performance. Be concise."},
            {"role": "user", "content": f"```\n{code}\n```"}
        ],
        max_tokens=4096,
        stream=True,
    )
    
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")

if __name__ == "__main__":
    code_review(sys.argv[1])

Usage: python review.py src/main.py

Free, streams output as it arrives, and can use the 1M context window if your file is large. If the code is sensitive, check OpenRouter's current data policy for the free tier before sending it.

Related: Qwen 3.6 Complete Guide · Qwen 3.6 vs 3.5 · How to Run Qwen 3.5 Locally · OpenRouter Complete Guide · Aider Complete Guide