Qwen 3.6 Plus is currently available through two channels: OpenRouter (free preview) and Aliyun BaiLian API (production). Here’s how to set up both, plus integration with popular coding tools.
Option 1: OpenRouter (free, 30 seconds)
The fastest way to start. No credit card needed.
- Sign up at openrouter.ai
- Go to Keys > Create Key
- Use this code:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-your-key-here",
)

response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Write a Python function to parse CSV files"}],
    max_tokens=4096,
)
print(response.choices[0].message.content)
```
Limits: Free tier has rate limits. For heavy usage, add credits to your OpenRouter account.
Option 2: Aliyun BaiLian API (production)
For production use with SLAs:
- Create an Aliyun account at aliyun.com
- Enable the BaiLian (DashScope) service
- Generate an API key in the console
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="sk-your-aliyun-key",
)

response = client.chat.completions.create(
    model="qwen-plus-2026-0330",
    messages=[{"role": "user", "content": "Review this code for bugs"}],
)
```
Using with coding tools
Aider
```bash
# Via OpenRouter (free)
export OPENROUTER_API_KEY=sk-or-v1-your-key
aider --model openrouter/qwen/qwen3.6-plus:free

# Via Aliyun (the openai/ prefix tells Aider to use the OpenAI-compatible endpoint)
export OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
export OPENAI_API_KEY=sk-your-aliyun-key
aider --model openai/qwen-plus-2026-0330
```
See our Aider Complete Guide for advanced configuration.
Continue.dev (VS Code)
Add to your `.continue/config.json`:

```json
{
  "models": [{
    "title": "Qwen 3.6 Plus",
    "provider": "openai",
    "model": "qwen/qwen3.6-plus:free",
    "apiBase": "https://openrouter.ai/api/v1",
    "apiKey": "sk-or-v1-your-key"
  }]
}
```
See our Continue.dev Guide for full setup.
OpenCode
```bash
export OPENAI_API_BASE=https://openrouter.ai/api/v1
export OPENAI_API_KEY=sk-or-v1-your-key
opencode --provider openai --model qwen/qwen3.6-plus:free
```
See our OpenCode Guide for more options.
Using the 1M context window
The 1M context window is Qwen 3.6 Plus’s killer feature. Here’s how to use it effectively:
````python
# Feed an entire codebase for review
import os

def collect_code(directory, extensions=('.py', '.js', '.ts')):
    code = []
    for root, dirs, files in os.walk(directory):
        for f in files:
            if any(f.endswith(ext) for ext in extensions):
                path = os.path.join(root, f)
                # errors="ignore" skips undecodable bytes instead of crashing
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    code.append(f"### {path}\n```\n{fh.read()}\n```")
    return "\n\n".join(code)

codebase = collect_code("./src")
response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[
        {"role": "system", "content": "You are a senior code reviewer."},
        {"role": "user", "content": f"Review this codebase for security issues:\n\n{codebase}"}
    ],
    max_tokens=65536,
)
````
Tip: Even with 1M context, be selective. Sending only relevant files is faster and produces better results than dumping everything. See our context engineering guide for best practices.
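That selectivity can be automated with a simple pre-flight check: estimate the prompt's token count and trim files until it fits your budget. The 4-characters-per-token heuristic below is a rough approximation for English text and code, not the model's real tokenizer, and the helper names are illustrative:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text and code.
    # For exact counts, use the provider's tokenizer.
    return len(text) // 4

def fit_to_budget(files: dict[str, str], budget_tokens: int = 900_000) -> dict[str, str]:
    """Keep the smallest files first until the token budget is reached."""
    selected, used = {}, 0
    for path, content in sorted(files.items(), key=lambda kv: len(kv[1])):
        cost = estimate_tokens(content)
        if used + cost > budget_tokens:
            break
        selected[path] = content
        used += cost
    return selected

# A generated build artifact blows the budget and gets dropped:
files = {"a.py": "x" * 400, "bundle.js": "y" * 4_000_000, "c.py": "z" * 40}
print(sorted(fit_to_budget(files, budget_tokens=1_000)))  # → ['a.py', 'c.py']
```

Smarter selection (by relevance to the question rather than size) works the same way; only the sort key changes.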
Using preserve_thinking for debugging
Qwen 3.6 Plus has a preserve_thinking parameter that shows the model’s reasoning:
```python
response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Why is this function returning None?"}],
    extra_body={"preserve_thinking": True},
)
# Response includes the model's step-by-step reasoning
```
This is invaluable for debugging agent workflows — you can see exactly why the model chose a specific approach.
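Depending on the gateway, the reasoning may come back as a separate field on the message rather than inline in the content. The field names `reasoning` and `reasoning_content` below are assumptions (check your provider's response schema); a defensive accessor avoids crashes when the field is absent:

```python
def extract_reasoning(message):
    """Return the model's reasoning text if the response carries one, else None.

    The field names checked here ('reasoning', 'reasoning_content') are
    assumptions -- gateways differ, so verify against your provider's schema.
    Accepts both SDK message objects and plain dicts.
    """
    for field in ("reasoning", "reasoning_content"):
        value = getattr(message, field, None) or (
            message.get(field) if isinstance(message, dict) else None
        )
        if value:
            return value
    return None

# Works with dict-shaped responses too:
print(extract_reasoning({"reasoning": "step 1: check the early-return path"}))
```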
Streaming
For real-time output in chat interfaces:
```python
stream = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Explain async/await in Python"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Rate limits and pricing
| Provider | Rate limit | Price | Best for |
|---|---|---|---|
| OpenRouter (free) | ~20 req/min | $0 | Testing, personal projects |
| OpenRouter (paid) | Higher | Pay per token | Production with flexibility |
| Aliyun BaiLian | Configurable | Standard Qwen Plus rates | Production with SLA |
The free OpenRouter tier is generous enough for personal coding use. For team or production use, either add credits to OpenRouter or use the Aliyun API directly.
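When weighing the paid options, a per-request cost estimate helps. The arithmetic is just tokens divided by a million times the rate; the rates in the example are hypothetical placeholders, not Qwen Plus's actual pricing (check the provider's pricing page):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  in_per_m: float, out_per_m: float) -> float:
    """Cost in USD given per-million-token rates (pass your provider's real rates)."""
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# Hypothetical rates for illustration only: a 200k-token code review
# with a 4k-token response.
print(round(estimate_cost(200_000, 4_000, in_per_m=0.40, out_per_m=1.20), 4))  # → 0.0848
```

Note how large-context requests are dominated by input cost, which is another argument for the selective-context tip above.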
Can I run it locally?
Not yet. Qwen 3.6 Plus is API-only. For local inference, use Qwen 3.5 (available in sizes from 0.6B to 32B via Ollama).
When open weights are released, we’ll update this guide with local setup instructions.
Error handling
API calls fail. Handle it gracefully:
```python
import time
from openai import OpenAI, APIError, RateLimitError

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-your-key",
)

def call_qwen(prompt, retries=3):
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="qwen/qwen3.6-plus:free",
                messages=[{"role": "user", "content": prompt}],
                max_tokens=4096,
                timeout=60,
            )
            return response.choices[0].message.content
        except RateLimitError:
            wait = 2 ** attempt
            print(f"Rate limited, waiting {wait}s...")
            time.sleep(wait)
        except APIError:
            if attempt == retries - 1:
                raise
            time.sleep(1)
    return None
```
Common errors on the free tier:
- 429 Rate Limited — wait and retry, or add credits
- 503 Service Unavailable — model is overloaded, try again in a few minutes
- Timeout — long context requests can take 30-60s, increase your timeout
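That list folds naturally into a small retry policy: 429 and 503 are worth retrying with backoff, other client errors are not. A sketch (the backoff parameters are arbitrary defaults, not provider recommendations):

```python
import random

def retry_delay(status: int, attempt: int, base: float = 2.0, cap: float = 60.0):
    """Return seconds to wait before retrying, or None to give up immediately.

    Per the error list above, only 429 (rate limited) and 503 (overloaded)
    are retryable; other statuses indicate a request that won't succeed
    on retry. Uses exponential backoff with full jitter, capped at `cap`.
    """
    if status not in (429, 503):
        return None
    return random.uniform(0, min(cap, base * 2 ** attempt))

print(retry_delay(404, 0))  # → None: a bad request won't fix itself
```

Jitter matters on shared free tiers: if every client backs off by exactly 2, 4, 8 seconds, their retries collide in synchronized waves.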
Comparing API options
| Feature | OpenRouter Free | OpenRouter Paid | Aliyun BaiLian |
|---|---|---|---|
| Price | $0 | Per token | Per token |
| Rate limit | ~20 req/min | Higher | Configurable |
| SLA | None | None | Yes |
| Regions | US/EU | US/EU | China (+ global) |
| Best for | Testing, personal | Production (flexible) | Production (SLA) |
| Other models | 200+ models | 200+ models | Qwen family only |
If you’re already using OpenRouter for other models, stick with it. You get one API key for Qwen 3.6, Claude, GPT, DeepSeek, and everything else.
Building a simple coding assistant
Here’s a minimal but functional coding assistant using Qwen 3.6 Plus:
````python
import sys
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-your-key",
)

def code_review(filepath):
    with open(filepath) as f:
        code = f.read()
    stream = client.chat.completions.create(
        model="qwen/qwen3.6-plus:free",
        messages=[
            {"role": "system", "content": "Review this code. Focus on bugs, security issues, and performance. Be concise."},
            {"role": "user", "content": f"```\n{code}\n```"}
        ],
        max_tokens=4096,
        stream=True,
    )
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
    print()  # final newline

if __name__ == "__main__":
    code_review(sys.argv[1])
````
Usage: `python review.py src/main.py`
Free, streaming, and able to use the 1M context window if your file is large. Before sending proprietary code, check OpenRouter's data policy for the free endpoint — free routes may log or train on prompts depending on your privacy settings.
Related: Qwen 3.6 Complete Guide · Qwen 3.6 vs 3.5 · How to Run Qwen 3.5 Locally · OpenRouter Complete Guide · Aider Complete Guide