Qwen 3.6 Plus: Free 1M Context Model That Beats GPT-5 on Coding (2026)
May 2026 Update: Qwen 3.7 Max is now available with a 56.6 Intelligence Index score and 1M context. See our Qwen 3.7 Complete Guide for the latest.
Qwen 3.6 Plus is Alibaba’s latest flagship model, released March 30, 2026. It features a 1M token context window, hybrid linear attention + MoE architecture, and scores 78.8% on SWE-bench Verified. It’s currently free on OpenRouter.
Update (April 27, 2026): The Qwen 3.6 family now includes Flash (speed-optimized, $0.25/1M input) and Max Preview (new flagship, tops 6 coding benchmarks, AA Index 52).
Update (April 23, 2026): Alibaba released Qwen 3.6-27B, a 27B dense model that scores 77.2% on SWE-bench Verified, beating the 397B flagship. It runs on a Mac with 22GB of unified memory. See our 27B complete guide.
⚡ Update (April 17, 2026): Alibaba has released Qwen 3.6-35B-A3B — the first open-weight model in the 3.6 generation. It’s a 35B MoE with only 3B active parameters, scores 73.4% on SWE-bench Verified, runs on a laptop (~21 GB quantized), and is Apache 2.0 licensed. If you want to run Qwen 3.6 locally, that’s the one to use.
Key specs
| Spec | Value |
|---|---|
| Developer | Alibaba (Tongyi Lab) |
| Release date | March 30, 2026 |
| Architecture | Hybrid linear attention + sparse MoE |
| Context window | 1M tokens (256K native, extended via YaRN) |
| Max output | 65,536 tokens |
| Chain-of-thought | Always-on |
| SWE-bench Verified | 78.8% |
| Terminal-Bench 2.0 | 61.6% |
| MCPMark | 48.2% |
| Price (OpenRouter) | Free (preview) |
| Price (Aliyun) | Standard Qwen Plus pricing |
| Open weights | Not yet (API-only) |
| License | Proprietary (API access) |
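The context and output limits in the table translate into a simple budget check before sending a large prompt. The sketch below is illustrative only: it assumes "1M" means 1,000,000 tokens, that the reserved output counts against the window, and that text averages about 4 characters per token. None of these are stated by the source, and a real check would use the model's tokenizer.

```python
CONTEXT_WINDOW = 1_000_000  # "1M tokens" from the spec table; exact figure is assumed
MAX_OUTPUT = 65_536         # max output tokens from the spec table
CHARS_PER_TOKEN = 4         # rough heuristic, not the real tokenizer ratio


def estimate_tokens(text: str) -> int:
    """Crude token estimate: ceil(len(text) / CHARS_PER_TOKEN)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(prompt: str, reserved_output: int = MAX_OUTPUT) -> bool:
    """True if the estimated prompt plus the reserved output fits the window."""
    if reserved_output > MAX_OUTPUT:
        raise ValueError("reserved_output exceeds the model's output cap")
    return estimate_tokens(prompt) + reserved_output <= CONTEXT_WINDOW
```

Under these assumptions, a prompt of roughly 3.7 million characters is about the largest that still leaves room for a full-length response.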
What it’s good at
Qwen 3.6 Plus was specifically optimized for agentic coding workflows:
- Repository-level coding — understands entire codebases, not just single files
- Front-end generation — HTML/CSS/JS from natural language descriptions
- Code repair — finds and fixes bugs across multiple files
- Terminal automation — executes commands and interprets output
- Tool calling — reliable MCP and function calling
- Long document analysis — 1M context handles entire books, transcripts, or codebases
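As a rough illustration of the tool-calling item above, here is how a request might be assembled using the OpenAI-compatible `tools` schema. The `run_tests` tool and its parameters are hypothetical, invented for this sketch. The model ID is the one this article uses for OpenRouter.

```python
import json

# Hypothetical tool: the name, description, and parameters are illustrative only.
RUN_TESTS_TOOL = {
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return the output.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Test file or directory"},
            },
            "required": ["path"],
        },
    },
}


def build_tool_request(user_prompt: str) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": "qwen/qwen3.6-plus:free",
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [RUN_TESTS_TOOL],
    }


def parse_tool_arguments(raw_arguments: str) -> dict:
    """Tool-call arguments arrive as a JSON string; decode and validate them."""
    args = json.loads(raw_arguments)
    if "path" not in args:
        raise ValueError("model omitted the required 'path' argument")
    return args
```

The dictionary can be unpacked into the client call as `client.chat.completions.create(**build_tool_request("..."))`. Validating the decoded arguments before executing anything is a sensible guard in an agent loop.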
What it’s not good at
- Not available locally: no Ollama or GGUF downloads for Plus yet (the open-weight Qwen 3.6-35B-A3B is the local option)
- Not a chat model: optimized for technical tasks, not conversational AI
- Peak hour limitations: the Aliyun API may rate-limit requests during high demand
- No multimodal input: text-only for now (Qwen 3.5 Omni handles vision/audio)
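Rate limits during peak hours are usually handled with retries and exponential backoff. The helper below is a generic sketch, not something from Alibaba's or OpenRouter's documentation. The function name, the defaults, and the injectable `sleep` are choices made for this example.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep,
                      retry_on=(Exception,)):
    """Call fn(); on a retryable error, wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

With the OpenAI SDK you would likely narrow `retry_on` to `(openai.RateLimitError,)` and wrap the request in a lambda, so that unrelated errors fail immediately instead of being retried.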
How to use it
OpenRouter (free, easiest)
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible endpoint
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your-openrouter-key",
)

response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[
        {"role": "system", "content": "You are a senior software engineer."},
        {"role": "user", "content": "Review this code for security issues:\n\n```python\n...```"},
    ],
    max_tokens=65536,  # the model's maximum output length
)
print(response.choices[0].message.content)
Sign up at openrouter.ai for a free API key.
Aliyun BaiLian API (production)
For production use with SLAs and guaranteed rate limits:
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="your-aliyun-key",
)

response = client.chat.completions.create(
    model="qwen-plus-2026-0330",
    messages=[{"role": "user", "content": "..."}],
)
With Aider
# Via OpenRouter
aider --model openrouter/qwen/qwen3.6-plus:free
# Via Aliyun (OpenAI-compatible endpoint, so the model name takes the openai/ prefix)
export OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
export OPENAI_API_KEY=your-aliyun-key
aider --model openai/qwen-plus-2026-0330
See our Aider Complete Guide for full setup.
With Continue.dev
In your Continue config, add:
{
  "models": [{
    "provider": "openai",
    "model": "qwen/qwen3.6-plus:free",
    "apiBase": "https://openrouter.ai/api/v1",
    "apiKey": "your-openrouter-key"
  }]
}
See our Continue.dev Guide for full setup.
Benchmarks compared
| Model | SWE-bench | Terminal-Bench | MCPMark | Context | Price |
|---|---|---|---|---|---|
| Qwen 3.6 Plus | 78.8% | 61.6% | 48.2% | 1M | Free* |
| Claude Opus 4.5 | ~80% | 59.3% | ~50% | 200K | $15/$75 |
| Claude Sonnet 4.6 | ~75% | ~55% | ~45% | 200K | $3/$15 |
| GPT-5 | ~72% | ~52% | ~40% | 128K | $5/$15 |
| DeepSeek R1 | ~70% | ~48% | ~35% | 128K | $0.55/$2.19 |
| Gemini 2.5 Pro | ~73% | ~50% | ~42% | 1M | $1.25/$10 |
*Free on OpenRouter preview. Production pricing via Aliyun.
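Among the models in this table, Qwen 3.6 Plus is the only one that beats Claude Opus 4.5 on Terminal-Bench (61.6% vs 59.3%), and it does so while being free during the preview. Its 78.8% on SWE-bench Verified puts it in the top tier, roughly one point behind Claude Opus 4.5 (~80%) and ahead of every other model listed.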
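To make the comparison concrete, the table's numbers can be ranked and differenced directly. The snippet only restates the scores from the table above, treating the "~" values as exact for the sake of arithmetic, so the resulting gaps are approximate.

```python
# Scores copied from the table above; values marked "~" there are approximate.
SCORES = {
    "Qwen 3.6 Plus":     {"swe": 78.8, "terminal": 61.6, "mcp": 48.2},
    "Claude Opus 4.5":   {"swe": 80.0, "terminal": 59.3, "mcp": 50.0},
    "Claude Sonnet 4.6": {"swe": 75.0, "terminal": 55.0, "mcp": 45.0},
    "GPT-5":             {"swe": 72.0, "terminal": 52.0, "mcp": 40.0},
    "DeepSeek R1":       {"swe": 70.0, "terminal": 48.0, "mcp": 35.0},
    "Gemini 2.5 Pro":    {"swe": 73.0, "terminal": 50.0, "mcp": 42.0},
}


def ranking(metric: str) -> list:
    """Model names sorted from best to worst on one benchmark."""
    return sorted(SCORES, key=lambda name: SCORES[name][metric], reverse=True)


def gap(model_a: str, model_b: str, metric: str) -> float:
    """Score difference in percentage points (positive means model_a leads)."""
    return round(SCORES[model_a][metric] - SCORES[model_b][metric], 1)


print(ranking("terminal")[0])                               # Qwen 3.6 Plus
print(gap("Qwen 3.6 Plus", "Claude Opus 4.5", "terminal"))  # 2.3
print(gap("Qwen 3.6 Plus", "GPT-5", "swe"))                 # 6.8
```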
The preserve_thinking parameter
Qwen 3.6 Plus introduces a preserve_thinking parameter for agent workflows. When enabled, the model’s chain-of-thought reasoning is included in the response, letting you see why it made specific decisions:
response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Debug this error..."}],
    extra_body={"preserve_thinking": True},
)
This is useful for debugging agent loops and understanding why the model chose a specific approach.
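The article does not say which response field carries the reasoning text, so the helper below is a guess under stated assumptions: it checks two field names (`reasoning_content` and `reasoning`) that OpenAI-compatible providers commonly use, and falls back to no reasoning if neither is present. Treat both names as unverified for this model.

```python
from types import SimpleNamespace

# Assumed field names: the source does not say where the reasoning text appears.
REASONING_FIELDS = ("reasoning_content", "reasoning")


def split_reasoning(message):
    """Return (reasoning, answer); reasoning is None if no known field is set."""
    for field in REASONING_FIELDS:
        text = getattr(message, field, None)
        if text:
            return text, message.content
    return None, message.content


# Stand-in for response.choices[0].message, so the sketch runs offline.
demo = SimpleNamespace(content="Use a context manager.",
                       reasoning="The file is never closed.")
reasoning, answer = split_reasoning(demo)
```

In an agent loop, logging `reasoning` separately from `answer` keeps the trace available for debugging without feeding it back into the next prompt by accident.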
Architecture: hybrid linear attention + MoE
The key innovation in Qwen 3.6 Plus is combining two efficiency techniques:
- Linear attention: replaces the quadratic cost of standard attention with a cost that grows linearly with sequence length, which is what keeps compute and memory manageable at 1M tokens
- Sparse MoE: activates only a subset of parameters per token, keeping inference fast despite the large total parameter count
The result is a model that’s roughly 3x faster than Claude Opus 4.6 in community benchmarks while maintaining competitive quality.
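Alibaba has not published the exact attention variant as far as this article states, so the following is a generic illustration of kernelized linear attention, not Qwen's implementation. It shows the core trick: with a positive feature map (here `elu(x) + 1`, chosen for the example), the keys and values can be summarized once into a small state, giving the same output as the pairwise version without computing every query-key score.

```python
import math


def phi(x):
    """Positive feature map elu(x) + 1, applied elementwise."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def quadratic_attention(Q, K, V):
    """Computes all n x n query-key scores: O(n^2) time in sequence length n."""
    fq = [phi(q) for q in Q]
    fk = [phi(k) for k in K]
    out = []
    for q in fq:
        scores = [sum(a * b for a, b in zip(q, k)) for k in fk]
        total = sum(scores)
        out.append([sum(s * v[c] for s, v in zip(scores, V)) / total
                    for c in range(len(V[0]))])
    return out


def linear_attention(Q, K, V):
    """Summarizes keys and values once into a d x d_v state: O(n) time in n."""
    d, dv = len(K[0]), len(V[0])
    state = [[0.0] * dv for _ in range(d)]  # running sum of phi(k_j) * v_j^T
    norm = [0.0] * d                        # running sum of phi(k_j)
    for k, v in zip(K, V):
        fk = phi(k)
        for i in range(d):
            norm[i] += fk[i]
            for c in range(dv):
                state[i][c] += fk[i] * v[c]
    out = []
    for q in Q:
        fq = phi(q)
        denom = sum(a * b for a, b in zip(fq, norm))
        out.append([sum(fq[i] * state[i][c] for i in range(d)) / denom
                    for c in range(dv)])
    return out
```

The state has a fixed size regardless of how many tokens were processed, which is why this family of methods scales to very long contexts. Both functions here are non-causal for simplicity; a decoder would update the state token by token.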
Pricing
| Provider | Input | Output | Free tier |
|---|---|---|---|
| OpenRouter | Free (preview) | Free (preview) | Yes |
| Aliyun BaiLian | Standard Qwen Plus rates | Standard rates | Trial credits |
The free OpenRouter preview won’t last forever. For production use, set up the Aliyun API now so you’re ready when the preview ends.
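One way to be ready for the switch is to keep both providers in configuration and choose at startup. This is a minimal sketch: the environment variable names `OPENROUTER_API_KEY` and `DASHSCOPE_API_KEY` are assumptions made for the example, while the base URLs and model IDs are the ones used earlier in this article.

```python
import os

PROVIDERS = {
    "openrouter": {
        "base_url": "https://openrouter.ai/api/v1",
        "model": "qwen/qwen3.6-plus:free",
        "key_env": "OPENROUTER_API_KEY",  # assumed variable name
    },
    "aliyun": {
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model": "qwen-plus-2026-0330",
        "key_env": "DASHSCOPE_API_KEY",   # assumed variable name
    },
}


def pick_provider(env=None, prefer="aliyun"):
    """Return settings for the preferred provider, else any other configured one."""
    env = os.environ if env is None else env
    order = [prefer] + [name for name in PROVIDERS if name != prefer]
    for name in order:
        cfg = PROVIDERS[name]
        key = env.get(cfg["key_env"])
        if key:
            return {"base_url": cfg["base_url"], "model": cfg["model"], "api_key": key}
    raise RuntimeError("no API key configured for any provider")
```

The returned `base_url` and `api_key` can be passed straight to `OpenAI(...)`, and `model` to the completion call, so moving off the free preview becomes a matter of setting one environment variable.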
vs Qwen 3.5
See our detailed Qwen 3.6 vs 3.5 comparison for the full breakdown. The short version: 3.6 Plus is the stronger model across the board, but it's API-only. If you need to run locally, use the open-weight Qwen 3.6-35B-A3B or stay on Qwen 3.5.
FAQ
Is Qwen 3.6 Plus free?
Yes, Qwen 3.6 Plus is currently free on OpenRouter during the preview period. For production use with SLAs, you’ll need the Aliyun BaiLian API which uses standard Qwen Plus pricing.
How does Qwen 3.6 compare to Claude?
Qwen 3.6 Plus scores 78.8% on SWE-bench Verified, close to Claude Opus 4.5 (~80%) and above Claude Sonnet 4.6 (~75%). It beats Claude Opus 4.5 on Terminal-Bench (61.6% vs 59.3%) and offers a 1M token context window vs Claude’s 200K. The biggest advantage is price — it’s free on OpenRouter. See our AI model comparison for a full breakdown.
Can I use Qwen 3.6 with Aider?
Yes. Use aider --model openrouter/qwen/qwen3.6-plus:free via OpenRouter, or point Aider at the Aliyun API. See the setup instructions above or our Aider Complete Guide.
What’s the context window of Qwen 3.6?
Qwen 3.6 Plus has a 1M token context window (256K native, extended via YaRN) with a max output of 65,536 tokens. If you want to run Qwen 3.6 locally, the open-weight Qwen 3.6-35B-A3B supports 128K context.