Qwen 3.6 Plus: Free 1M Context Model That Beats GPT-5 on Coding (2026)


May 2026 Update: Qwen 3.7 Max is now available with a 56.6 Intelligence Index score and 1M context. See our Qwen 3.7 Complete Guide for the latest.

Qwen 3.6 Plus is Alibaba’s latest flagship model, released March 30, 2026. It features a 1M token context window and a hybrid linear attention + MoE architecture, and it scores 78.8% on SWE-bench Verified. It’s currently free on OpenRouter.

Update (April 27, 2026): The Qwen 3.6 family now includes Flash (speed-optimized, $0.25/1M input) and Max Preview (new flagship, tops 6 coding benchmarks, AA Index 52).

Update (April 23, 2026): Alibaba released Qwen 3.6-27B, a 27B dense model that scores 77.2% on SWE-bench Verified, beating the 397B flagship. It runs on a Mac with 22GB of VRAM. See our 27B complete guide.

⚡ Update (April 17, 2026): Alibaba has released Qwen 3.6-35B-A3B — the first open-weight model in the 3.6 generation. It’s a 35B MoE with only 3B active parameters, scores 73.4% on SWE-bench Verified, runs on a laptop (~21 GB quantized), and is Apache 2.0 licensed. If you want to run Qwen 3.6 locally, that’s the one to use.

Key specs

| Spec | Value |
| --- | --- |
| Developer | Alibaba (Tongyi Lab) |
| Release date | March 30, 2026 |
| Architecture | Hybrid linear attention + sparse MoE |
| Context window | 1M tokens (256K native, extended via YaRN) |
| Max output | 65,536 tokens |
| Chain-of-thought | Always-on |
| SWE-bench Verified | 78.8% |
| Terminal-Bench 2.0 | 61.6% |
| MCPMark | 48.2% |
| Price (OpenRouter) | Free (preview) |
| Price (Aliyun) | Standard Qwen Plus pricing |
| Open weights | Not yet (API-only) |
| License | Proprietary (API access) |

What it’s good at

Qwen 3.6 Plus was specifically optimized for agentic coding workflows:

  • Repository-level coding — understands entire codebases, not just single files
  • Front-end generation — HTML/CSS/JS from natural language descriptions
  • Code repair — finds and fixes bugs across multiple files
  • Terminal automation — executes commands and interprets output
  • Tool calling — reliable MCP and function calling
  • Long document analysis — 1M context handles entire books, transcripts, or codebases

What it’s not good at

  • Not available locally: Plus itself has no Ollama or GGUF downloads yet (the open-weight Qwen 3.6-35B-A3B is a separate, smaller model)
  • Not a chat model: optimized for technical tasks, not conversational AI
  • Peak hour limitations: the Aliyun API may apply rate limits during high demand
  • Not multimodal: text-only for now (Qwen 3.5 Omni handles vision/audio)
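
If you do hit rate limits at peak hours, a retry wrapper with exponential backoff is the usual workaround. Here is a minimal sketch using only the standard library. `RateLimitError` is a hypothetical stand-in for whatever 429-style exception your client raises, and the delays are illustrative rather than tuned values.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the 429-style error your client raises."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # base_delay, 2x, 4x, ... plus up to 0.5s of random jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)
```

With the `openai` SDK used in the examples below, the exception to catch would be `openai.RateLimitError`, and `fn` would be a small function that wraps your `client.chat.completions.create(...)` call.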

How to use it

OpenRouter (free, easiest)

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your-openrouter-key",
)

response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[
        {"role": "system", "content": "You are a senior software engineer."},
        {"role": "user", "content": "Review this code for security issues:\n\n```python\n...```"}
    ],
    max_tokens=65536,
)
print(response.choices[0].message.content)

Sign up at openrouter.ai for a free API key.
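
The user message above uses a `...` placeholder for the code under review. If you want to drop real source files into that prompt, a small helper keeps the formatting consistent. This is only a sketch: `build_review_messages` is a hypothetical name, not part of any SDK, and the prompt wording is copied from the example above.

```python
def build_review_messages(source: str, language: str = "python") -> list[dict]:
    """Wrap source code in a fenced block inside a chat-completions message list."""
    fence = "`" * 3  # built programmatically so the fence nests cleanly here
    user_prompt = (
        "Review this code for security issues:\n\n"
        f"{fence}{language}\n{source.rstrip()}\n{fence}"
    )
    return [
        {"role": "system", "content": "You are a senior software engineer."},
        {"role": "user", "content": user_prompt},
    ]
```

Pass the result as `messages=build_review_messages(open("app.py").read())` in the call above, where `app.py` is whatever file you want reviewed.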

Aliyun BaiLian API (production)

For production use with SLAs and guaranteed rate limits:

from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="your-aliyun-key",
)

response = client.chat.completions.create(
    model="qwen-plus-2026-0330",
    messages=[{"role": "user", "content": "..."}],
)

With Aider

# Via OpenRouter
aider --model openrouter/qwen/qwen3.6-plus:free

# Via Aliyun
export OPENAI_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
export OPENAI_API_KEY=your-aliyun-key
# The openai/ prefix tells Aider to use the OpenAI-compatible endpoint set above
aider --model openai/qwen-plus-2026-0330

See our Aider Complete Guide for full setup.

With Continue.dev

In your Continue config, add:

{
  "models": [{
    "title": "Qwen 3.6 Plus (OpenRouter)",
    "provider": "openai",
    "model": "qwen/qwen3.6-plus:free",
    "apiBase": "https://openrouter.ai/api/v1",
    "apiKey": "your-openrouter-key"
  }]
}

See our Continue.dev Guide for full setup.

Benchmarks compared

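| Model | SWE-bench Verified | Terminal-Bench 2.0 | MCPMark | Context | Price (input/output, USD per 1M tokens) |
| --- | --- | --- | --- | --- | --- |
| Qwen 3.6 Plus | 78.8% | 61.6% | 48.2% | 1M | Free* |
| Claude Opus 4.5 | ~80% | 59.3% | ~50% | 200K | $15/$75 |
| Claude Sonnet 4.6 | ~75% | ~55% | ~45% | 200K | $3/$15 |
| GPT-5 | ~72% | ~52% | ~40% | 128K | $5/$15 |
| DeepSeek R1 | ~70% | ~48% | ~35% | 128K | $0.55/$2.19 |
| Gemini 2.5 Pro | ~73% | ~50% | ~42% | 1M | $1.25/$10 |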

*Free on OpenRouter preview. Production pricing via Aliyun.

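Among the models in this table, Qwen 3.6 Plus is the only one that beats Claude Opus 4.5 on Terminal-Bench (61.6% vs 59.3%), and it does so while being free during the OpenRouter preview. Its SWE-bench Verified score of 78.8% puts it in the top tier alongside Claude.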
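
To see how large those Terminal-Bench gaps are, here is a quick sketch that computes the margin in percentage points. The scores are copied from the table above. Entries marked `~` there are approximate, so they are flagged in the data and their margins should be read as rough.

```python
# (score, is_approximate) pairs copied from the Terminal-Bench column above
TERMINAL_BENCH = {
    "Qwen 3.6 Plus": (61.6, False),
    "Claude Opus 4.5": (59.3, False),
    "Claude Sonnet 4.6": (55.0, True),
    "GPT-5": (52.0, True),
    "DeepSeek R1": (48.0, True),
    "Gemini 2.5 Pro": (50.0, True),
}


def margins_over(baseline, scores):
    """Return how many points `baseline` leads each other model by."""
    base_score, _ = scores[baseline]
    return {
        name: round(base_score - score, 1)
        for name, (score, _) in scores.items()
        if name != baseline
    }


margins = margins_over("Qwen 3.6 Plus", TERMINAL_BENCH)
for name, points in margins.items():
    approx = "~" if TERMINAL_BENCH[name][1] else ""
    print(f"{name}: {approx}{points} points behind")
```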

The preserve_thinking parameter

Qwen 3.6 Plus introduces a preserve_thinking parameter for agent workflows. When enabled, the model’s chain-of-thought reasoning is included in the response, letting you see why it made specific decisions:

response = client.chat.completions.create(
    model="qwen/qwen3.6-plus:free",
    messages=[{"role": "user", "content": "Debug this error..."}],
    extra_body={"preserve_thinking": True},
)

This is useful for debugging agent loops and understanding why the model chose a specific approach.
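
In an agent loop you may want reasoning on for some calls and off for others. One way to do that is to build the request arguments in a single place, as in this sketch. `build_request` is a hypothetical helper. The model ID and the `preserve_thinking` field come from the example above, and how the reasoning appears in the response is not covered here.

```python
def build_request(prompt: str, preserve_thinking: bool = False) -> dict:
    """Return keyword arguments for client.chat.completions.create()."""
    kwargs = {
        "model": "qwen/qwen3.6-plus:free",
        "messages": [{"role": "user", "content": prompt}],
    }
    if preserve_thinking:
        # extra_body forwards non-standard fields to the provider unchanged
        kwargs["extra_body"] = {"preserve_thinking": True}
    return kwargs
```

Usage: `client.chat.completions.create(**build_request("Debug this error...", preserve_thinking=True))`.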

Architecture: hybrid linear attention + MoE

The key innovation in Qwen 3.6 Plus is combining two efficiency techniques:

  1. Linear attention — reduces the quadratic cost of standard attention to linear, enabling the 1M context window without proportional memory increase
  2. Sparse MoE — only activates a subset of parameters per token, keeping inference fast despite the large total parameter count
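
To make the linear attention idea concrete, here is a minimal pure-Python sketch of generic kernelized linear attention. Alibaba's actual attention variant is not described in this article, so treat this as an illustration of the technique in general, with the `elu(x) + 1` feature map as an assumption. The point to notice is that the running state has a fixed size no matter how many tokens have been processed.

```python
import math


def phi(x):
    """Positive feature map elu(x) + 1, applied elementwise."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def causal_linear_attention(queries, keys, values):
    """Process tokens left to right with a fixed-size running state."""
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) * v^T
    norm = [0.0] * d_k                          # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        fk = phi(k)
        for i in range(d_k):
            norm[i] += fk[i]
            for j in range(d_v):
                state[i][j] += fk[i] * v[j]
        fq = phi(q)
        denom = sum(fq[i] * norm[i] for i in range(d_k))
        outputs.append([
            sum(fq[i] * state[i][j] for i in range(d_k)) / denom
            for j in range(d_v)
        ])
    return outputs
```

Each token costs the same amount of work and the state stays at `d_k x d_v` numbers, whereas standard attention has to compare every new token against all previous ones.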

The result is a model that’s roughly 3x faster than Claude Opus 4.6 in community benchmarks while maintaining competitive quality.
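
The sparse MoE half of the design can be sketched in the same spirit. This is generic top-k routing, not Qwen's actual router: the expert count, the value of `k`, and the softmax over the selected logits are all assumptions made for illustration.

```python
import math


def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    out = [0.0] * len(x)
    for idx, weight in top_k_route(router_logits, k):
        y = experts[idx](x)  # the other experts are never evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out
```

With, say, 64 experts and `k=2`, each token only pays for two expert forward passes, which is why total parameter count and per-token compute can diverge so much.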

Pricing

| Provider | Input | Output | Free tier |
| --- | --- | --- | --- |
| OpenRouter | Free (preview) | Free (preview) | Yes |
| Aliyun BaiLian | Standard Qwen Plus rates | Standard rates | Trial credits |

The free OpenRouter preview won’t last forever. For production use, set up the Aliyun API now so you’re ready when the preview ends.
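
One way to be ready for that switch is to keep both providers in a single config and pick one with an environment variable. A minimal sketch follows. The base URLs and model IDs are the ones used in this article, while the environment variable names (`QWEN_PROVIDER`, `OPENROUTER_API_KEY`, `DASHSCOPE_API_KEY`) are assumptions you can rename freely.

```python
import os

# Endpoints and model IDs as used elsewhere in this article
PROVIDERS = {
    "openrouter": {
        "base_url": "https://openrouter.ai/api/v1",
        "model": "qwen/qwen3.6-plus:free",
        "key_env": "OPENROUTER_API_KEY",  # assumed variable name
    },
    "aliyun": {
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model": "qwen-plus-2026-0330",
        "key_env": "DASHSCOPE_API_KEY",  # assumed variable name
    },
}


def resolve_provider(env=None):
    """Choose a provider from QWEN_PROVIDER, defaulting to the free preview."""
    env = os.environ if env is None else env
    name = env.get("QWEN_PROVIDER", "openrouter")
    if name not in PROVIDERS:
        raise ValueError(f"unknown provider: {name!r}")
    cfg = PROVIDERS[name]
    return {
        "base_url": cfg["base_url"],
        "model": cfg["model"],
        "api_key": env.get(cfg["key_env"], ""),
    }
```

Build the client with `OpenAI(base_url=cfg["base_url"], api_key=cfg["api_key"])` and pass `cfg["model"]` to each call. Moving to Aliyun then means setting `QWEN_PROVIDER=aliyun` instead of editing code.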

vs Qwen 3.5

See our detailed Qwen 3.6 vs 3.5 comparison for the full breakdown. The short version: 3.6 Plus is better at everything, but it’s API-only. If you need to run a model locally, use the open-weight Qwen 3.6-35B-A3B (see the April 17 update at the top of this article) or stay on Qwen 3.5.

FAQ

Is Qwen 3.6 Plus free?

Yes, Qwen 3.6 Plus is currently free on OpenRouter during the preview period. For production use with SLAs, you’ll need the Aliyun BaiLian API, which uses standard Qwen Plus pricing.

How does Qwen 3.6 compare to Claude?

Qwen 3.6 Plus scores 78.8% on SWE-bench Verified, close to Claude Opus 4.5 (~80%) and above Claude Sonnet 4.6 (~75%). It beats Claude Opus 4.5 on Terminal-Bench (61.6% vs 59.3%) and offers a 1M token context window vs Claude’s 200K. The biggest advantage is price — it’s free on OpenRouter. See our AI model comparison for a full breakdown.

Can I use Qwen 3.6 with Aider?

Yes. Use aider --model openrouter/qwen/qwen3.6-plus:free via OpenRouter, or point Aider at the Aliyun API. See the setup instructions above or our Aider Complete Guide.

What’s the context window of Qwen 3.6?

Qwen 3.6 Plus has a 1M token context window (256K native, extended via YaRN) with a max output of 65,536 tokens. If you want to run Qwen 3.6 locally, the open-weight Qwen 3.6-35B-A3B supports 128K context.
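
When you plan long-context requests, it helps to budget input against output. The sketch below assumes that input and output tokens share the same window, which is the common convention but is not confirmed here for Qwen 3.6 Plus. It also treats "1M" as 1,000,000 tokens, and the exact limit may differ.

```python
CONTEXT_WINDOW = 1_000_000  # "1M tokens" as stated above; exact limit may differ
MAX_OUTPUT = 65_536         # max output tokens as stated above


def max_input_tokens(reserved_output=MAX_OUTPUT, context_window=CONTEXT_WINDOW):
    """Tokens left for the prompt after reserving room for the reply."""
    if not 0 < reserved_output <= MAX_OUTPUT:
        raise ValueError("reserved_output must be between 1 and MAX_OUTPUT")
    return context_window - reserved_output


def fits(prompt_tokens, reserved_output=MAX_OUTPUT):
    """True if a prompt of this size leaves room for the reserved output."""
    return prompt_tokens <= max_input_tokens(reserved_output)
```

Reserving the full 65,536 output tokens leaves 934,464 for the prompt. Lowering `max_tokens` on the request frees up more room for input.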

Related: Qwen 3.6 vs 3.5 — What Changed · How to Run Qwen 3.6 Locally · How to Run Qwen 3.5 Locally · AI Model Comparison · OpenRouter Complete Guide · Aider Complete Guide · Best Open Source Coding Models