Alibaba ships Qwen 3.7 in two tiers: Max and Plus. They are not just different sizes of the same model. They are fundamentally different products optimized for different workloads. Max is the text-only flagship built for autonomous agents and coding. Plus is the multimodal variant with vision capabilities.
Picking the wrong tier means either overpaying for capabilities you do not need, or missing capabilities that your workload requires. This guide breaks down exactly what each tier offers and when to use it.
For the full picture on Qwen 3.7's capabilities, see our Qwen 3.7 complete guide.
Quick comparison
| Feature | Qwen 3.7 Max | Qwen 3.7 Plus |
|---|---|---|
| Modality | Text-only | Text + Vision |
| Primary focus | Agentic coding, autonomous agents | Multimodal understanding |
| Context window | 1M tokens | 1M tokens |
| Max output | 65,536 tokens | 32,768 tokens |
| Input price | $2.50/1M tokens | $1.50/1M tokens |
| Output price | $7.50/1M tokens | $4.50/1M tokens |
| Image input | No | Yes |
| Cross-harness | Yes (Anthropic API) | Yes (Anthropic API) |
| Intelligence Index | 56.6 | 52.1 |
| Terminal-Bench Hard | 50.8% | 42.3% |
| Agent optimization | Maximum | Standard |
The short version: Max is for developers building autonomous agents and doing heavy coding work. Plus is for applications that need to understand images alongside text.
Qwen 3.7 Max: the text-only flagship
Max is Alibaba's strongest model for pure text tasks. It was specifically optimized for long-running autonomous agent workflows:
- Intelligence Index: 56.6 (highest in the Qwen family)
- Terminal-Bench Hard: 50.8% (strong command-line agent performance)
- CritPt: 13.4% (complex multi-step reasoning)
- 35-hour continuous operation validated in benchmarks
- 1,158 tool calls in a single session without degradation
- 65,536 token max output for long code generation
Max excels at tasks that require sustained reasoning over long contexts: refactoring entire codebases, running multi-hour debugging sessions, executing complex infrastructure changes, and maintaining coherence across hundreds of tool calls.
The trade-off is clear: no vision capabilities whatsoever. If your workflow involves screenshots, diagrams, UI mockups, or any visual input, Max cannot help.
Best use cases for Max
- Autonomous coding agents (hours-long sessions)
- Repository-level code refactoring
- Infrastructure automation and DevOps
- Long-running CI/CD pipeline debugging
- Complex multi-file code generation
- Terminal-based agent workflows
Qwen 3.7 Plus: multimodal with vision
Plus adds image understanding to the Qwen 3.7 foundation. It can process screenshots, diagrams, charts, handwritten notes, UI designs, and any other visual input alongside text.
The trade-offs compared to Max:
- Lower text-only benchmarks (Intelligence Index 52.1 vs 56.6)
- Shorter max output (32,768 vs 65,536 tokens)
- Less agent optimization (not validated for ultra-long sessions)
- Lower pricing ($1.50/$4.50 vs $2.50/$7.50)
Plus is not simply a weaker model. It is optimized differently: the multimodal training gives it capabilities that Max does not have, at the cost of some text-only performance.
Vision capabilities
Qwen 3.7 Plus can:
- Analyze screenshots and identify UI elements
- Read and interpret charts, graphs, and diagrams
- Extract text from images (OCR)
- Understand code from screenshots
- Process architectural diagrams and flowcharts
- Analyze design mockups and provide implementation guidance
Best use cases for Plus
- Frontend development from design mockups
- Bug reports with screenshots
- Document analysis with embedded images
- Chart and data visualization interpretation
- UI/UX review and accessibility auditing
- Converting whiteboard diagrams to code
Pricing comparison
Plus is 40% cheaper than Max on both input and output tokens.
| Token type | Qwen 3.7 Max | Qwen 3.7 Plus | Savings with Plus |
|---|---|---|---|
| Input | $2.50/1M | $1.50/1M | 40% |
| Output | $7.50/1M | $4.50/1M | 40% |
For a typical coding session (200K input, 30K output):
- Max: $0.50 + $0.225 = $0.725
- Plus: $0.30 + $0.135 = $0.435
The savings add up over time, especially for high-volume workloads. If you do not need Max's superior agent capabilities or longer output window, Plus gives you solid performance at a lower price point.
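The arithmetic above can be sketched as a small helper. The prices are the list prices from the pricing table; the names `PRICES` and `session_cost` are illustrative and not part of any SDK.

```python
# Per-million-token list prices (USD), copied from the pricing table above.
PRICES = {
    "max": {"input": 2.50, "output": 7.50},
    "plus": {"input": 1.50, "output": 4.50},
}


def session_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one session on the given tier."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# The "typical coding session" from the text: 200K input, 30K output.
max_cost = session_cost("max", 200_000, 30_000)
plus_cost = session_cost("plus", 200_000, 30_000)

print(f"Max:  ${max_cost:.3f}")
print(f"Plus: ${plus_cost:.3f}")
# A discount is measured against Max and a premium against Plus,
# so the two percentages are not the same number.
print(f"Plus discount vs Max: {1 - plus_cost / max_cost:.0%}")
print(f"Max premium vs Plus:  {max_cost / plus_cost - 1:.0%}")
```

For the example session this reproduces the $0.725 and $0.435 figures. Note the asymmetry: Plus is 40% cheaper than Max, which is the same as saying Max costs about 67% more than Plus.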
Benchmark deep dive
| Benchmark | Qwen 3.7 Max | Qwen 3.7 Plus | Gap (Plus minus Max; pp = percentage points) |
|---|---|---|---|
| Intelligence Index | 56.6 | 52.1 | -4.5 points |
| Terminal-Bench Hard | 50.8% | 42.3% | -8.5 pp |
| CritPt | 13.4% | 10.8% | -2.6 pp |
| SWE-bench Verified | 81.2% | 77.4% | -3.8 pp |
| MMLU-Pro | 88.3% | 87.1% | -1.2 pp |
| GPQA Diamond | 89.7% | 87.9% | -1.8 pp |
Among the percentage-scored benchmarks, the gap is largest on the agent-specific one (Terminal-Bench Hard: 8.5 percentage points) and smallest on the knowledge one (MMLU-Pro: 1.2 percentage points). This suggests that Max's advantage lies mainly in autonomous agent workflows rather than in general knowledge.
For standard coding tasks that do not involve long autonomous sessions, Plus performs within a few percentage points of Max. The difference only becomes significant when you push into multi-hour agent territory.
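To rank the benchmarks by gap yourself, a minimal sketch follows. The scores are copied from the table above; the `SCORES` dictionary and the `gaps` function are illustrative names. It reports both the absolute gap and the gap relative to Max's score, since the two views can order the benchmarks differently.

```python
# Scores copied from the benchmark table above, as (Max, Plus).
SCORES = {
    "Intelligence Index": (56.6, 52.1),
    "Terminal-Bench Hard": (50.8, 42.3),
    "CritPt": (13.4, 10.8),
    "SWE-bench Verified": (81.2, 77.4),
    "MMLU-Pro": (88.3, 87.1),
    "GPQA Diamond": (89.7, 87.9),
}


def gaps(scores):
    """Return (name, absolute gap, gap relative to Max), largest absolute gap first."""
    rows = [
        (name, round(max_s - plus_s, 1), (max_s - plus_s) / max_s)
        for name, (max_s, plus_s) in scores.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, abs_gap, rel_gap in gaps(SCORES):
    print(f"{name:<20} {abs_gap:>4.1f} points  ({rel_gap:.1%} of Max's score)")
```

Keep in mind that the Intelligence Index is an index score while the other rows are percentages, so absolute gaps are only directly comparable among the percentage-based benchmarks.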
API differences
Both tiers use the same API format and support the Anthropic API protocol for cross-harness compatibility. The example below calls the OpenAI-compatible endpoint through the OpenAI Python SDK and shows the main API-level differences:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="your-api-key",
)

# Qwen 3.7 Max (text-only)
response = client.chat.completions.create(
    model="qwen-3.7-max",
    messages=[
        {"role": "user", "content": "Refactor this entire module..."}
    ],
    max_tokens=65536,
)

# Qwen 3.7 Plus (with vision)
response = client.chat.completions.create(
    model="qwen-3.7-plus",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Implement this UI from the mockup:"},
                {"type": "image_url", "image_url": {"url": "https://example.com/mockup.png"}},
            ],
        }
    ],
    max_tokens=32768,
)
```
Apart from the image content that only Plus accepts, the model name and the max token limit are the only differences in the API call. If you are not sending images, the calls are identical except for the model parameter and the output cap.
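One way to keep those differences in a single place is a small request builder. This is a sketch under assumptions: `TIERS` and `build_request` are hypothetical names, and the model identifiers and output caps are the ones used in this article. It returns plain keyword arguments, so it needs no network access on its own.

```python
from typing import Any, Optional

# Model identifiers and output caps as quoted in this article.
TIERS = {
    "max": {"model": "qwen-3.7-max", "max_output": 65_536, "vision": False},
    "plus": {"model": "qwen-3.7-plus", "max_output": 32_768, "vision": True},
}


def build_request(tier: str, prompt: str, image_url: Optional[str] = None,
                  max_tokens: Optional[int] = None) -> dict[str, Any]:
    """Build keyword arguments for client.chat.completions.create()."""
    spec = TIERS[tier]
    if image_url is not None and not spec["vision"]:
        raise ValueError(f"{spec['model']} is text-only; use the Plus tier for images")

    if image_url is None:
        # Text-only requests can pass the prompt as a plain string.
        content: Any = prompt
    else:
        # Image requests use the multi-part content format.
        content = [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]

    # Clamp the requested output length to the tier's cap.
    limit = spec["max_output"]
    return {
        "model": spec["model"],
        "messages": [{"role": "user", "content": content}],
        "max_tokens": min(max_tokens or limit, limit),
    }
```

With an OpenAI-compatible client, usage would look like `client.chat.completions.create(**build_request("plus", "Implement this UI from the mockup:", image_url="https://example.com/mockup.png"))`. Failing early on an image sent to Max is a design choice here; you could instead fall back to Plus automatically.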
For detailed API setup instructions, see our How to use Qwen 3.7 API guide.
Decision framework
Work through these questions in order and stop at the first one that gives you a tier:
- Do you need to process images? Yes = Plus. No = continue.
- Are you building long-running autonomous agents (1+ hours)? Yes = Max. No = continue.
- Do you need >32K token outputs? Yes = Max. No = continue.
- Is cost a primary concern? Yes = Plus. No = Max.
If you reach the last question, either tier meets your requirements, and the choice comes down to price versus peak text performance: pick Plus for the lower cost, or Max if you want the absolute best text performance.
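The four questions can be written down as a function. This is a minimal sketch: `pick_tier` and its parameter names are illustrative, and the final fallback follows the last question as listed (no cost pressure means Max). Swap that line if you would rather default to Plus.

```python
def pick_tier(needs_images: bool, long_running_agent: bool,
              needs_long_output: bool, cost_sensitive: bool) -> str:
    """Walk the four questions in order and return 'max' or 'plus'."""
    if needs_images:
        return "plus"   # only Plus accepts image input
    if long_running_agent:
        return "max"    # agent sessions of an hour or more
    if needs_long_output:
        return "max"    # outputs beyond the 32K cap on Plus
    # Neither tier is required at this point, so cost decides.
    return "plus" if cost_sensitive else "max"


print(pick_tier(needs_images=False, long_running_agent=True,
                needs_long_output=False, cost_sensitive=True))
```

Because the image check comes first, a workload that needs both vision and multi-hour agent runs resolves to Plus, which is the only tier that can read the images at all.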
Can you use both?
Yes. Many teams use both tiers for different parts of their workflow:
- Max for the autonomous coding agent that runs overnight
- Plus for the design-to-code pipeline that processes mockups
- Plus for general development tasks where the cost savings matter
- Max for complex debugging sessions that require long context and many tool calls
Since both tiers share the same API format and support the same protocols, switching between them is a one-line change (the model parameter).
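A team running both tiers might keep that one-line choice in a lookup table. The stage names below are hypothetical labels for the four workflows listed above; the model identifiers are the ones used in the API example.

```python
# Hypothetical workflow stages mapped to the model identifiers from the API example.
MODEL_FOR_STAGE = {
    "overnight_agent": "qwen-3.7-max",
    "design_to_code": "qwen-3.7-plus",
    "general_dev": "qwen-3.7-plus",
    "deep_debugging": "qwen-3.7-max",
}


def model_for(stage: str, default: str = "qwen-3.7-plus") -> str:
    """Return the model name for a workflow stage, falling back to the cheaper tier."""
    return MODEL_FOR_STAGE.get(stage, default)


print(model_for("overnight_agent"))
print(model_for("code_review"))  # unknown stage, so the default applies
```

Defaulting unknown stages to Plus is an assumption that favors cost; choose Max as the default if your unlisted tasks tend to be long agent runs.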
FAQ
Is Qwen 3.7 Max better than Plus?
For text-only tasks, yes. Max scores higher on every benchmark listed above (for example, Intelligence Index 56.6 vs 52.1 and Terminal-Bench Hard 50.8% vs 42.3%). But Plus adds vision capabilities that Max does not have. "Better" depends entirely on whether you need image understanding.
Can Qwen 3.7 Plus handle coding tasks?
Yes. Plus scores 77.4% on SWE-bench Verified, which is competitive with most frontier models. It is only 3.8 percentage points behind Max on that benchmark. For standard coding tasks that do not involve ultra-long autonomous sessions, Plus performs well.
Why is Max more expensive than Plus?
Max is optimized for sustained autonomous operation with higher max output (65,536 vs 32,768 tokens) and better long-context coherence. The additional training and inference optimization for agent workloads justifies the premium. Plus trades some of that capability for multimodal support at a lower price.
Can Qwen 3.7 Max process images?
No. Max is text-only. If you need to process images, screenshots, diagrams, or any visual input, you must use Qwen 3.7 Plus.
Which tier works with Claude Code?
Both. Both Qwen 3.7 Max and Plus support the Anthropic API protocol for cross-harness compatibility. However, since Claude Code is a text-based terminal tool, Max is the better choice for that specific use case due to its superior agent performance.
Should I use Plus just because it is cheaper?
Not necessarily. If your workload involves long autonomous sessions (1+ hours), many sequential tool calls, or requires the 65K output window, Max's capabilities justify the premium (Max costs about 67% more than Plus, which is the same as Plus being 40% cheaper). If you are doing standard development tasks, code review, or shorter coding sessions, Plus offers excellent value.
Is there a free tier for either model?
Check OpenRouter for current availability. Alibaba occasionally offers preview pricing or free tiers for new models. For production use with SLAs, expect to pay the listed rates through the Aliyun BaiLian API.