Qwen 3.5 is Alibaba Cloud's flagship open-source AI model. It launched on February 16, 2026 (Chinese New Year's Day) and immediately became one of the most capable open-weight models available. It has 397 billion total parameters, activates only 17 billion per forward pass, supports 201 languages, and is released under the Apache 2.0 license.
If you’ve been tracking the open-source AI race, Qwen 3.5 is the model that made people seriously question whether you still need to pay for Claude or GPT.
## What is Qwen 3.5?
Qwen 3.5 is a Mixture-of-Experts (MoE) vision-language model. Unlike traditional dense models that activate every parameter for every token, MoE models route each token through a subset of specialized “expert” networks. Qwen 3.5 has 397B total parameters but only activates 17B per token — giving you frontier-level intelligence at a fraction of the compute cost.
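The routing idea can be sketched in a few lines of Python. Everything here uses toy values (8 experts, top-2 routing, a 4-dimensional hidden state); Qwen 3.5's actual router configuration and expert counts are not given in this article.

```python
# Toy sketch of Mixture-of-Experts (MoE) top-k routing. Expert count, top-k,
# and dimensions are illustrative, not Qwen 3.5's real configuration.
import math
import random

random.seed(0)

N_EXPERTS = 8   # toy expert count
TOP_K = 2       # experts activated per token
D_MODEL = 4     # toy hidden size

def rand_matrix(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

experts = [rand_matrix(D_MODEL, D_MODEL) for _ in range(N_EXPERTS)]
router = rand_matrix(N_EXPERTS, D_MODEL)   # one score per expert

def moe_forward(x):
    """Route one token through its top-k experts; the rest stay idle."""
    scores = matvec(router, x)
    top = sorted(range(N_EXPERTS), key=lambda i: scores[i])[-TOP_K:]
    weights = [math.exp(scores[i]) for i in top]
    total = sum(weights)
    weights = [w / total for w in weights]     # softmax over selected experts
    out = [0.0] * D_MODEL
    for w, i in zip(weights, top):             # only TOP_K experts do any work
        for j, yj in enumerate(matvec(experts[i], x)):
            out[j] += w * yj
    return out, top

out, used = moe_forward([random.gauss(0, 1) for _ in range(D_MODEL)])
print(f"activated {len(used)} of {N_EXPERTS} experts: {sorted(used)}")
```

The key property is visible in `moe_forward`: only `TOP_K` expert matrices are multiplied for any given token, which is why a 397B-parameter model can run with 17B-parameter compute per token.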
It’s natively multimodal, meaning text and vision were fused during training from the start, not bolted on afterward. It processes text, images, and video within one unified system.
## The full model family
Qwen 3.5 isn’t a single model. It’s a family released in three waves:
| Series | Models | Released |
|---|---|---|
| Flagship | Qwen3.5-397B-A17B (397B total, 17B active) | Feb 16, 2026 |
| Medium | Qwen3.5-27B (dense), 35B-A3B, 122B-A10B | Feb 24, 2026 |
| Small | Qwen3.5-0.8B, 2B, 4B, 9B | Mar 2, 2026 |
All models share the same architecture, support 201 languages, and ship under Apache 2.0. Compared with Qwen3, the vocabulary expanded from 150K to 250K tokens, improving encoding efficiency by 10–60% across most languages.
The small models are surprisingly capable. The 9B model matches or surpasses GPT-OSS-120B — a model 13x its size — on benchmarks like GPQA Diamond (81.7 vs 71.5) and HMMT (83.2 vs 76.7). The 35B-A3B runs on GPUs with as little as 8GB VRAM.
## Key benchmarks
Here’s how the flagship 397B stacks up against frontier closed models:
**Reasoning and math:**
- AIME 2026: 91.3 (GPT-5.2: 96.7, Claude 4.6: 93.3)
- GPQA Diamond: 81.0 (GPT-5.2: 78.8)
- IFBench (instruction following): 76.5 — highest of any model
- MultiChallenge: 67.6 (GPT-5.2: 57.9, Claude 4.6: 54.2)
**Coding:**
- SWE-bench Verified: 76.4 (Claude 4.6: 80.9, GPT-5.2: 80.0)
- SWE-bench Multilingual: 72.0 (tied with GPT-5.2)
- LiveCodeBench v6: 83.6
**Vision and multimodal:**
- MathVision: 88.6 (GPT-5.2: 83.0, Gemini 3 Pro: 86.6)
- OCRBench: 93.1
- MMMU: 85.0
Qwen 3.5 leads on instruction following, multi-step challenges, and visual reasoning. It trails Claude Opus 4.6 on agentic coding tasks but beats it on multilingual and multimodal benchmarks.
## Pricing
Alibaba Cloud’s Qwen3.5-Plus API costs approximately $0.11 per million input tokens. That’s roughly 13x cheaper than Claude Opus 4.6 via API. The hosted version includes a 1M context window and built-in tools like search and code interpreter.
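The price gap is easiest to see with a back-of-the-envelope calculation. The $0.11 rate comes from the pricing above; the competitor rate below is derived from the "13x" multiple, not an official quote.

```python
# Rough monthly cost comparison at published vs implied rates.
QWEN_PER_M_INPUT = 0.11                          # USD per million input tokens
COMPETITOR_PER_M_INPUT = QWEN_PER_M_INPUT * 13   # implied by "13x cheaper"

def input_cost(tokens, rate_per_million):
    """Cost in USD for a given number of input tokens."""
    return tokens / 1_000_000 * rate_per_million

monthly_tokens = 500_000_000  # example workload: 500M input tokens/month
qwen = input_cost(monthly_tokens, QWEN_PER_M_INPUT)
other = input_cost(monthly_tokens, COMPETITOR_PER_M_INPUT)
print(f"Qwen3.5-Plus: ${qwen:,.2f}  vs  13x-priced API: ${other:,.2f}")
```

At that example volume the difference is $55 versus $715 per month for input tokens alone, before output-token pricing is considered.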
It’s also available through Azure AI Foundry, NVIDIA NIM, and Hugging Face Inference Endpoints. Or you can self-host any model in the family for free.
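Most of these hosts expose an OpenAI-compatible chat endpoint, so a request typically looks like the sketch below. The base URL, environment-variable names, and the `qwen3.5-plus` model id are assumptions for illustration; check your provider's documentation for the real values.

```python
# Hypothetical request against an OpenAI-compatible hosted endpoint.
# BASE_URL, env-var names, and the model id are assumptions, not documented values.
import json
import os
import urllib.request

BASE_URL = os.environ.get("QWEN_BASE_URL", "https://example.invalid/v1")  # placeholder
API_KEY = os.environ.get("QWEN_API_KEY", "")

body = {
    "model": "qwen3.5-plus",  # assumed id for the hosted Plus tier
    "messages": [
        {"role": "user", "content": "Summarize MoE routing in one sentence."}
    ],
}

def chat():
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# chat()  # requires QWEN_BASE_URL and QWEN_API_KEY to be set
```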
## Running it locally
The small models run on basically anything:
- 0.8B: 2GB RAM, any modern laptop
- 9B: 8GB RAM, runs on a 16GB laptop
- 35B-A3B: 8GB of VRAM, or an M-series Mac (unified memory)
- 397B (Q4 quantized): ~214GB, needs a 256GB M3 Ultra or multi-GPU setup
```shell
# Easiest way — Ollama
ollama run qwen3.5:9b

# Or the flagship if you have the hardware
ollama run qwen3.5
```
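Once a model is pulled, Ollama also serves a local HTTP API (by default on `http://localhost:11434`), so you can script against it. This sketch builds a `/api/generate` request; the network call is left commented out so it only runs when a server is actually up.

```python
# Calling a locally running Ollama server. The model tag matches the
# `ollama run qwen3.5:9b` command above.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

payload = {
    "model": "qwen3.5:9b",
    "prompt": "Explain Mixture-of-Experts in one sentence.",
    "stream": False,  # return one JSON object instead of a token stream
}

def generate(url=OLLAMA_URL):
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Uncomment when an Ollama server is running locally:
# print(generate())
```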
## Why it matters
Qwen has crossed 600 million downloads on Hugging Face, with over 170,000 derivative models. Over 40% of all new model derivatives on Hugging Face are now Qwen-based. AI Singapore chose Qwen over Meta’s Llama and Google’s Gemma as the foundation for its regional language model.
The gap between open-source and closed models is closing fast. Qwen 3.5 matches or beats GPT-5.2 on several benchmarks while being fully open and 13x cheaper. For developers who need multilingual support, visual understanding, or just want to avoid vendor lock-in, it’s the strongest option available.