openPangu 2.0 Complete Guide: Huawei's 505B Model Trained Without NVIDIA (2026)
Huawei just released the first open-weight frontier-scale AI model trained entirely without NVIDIA hardware. openPangu 2.0 dropped today at HDC 2026, announced by Yu Chengdong himself, and it changes the conversation about what is possible outside the NVIDIA ecosystem.
Two versions. Both open-source. Both trained on Huawei’s own Ascend NPUs. Both offering a 512K token context window. The Pro variant packs 505 billion total parameters with 18 billion activated per token via Mixture-of-Experts. The Flash variant comes in at 92 billion total with 6 billion active. Neither touched an A100 or H100 during training.
This guide covers the architecture, how to get access, what we know about performance, and why this matters beyond just another model release.
Why openPangu 2.0 matters
Nearly every other frontier model you have used was trained on NVIDIA GPUs. DeepSeek V4 Pro: NVIDIA. Qwen 3.7: NVIDIA. Kimi K2.7: NVIDIA. GPT-5 and Claude Fable 5: NVIDIA. Gemini is the notable exception, since Google trains it on its own TPUs, but it is a closed model.
openPangu 2.0 is the first open-weight model at this scale trained on non-NVIDIA silicon. Huawei built it on its Ascend 910B NPUs, chips developed under US sanctions designed to cut the company off from advanced semiconductors. That a 505B-parameter model came out of this hardware stack is not just a technical achievement. It is a geopolitical statement.
For developers, the immediate implications are practical. If you operate in regions where NVIDIA supply is restricted, or if you are building sovereign AI infrastructure, there is now a frontier model that proves the alternative hardware path works.
Architecture breakdown
Both openPangu 2.0 versions use Mixture-of-Experts (MoE) architecture. This means only a fraction of the total parameters activate for each token, keeping inference costs manageable despite the large total parameter count.
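To make sparse activation concrete, here is a minimal sketch of top-k expert routing for a single token. Huawei has not published the router design in this guide's sources, so the expert count, the value of k, and the toy scalar "experts" are all illustrative assumptions rather than openPangu's actual configuration.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their gate weights.

    Returns (expert_index, gate_weight) pairs. Experts outside the top k
    are never evaluated for this token, which is where the savings come from.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_forward(x, experts, router_logits, k):
    """Gate-weighted sum of the selected experts' outputs for one token."""
    routes = top_k_route(router_logits, k)
    return sum(w * experts[i](x) for i, w in routes), routes

# Hypothetical toy setup: 8 scalar experts, top-2 routing.
NUM_EXPERTS, TOP_K = 8, 2
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]
random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

output, routes = moe_forward(1.0, experts, logits, TOP_K)
print(f"experts evaluated: {sorted(i for i, _ in routes)} of {NUM_EXPERTS}")
```

Only `TOP_K` of the `NUM_EXPERTS` expert functions are called per token. In a real model each expert is a feed-forward block and the router logits come from a learned linear layer, but the selection logic follows the same pattern.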
openPangu 2.0 Pro:
- 505B total parameters
- 18B activated per token
- 512K token context window
- MoE routing with sparse activation
openPangu 2.0 Flash:
- 92B total parameters
- 6B activated per token
- 512K token context window
- MoE routing with sparse activation
The MoE approach is the same strategy that made DeepSeek and Mixtral efficient. You get the knowledge capacity of a massive model at the per-token compute cost of a much smaller one. With 18B active parameters, Pro needs roughly the compute of an 18B dense model per forward pass. Flash at 6B active is remarkably lightweight, comparable to running a 7B dense model. Memory is a separate matter: every expert still has to be loaded, so the footprint tracks total parameters, not active ones.
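The arithmetic behind that comparison is short enough to check by hand. The sketch below uses the parameter counts from the spec lists above; the "2 FLOPs per active parameter per token" figure is a common rule of thumb for a forward pass that ignores attention over the context, and is an assumption here, not a number from Huawei.

```python
# Parameter counts from the openPangu 2.0 spec lists.
MODELS = {
    "openPangu 2.0 Pro": {"total": 505e9, "active": 18e9},
    "openPangu 2.0 Flash": {"total": 92e9, "active": 6e9},
}

def active_fraction(total, active):
    """Share of the weights that participate in one token's forward pass."""
    return active / total

def forward_flops_per_token(active):
    """Rule-of-thumb estimate (assumption): ~2 FLOPs per active parameter."""
    return 2 * active

for name, p in MODELS.items():
    frac = active_fraction(p["total"], p["active"])
    gflops = forward_flops_per_token(p["active"]) / 1e9
    print(f"{name}: {frac:.1%} of weights active, ~{gflops:.0f} GFLOPs per token")
```

By this estimate Pro activates about 3.6% of its weights per token and Flash about 6.5%.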
Both versions share the 512K token context window. That is competitive with the longest context windows available today and puts openPangu in territory where you can process entire codebases, long legal documents, or extended conversation histories without chunking.
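For a quick sanity check on whether a set of documents fits in that window, a rough token estimate is usually enough. The sketch below assumes about 4 characters per token, a common heuristic for English text and code that will not match openPangu's actual tokenizer, and reads "512K" conservatively as 512,000 tokens. Both values, and the output reserve, are assumptions to adjust.

```python
CONTEXT_TOKENS = 512_000   # conservative reading of "512K" (assumption)
CHARS_PER_TOKEN = 4        # rough heuristic, not the real tokenizer (assumption)

def estimate_tokens(text):
    """Approximate token count via ceiling division on character length."""
    return -(-len(text) // CHARS_PER_TOKEN)

def fits_in_context(documents, reserve_for_output=8_000):
    """Return (fits, tokens_used, token_budget) for a list of strings.

    reserve_for_output holds back room for the model's reply.
    """
    used = sum(estimate_tokens(d) for d in documents)
    budget = CONTEXT_TOKENS - reserve_for_output
    return used <= budget, used, budget

# Hypothetical inputs standing in for a source file and a contract.
docs = [
    "def handler(event):\n    return event\n" * 1_000,
    "clause text " * 50_000,
]
ok, used, budget = fits_in_context(docs)
print(f"estimated {used:,} of {budget:,} tokens -> {'fits' if ok else 'needs chunking'}")
```

For anything close to the limit, count with the model's own tokenizer instead of the heuristic.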
Trained on Ascend NPUs — the technical story
Huawei has been under US chip sanctions since 2020. They could not buy NVIDIA A100s, H100s, or any high-end training GPUs. Instead of abandoning large-scale AI training, they built their own hardware.
The Ascend 910B is Huawei’s current flagship AI training chip. It is manufactured on a 7nm-class process at SMIC and designed specifically for the matrix operations that dominate transformer training. The upcoming Ascend 950DT is expected to close the performance gap further.
Training a 505B parameter model requires massive parallelism — thousands of accelerators working together with high-bandwidth interconnects. Huawei had to build not just the chips but the entire training infrastructure: the interconnects, the software stack (MindSpore/CANN), the cluster management, and the optimization tooling.
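A back-of-the-envelope estimate shows why a single machine is nowhere near enough. The sketch below assumes standard mixed-precision Adam training at roughly 16 bytes of state per parameter, and 64 GB of memory per accelerator. Neither figure comes from Huawei's description of the openPangu training run, and the result is only a floor for holding model state, before activations, communication buffers, or any throughput target.

```python
import math

# Per-parameter training state under mixed-precision Adam (assumption):
# fp16 weights + fp16 gradients + fp32 master weights + Adam first and second moments.
BYTES_PER_PARAM_TRAINING = 2 + 2 + 4 + 4 + 4

def training_state_bytes(n_params):
    """Memory needed just to hold weights, gradients, and optimizer state."""
    return n_params * BYTES_PER_PARAM_TRAINING

def min_devices(n_params, device_mem_bytes):
    """Lower bound on accelerators needed to shard that state evenly."""
    return math.ceil(training_state_bytes(n_params) / device_mem_bytes)

PRO_PARAMS = 505 * 10**9
DEVICE_MEM = 64 * 10**9   # assumed per-accelerator memory, in bytes

state_tb = training_state_bytes(PRO_PARAMS) / 1e12
print(f"model state: ~{state_tb:.2f} TB")
print(f"floor on device count: {min_devices(PRO_PARAMS, DEVICE_MEM)}")
```

Under these assumptions the model state alone is about 8 TB, spread across more than a hundred accelerators at minimum. Real training clusters are far larger because the goal is finishing in weeks, not merely fitting in memory.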
The fact that they pulled this off does not mean Ascend matches NVIDIA in raw FLOPS per dollar. It means the gap is narrow enough that with enough engineering effort, you can train frontier models without NVIDIA. For a deep dive into the hardware comparison, see our Huawei Ascend vs NVIDIA analysis.
Licensing and open-source terms
openPangu 2.0 ships under the Huawei openPangu License. The key terms:
- Permissive: You can use it for commercial purposes
- Royalty-free: No per-token fees or revenue sharing
- Non-exclusive: Does not restrict your use of other models
This puts it in similar territory to MIT-licensed models like DeepSeek V4 Pro, though the specific license text may have nuances around export compliance and usage restrictions worth reading carefully for your jurisdiction.
The weights are available for download, meaning you can self-host the model if you have appropriate hardware.
How to access openPangu 2.0
There are two primary paths:
Huawei Cloud ModelArts (easiest): Huawei Cloud’s ModelArts platform offers hosted inference. You sign up for a Huawei Cloud account, navigate to the ModelArts AI Gallery, and access openPangu 2.0 through their API. This is the path of least resistance — no hardware required.
Self-hosted (for sovereignty or customization): Download the weights and run them on your own infrastructure. The Pro model will need significant compute — likely 4+ Ascend 910B NPUs or equivalent. Flash with 6B active parameters could potentially run on more modest hardware, making it the more accessible option for local deployment. Check our how to run openPangu 2.0 locally guide for specifics.
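When sizing hardware for self-hosting, total parameters drive the memory requirement, because every expert has to be loaded even though only a few run per token. The sketch below estimates weight memory at three common precisions. The 64 GB per-device figure is an assumption, and the estimate covers weights only: KV cache for long contexts and runtime overhead come on top.

```python
import math

MODELS = {"Pro": 505 * 10**9, "Flash": 92 * 10**9}       # total parameters
PRECISION_BITS = {"bf16": 16, "int8": 8, "int4": 4}      # bits per weight
DEVICE_MEM_BYTES = 64 * 10**9                            # assumed per-device memory

def weight_bytes(n_params, bits):
    """Bytes needed to store the weights at the given precision."""
    return n_params * bits // 8

def devices_needed(n_params, bits, device_mem=DEVICE_MEM_BYTES):
    """Minimum devices to hold the weights alone (no KV cache or overhead)."""
    return math.ceil(weight_bytes(n_params, bits) / device_mem)

for model, n_params in MODELS.items():
    for precision, bits in PRECISION_BITS.items():
        gb = weight_bytes(n_params, bits) / 1e9
        n = devices_needed(n_params, bits)
        print(f"{model:5s} {precision}: {gb:7.1f} GB of weights, >= {n} device(s)")
```

At 4-bit precision this puts Pro at roughly 253 GB of weights, consistent with a "4+" accelerator estimate, and Flash at about 46 GB. Whether usable 4-bit quantizations of either model exist is a separate question.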
Performance expectations
As of launch day, Huawei has shared limited public benchmark numbers. Based on the architecture (505B total, 18B active MoE with 512K context) and the caliber of Huawei’s research team, we can set reasonable expectations:
- Pro should compete with models in the 70B-dense class for quality, given 18B active parameters with access to a much larger expert pool
- Flash at 6B active will likely trade some quality for speed, targeting use cases where latency and cost matter more than peak accuracy
- 512K context puts both versions in the top tier for long-context tasks
- Coding and reasoning capabilities will be the areas to watch — these are where models like DeepSeek V4 Pro and Claude Fable 5 have set the bar
We will update this guide as independent benchmarks become available. The model just launched today, so community evaluations will take a few days to materialize.
How openPangu 2.0 compares to other frontier models
A quick positioning overview:
| Model | Total Params | Active Params | Context | License | Training Hardware |
|---|---|---|---|---|---|
| openPangu 2.0 Pro | 505B | 18B | 512K | openPangu (permissive) | Ascend NPU |
| openPangu 2.0 Flash | 92B | 6B | 512K | openPangu (permissive) | Ascend NPU |
| DeepSeek V4 Pro | 1.6T | ~200B | 128K | MIT | NVIDIA |
| Qwen 3.7 Max | ~400B+ | varies | 128K | Apache 2.0 | NVIDIA |
| Kimi K2.7 | 1T | 32B | 128K | Modified MIT | NVIDIA |
| Claude Fable 5 | undisclosed | undisclosed | 200K | Closed | NVIDIA |
The pattern is obvious. Every other model in that table was trained on NVIDIA hardware. openPangu 2.0 is the outlier, and that is exactly why it matters.
For detailed comparisons, see our head-to-head articles: openPangu 2.0 vs DeepSeek V4 Pro, openPangu 2.0 vs Qwen 3.7, and openPangu 2.0 vs Claude Fable 5.
Use cases and target audience
openPangu 2.0 is positioned for several key scenarios:
Sovereign AI deployments: Governments and organizations in sanctioned regions (or those concerned about NVIDIA supply chain risk) now have a frontier model option that does not depend on US hardware. See our sovereign AI models guide for context.
HarmonyOS ecosystem: Huawei is integrating openPangu into their HarmonyOS platform. If you are developing for HarmonyOS devices, this is the native AI backbone.
Cost-sensitive inference: Flash at 6B active parameters means cheap inference for high-volume applications. The MoE design keeps costs down while retaining access to a large knowledge base.
Long-context applications: 512K tokens is enough to process entire repositories, lengthy legal documents, or multi-hour conversation histories in a single pass.
The HarmonyOS connection
openPangu 2.0 is not just a standalone model release. It is part of Huawei’s broader ecosystem strategy. HarmonyOS — Huawei’s operating system for phones, tablets, IoT devices, and cars — gets an AI backbone that is fully Huawei-controlled from silicon to software.
This vertical integration (chips → training infrastructure → model → OS → devices) is rare in the industry. Apple has something similar with its on-device models, but at nowhere near the parameter scale. Huawei is building the only fully non-US AI stack from hardware to application layer.
What this means for developers
If you are building AI applications and currently depend on NVIDIA-trained models via API, openPangu 2.0 does not necessarily change your immediate workflow. The models you are using today are likely still better optimized for most tasks.
But here is what does matter:
- Supply chain risk is reduced. Even if you never run Pangu yourself, its existence means NVIDIA’s monopoly on training hardware is no longer absolute. This benefits everyone through competition.
- Sovereign options exist. If regulations or data residency requirements push you toward non-US infrastructure, there is now a credible frontier model available on non-US hardware.
- The Flash model is practical. At 6B active parameters, Flash is small enough to explore for production workloads where you need a permissively-licensed model with a massive context window.
- Ascend ecosystem is growing. If you are evaluating alternative hardware for AI inference, the Ascend software stack just got a major validation.
FAQ
Is openPangu 2.0 really trained entirely without NVIDIA GPUs?
Yes, according to Huawei. The company has confirmed that the entire training process used its Ascend 910B NPUs, with no NVIDIA hardware involved. That makes it the first open-weight frontier-scale model trained this way.
Can I run openPangu 2.0 on NVIDIA GPUs for inference?
Model weights are just tensors, so once exported to a standard format they are not tied to a particular accelerator. Native support targets Ascend NPUs via the MindSpore/CANN stack, but community efforts to run inference on NVIDIA GPUs (via conversion to standard formats) are expected. Flash, at 92B total and 6B active, is the most likely candidate for cross-platform inference.
How does the openPangu license compare to MIT?
The openPangu license is permissive, royalty-free, and non-exclusive — similar in spirit to MIT. The specific differences relate to attribution requirements and potential usage restrictions. Read the full license text before deploying in production.
Is openPangu 2.0 better than DeepSeek V4 Pro?
Different tools for different contexts. DeepSeek V4 Pro has roughly 200B active parameters versus Pangu Pro’s 18B, so raw capability likely favors DeepSeek for complex reasoning and coding tasks. Pangu’s advantages are sovereignty (no NVIDIA dependency), the 512K context window, and the Flash variant’s efficiency. See our detailed comparison.
What hardware do I need to run openPangu 2.0 locally?
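Pro (505B total) needs a multi-accelerator setup. All experts must be resident in memory, so the weights alone come to roughly 250 GB at 4-bit precision: think 4+ high-end GPUs or NPUs with 80GB+ each, and more at higher precision. Flash (92B total, 6B active) is much more accessible, but the 6B active figure governs compute, not memory. The full 92B weights are roughly 46 GB at 4-bit, so a single high-end consumer GPU would need aggressive quantization or CPU offloading to run it. Check our local setup guide for specifics.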
Will Huawei keep updating openPangu?
Based on the versioning (2.0 follows earlier 1B and 7B releases) and Huawei’s stated AI strategy, continued development is expected. The model is central to their HarmonyOS and cloud platform strategy, making it a long-term investment rather than a one-off release.