For years, the AI industry operated under an unquestioned assumption: if you want to train a frontier model, you need NVIDIA GPUs. This was not just conventional wisdom; it was practical reality. Outside Google’s TPU-trained models, nearly every major model from GPT-4 to DeepSeek V4 Pro to Claude Fable 5 trained on NVIDIA silicon. For anyone buying their own hardware, the CUDA ecosystem had no real alternative for large-scale AI training.
Then on June 12, 2026, Huawei released openPangu 2.0 — a 505 billion parameter Mixture-of-Experts model trained entirely on their Ascend 910B NPUs. Not a single NVIDIA chip involved. Not a hybrid approach. Not a small experiment. A full frontier-scale model with a 512K token context window, released open-source.
The answer to “can you train AI without NVIDIA?” is now definitively yes. The more interesting question is: what does this mean for the industry?
The NVIDIA dependency problem
NVIDIA controls approximately 80-90% of the AI training hardware market. This concentration creates several risks:
Supply constraints: Demand for H100 and B200 chips consistently exceeds supply. Wait times of 6-12 months for large orders are common. If you are not a hyperscaler with pre-negotiated allocations, getting NVIDIA hardware at scale is difficult.
Pricing power: With a near-monopoly position, NVIDIA sets prices. H100 SXM cards cost $30,000-40,000 each, and cloud pricing reflects this. Competition would drive prices down, but until now alternatives were not viable at scale.
Geopolitical risk: US export controls restrict NVIDIA chip sales to China, Russia, and other sanctioned markets. Even for non-sanctioned countries, dependency on a single US company’s hardware creates strategic vulnerability.
Single point of failure: If NVIDIA’s architecture has a flaw, if their supply chain is disrupted, if their pricing becomes prohibitive — the entire AI industry is affected. Concentration risk at this level is unhealthy for any industry.
For developers, this manifests as high cloud GPU costs, long wait times for hardware, and the nagging awareness that your entire technology stack depends on one company’s continued cooperation. Our guide to best cloud GPU providers 2026 shows the pricing landscape this monopoly creates.
Huawei’s proof: openPangu 2.0
openPangu 2.0 is the most significant challenge to NVIDIA’s training monopoly to date. Here is why:
Scale matters. People have trained small models on alternative hardware before. But a 505B parameter MoE model with 512K context? That requires solving distributed training at massive scale — thousands of accelerators coordinating with high-bandwidth interconnects. Huawei demonstrated this works on Ascend.
End-to-end on one platform. This is not a hybrid training run where some stages used NVIDIA. The entire training process — from random initialization to final weights — ran on Ascend 910B NPUs. No shortcuts.
Open-source release. By releasing the weights openly, Huawei invites scrutiny. The model works. People can run it, evaluate it, and verify that a non-NVIDIA training path produces competitive results.
Production-ready. The model is available via Huawei Cloud ModelArts for commercial use. This is not a research demo — it is a deployable system.
For the full technical breakdown of the model, see our openPangu 2.0 complete guide. For the hardware comparison, see Huawei Ascend vs NVIDIA.
Google TPUs: the other non-NVIDIA path
Huawei is not actually the first to train frontier models without NVIDIA. Google has been doing it for years — they just do not sell the hardware.
Google TPU training track record:
- PaLM (540B parameters, 2022): TPU v4
- Gemini 1.5 and later: TPU v5p
- Every major Google model since 2020: TPU-trained
The difference: Google TPUs are cloud-only and Google-exclusive. You cannot buy TPU hardware. You cannot run them in your own datacenter. They exist solely to serve Google Cloud customers and Google’s own models.
Huawei’s approach is different. Ascend hardware is available for purchase. openPangu’s weights are downloadable. The entire stack is accessible to organizations beyond just Huawei. This makes the sovereignty proposition real in a way that Google TPUs do not offer.
TPU vs Ascend for the industry:
- TPUs prove non-NVIDIA training works (they have been proving it since 2017)
- But TPUs do not solve the dependency problem — they just shift it from NVIDIA to Google
- Ascend offers actual hardware independence (you own the chips)
- openPangu offers actual model independence (you own the weights)
AMD: the CUDA-compatible alternative
AMD’s Instinct MI300X is the most directly competitive NVIDIA alternative for most developers:
- 192GB HBM3 per chip (more than double the H100’s 80GB)
- ROCm software stack (its HIP layer offers a CUDA-like, source-portable API)
- Available on the open market without export restrictions
- Supported by PyTorch and major frameworks
But AMD has not proven frontier model training at openPangu scale. Companies use MI300X for inference and some training, but no 500B+ parameter model has been publicly trained exclusively on AMD hardware.
AMD’s advantage over Ascend for Western developers: ROCm’s HIP layer provides source-level portability for most CUDA code. Tools such as hipify translate CUDA sources, and PyTorch’s ROCm builds reuse the torch.cuda API, so existing code largely works on AMD with minimal changes. Ascend requires more adaptation via CANN or torch_npu.
AMD’s disadvantage: they do not control the full stack. They make GPUs but rely on TSMC for fabrication and community efforts for the software ecosystem. Huawei controls everything from chip design to framework to model.
Custom silicon: Groq, Cerebras, and others
The AI accelerator startup ecosystem is growing:
Cerebras:
- Wafer-scale engine (entire silicon wafer = one chip)
- Proven training up to moderate scale
- Unique architecture particularly suited to sparse models
- Limited availability
Groq:
- LPU architecture optimized for inference
- Extremely fast token generation
- Not primarily a training chip
- Growing cloud availability
SambaNova, Graphcore, Tenstorrent:
- Various approaches to AI acceleration
- Smaller scale deployments
- Growing ecosystems
None of these have demonstrated frontier-model training at openPangu’s scale. They serve important niches (inference acceleration, specialized workloads) but are not yet viable alternatives for training 500B+ parameter models.
What openPangu 2.0 means for NVIDIA
NVIDIA is not in trouble. Let us be clear about that. Their technology lead (B200, Rubin architecture), software moat (CUDA), and market position are strong. But openPangu 2.0 changes the narrative:
Before openPangu 2.0: “You need NVIDIA for frontier AI training. Period.”
After openPangu 2.0: “NVIDIA is the best option for frontier AI training. But alternatives exist.”
That shift from “only option” to “best option” matters enormously:
- It gives customers negotiating leverage on pricing
- It validates investment in alternative platforms
- It reduces the systemic risk of single-vendor dependency
- It opens doors for sovereign AI initiatives globally
For developers evaluating hardware choices, this expands your option set. Check our GPU vs CPU for AI inference guide for the inference side of the equation, and the NVIDIA RTX Spark guide for consumer-grade NVIDIA options.
The quality question
The skeptic’s response is predictable: “Sure, you can train on Ascend, but is the model any good?”
Fair question. openPangu 2.0 Pro has 18B active parameters, fewer than DeepSeek V4 Pro’s ~200B active. The model is unlikely to match the peak quality of the largest NVIDIA-trained models.
But this misses the point. The question was never “can Ascend produce the single best model?” It was “can Ascend produce a model good enough to be useful?” And given that 18B-active MoE models are genuinely capable for a wide range of tasks, the answer is clearly yes.
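To make the active-versus-total distinction concrete, here is a small arithmetic sketch in Python. It assumes the 505B total quoted earlier applies to the Pro variant, and it uses the common rule of thumb of roughly 2 FLOPs per active parameter for a forward pass, which is an approximation rather than a figure published for openPangu. The function names are illustrative.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's weights that participate in each token."""
    return active_params / total_params


def forward_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per active parameter."""
    return 2 * active_params


# Figures quoted in this article (assumed to describe the Pro variant).
TOTAL_PARAMS = 505e9
ACTIVE_PARAMS = 18e9

share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
flops = forward_flops_per_token(ACTIVE_PARAMS)

print(f"active share of weights: {share:.1%}")
print(f"approx. forward FLOPs per token: {flops:.2e}")
```

Per-token compute tracks the active parameters, while memory footprint tracks the full parameter count, since every expert has to stay loaded.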
Moreover, hardware improves. Ascend 910B is roughly A100-class. The upcoming 950DT targets H100 territory. Give it another generation and the quality gap that exists (if it exists) closes further.
The first airplane was not as good as a train. It still changed transportation forever.
Implications for sovereign AI strategies
Countries and organizations now have a concrete blueprint for NVIDIA-independent AI:
- Acquire Ascend hardware (available without US export restrictions in most markets)
- Deploy openPangu 2.0 as baseline frontier model (open-source, free weights)
- Fine-tune for local needs (language, domain, regulations)
- Build on Huawei Cloud or on-premise infrastructure
- Iterate independently of US technology supply chain
This is not theoretical. It is a practical path that organizations can execute today. See our sovereign AI models 2026 guide for the broader strategy.
For European organizations concerned about digital sovereignty and GDPR, an Ascend+openPangu stack running in EU datacenters provides a fully non-US AI infrastructure option. Our AI GDPR developers guide covers the compliance angle.
The training efficiency question
One important caveat: training on Ascend may require more chips and more time than equivalent NVIDIA hardware. Ascend 910B’s memory bandwidth (~1.6 TB/s) is lower than H100’s (3.35 TB/s). The interconnect bandwidth may be lower. This means:
- Training runs take longer (more wall-clock time at the same chip count, or the same time with more chips)
- Larger clusters needed for equivalent throughput
- Higher total energy consumption for the same training run
- More engineering effort for optimization
Huawei can absorb this because they manufacture the chips. For third parties, the efficiency gap means higher total training cost compared to NVIDIA. But for inference (where most cost actually lives), the efficiency gap narrows because active parameter count determines throughput more than total chip performance.
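As a back-of-the-envelope illustration of that gap, the sketch below scales a cluster by the memory-bandwidth ratio quoted above. It assumes memory bandwidth is the only bottleneck, which real training runs do not satisfy (compute, interconnect, and software efficiency all matter), so treat the output as a rough illustration and not a sizing tool. The function name and the 1,000-chip baseline are hypothetical.

```python
import math

# Memory bandwidth figures quoted in this article, in TB/s.
H100_BANDWIDTH = 3.35
ASCEND_910B_BANDWIDTH = 1.6


def chips_for_equal_bandwidth(baseline_chips: int,
                              baseline_bw: float,
                              alt_bw: float) -> int:
    """Chips needed on the alternative platform to match the baseline
    cluster's aggregate memory bandwidth (bandwidth-only assumption)."""
    return math.ceil(baseline_chips * baseline_bw / alt_bw)


ratio = H100_BANDWIDTH / ASCEND_910B_BANDWIDTH
needed = chips_for_equal_bandwidth(1000, H100_BANDWIDTH, ASCEND_910B_BANDWIDTH)

print(f"per-chip bandwidth ratio: {ratio:.2f}x")
print(f"Ascend chips to match 1000 H100s on bandwidth alone: {needed}")
```

On bandwidth alone the ratio is a little over 2x. Estimates based on peak compute rather than bandwidth can come out higher.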
Where we go from here
The AI hardware landscape in 2026 and beyond:
NVIDIA: Still dominant, still the default, still the best pure performance option. But facing actual competition for the first time at frontier training scale.
Huawei/Ascend: Proven at frontier scale. Growing ecosystem. Hardware-independent path validated. Will improve with each chip generation.
Google TPU: Proven for years but cloud-only and Google-exclusive. Not a solution for sovereign AI.
AMD: Strong inference option, growing training capability. ROCm compatibility makes migration easier. Needs a flagship training proof point at 500B+ scale.
Others: Maturing slowly. Niche applications today, potentially broader tomorrow.
The healthy outcome is a genuinely multi-vendor market where no single company can bottleneck the entire AI industry. openPangu 2.0 is a significant step toward that outcome.
For developers: what to do with this information
Immediate actions:
- Continue using NVIDIA if it is working for you — nothing changes overnight
- Try openPangu 2.0 via API (ModelArts) to evaluate quality for your use cases
- Consider the Flash variant of openPangu 2.0 for high-volume workloads where 6B active parameters are sufficient
Medium-term planning:
- Factor multi-vendor hardware into infrastructure roadmaps
- Evaluate whether sovereignty requirements apply to your organization
- Watch openPangu benchmark results as community evaluations arrive
- Consider Ascend hardware for new infrastructure investments if NVIDIA supply is constrained
Long-term thinking:
- The monopoly is cracking — plan for a multi-hardware world
- Open-source models on alternative hardware will keep improving
- Competition benefits everyone — expect lower prices and more options
- Build abstractions that do not lock you to one hardware vendor
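One way to approach the last point is to keep device selection in a single place. The sketch below is a minimal Python example under a few assumptions: that PyTorch's ROCm builds report AMD GPUs through torch.cuda, that importing torch_npu registers an "npu" device with a torch.npu.is_available() check, and that the preference order shown suits your workload. The function names are hypothetical, and the code falls back to CPU when PyTorch is not installed.

```python
import importlib.util

# Hypothetical default: try CUDA/ROCm first, then Ascend, then CPU.
DEFAULT_PREFERENCE = ("cuda", "npu", "cpu")


def detect_backends() -> set:
    """Probe which device types are usable in this environment."""
    available = {"cpu"}
    if importlib.util.find_spec("torch") is None:
        return available
    import torch

    # ROCm builds of PyTorch also answer through torch.cuda.
    if torch.cuda.is_available():
        available.add("cuda")

    # Assumption: importing torch_npu registers the "npu" device type.
    if importlib.util.find_spec("torch_npu") is not None:
        try:
            import torch_npu  # noqa: F401
            npu = getattr(torch, "npu", None)
            if npu is not None and npu.is_available():
                available.add("npu")
        except Exception:
            pass  # Ascend runtime missing or misconfigured
    return available


def pick_device(available, preference=DEFAULT_PREFERENCE) -> str:
    """Return the first preferred device type that is available."""
    for name in preference:
        if name in available:
            return name
    raise RuntimeError("no usable device in preference list")


if __name__ == "__main__":
    device = pick_device(detect_backends())
    print(f"selected device type: {device}")
    # Model code then uses torch.device(device) instead of a hard-coded "cuda".
```

Keeping the vendor-specific probing in one function means the rest of the codebase only ever sees a device string, so adding or reordering backends is a one-line change.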
For self-hosting considerations, see our guide on when to switch from API to self-hosted and the vLLM vs Ollama vs llama.cpp comparison.
FAQ
Did Huawei really train openPangu 2.0 with zero NVIDIA hardware?
Yes. Huawei has explicitly stated that the entire training process used their Ascend 910B NPUs. Given that Huawei is under US sanctions and cannot purchase NVIDIA hardware, and given that they manufacture their own AI chips, this claim is both technically plausible and strategically consistent.
Could other companies replicate this on Ascend hardware?
In principle, yes. Ascend hardware is available for purchase in non-sanctioned markets, and Huawei Cloud offers Ascend training clusters. However, replicating a 505B parameter training run requires significant expertise in distributed training on the CANN/MindSpore stack. It is not as turnkey as NVIDIA+PyTorch today.
Is this the end of NVIDIA’s dominance?
No. NVIDIA still has the best hardware, best software ecosystem, and broadest adoption. openPangu 2.0 proves alternatives work — it does not prove they are better or even equal. NVIDIA’s dominance is cracking but far from broken. The shift is from monopoly to oligopoly, not from dominance to irrelevance.
How does training efficiency on Ascend compare to NVIDIA?
Lower per chip, compensated by cluster scale. Ascend 910B is roughly A100-equivalent, so it can take approximately three times as many Ascend chips to match H100 throughput for the same training run. Huawei compensates with larger clusters and vertical optimization of the entire stack.
Will this affect GPU prices?
Eventually, yes. Competition drives prices down. If Ascend becomes a credible alternative for training, NVIDIA faces pricing pressure. This benefits everyone — even developers who continue using NVIDIA hardware. The effect will be gradual rather than immediate.
Should I learn the Ascend/CANN ecosystem?
If you work in China, with Chinese enterprises, or in sovereign AI markets — yes, it is increasingly relevant. If you are a Western developer with reliable NVIDIA access and no sovereignty requirements — it is interesting to monitor but not urgent to learn. The torch_npu bridge means much of your PyTorch knowledge transfers.