May 8, 2026 · 5 min read

Last updated on Apr 19, 2026

What is Falcon? TII's Open-Source AI Model from the UAE

Falcon is a family of open-source language models built by the Technology Innovation Institute (TII) in Abu Dhabi, UAE. It was one of the first non-US models to compete with GPT-3.5 on benchmarks and has evolved into a multi-model family covering text, code, vision, and reasoning.

The Falcon model family

Model	Parameters	Type	Best for	License
Falcon 2 11B	11B	Text	General purpose, 11 languages, trained on 5T tokens	Apache 2.0
Falcon H1R 7B	7B	Hybrid (SSM + attention)	Reasoning, math, coding	Apache 2.0
Falcon Perception	600M	Vision	Object detection, segmentation	Apache 2.0
Falcon OCR	300M	Vision	Text extraction from images	Apache 2.0
Falcon 40B	40B	Text	High-quality generation	Apache 2.0
Falcon 180B	180B	Text	Frontier quality (needs GPU cluster)	Custom

Falcon H1R 7B: the new star

The latest Falcon release is H1R-7B, a hybrid model that combines State Space Model (SSM) layers with traditional attention. At just 7B parameters, it outperforms models of up to 47B parameters on reasoning benchmarks: it scored 88.1% on AIME-24 (math), beating Microsoft Phi-4 Reasoning Plus 14B, Alibaba Qwen3 32B, and NVIDIA Nemotron H 47B. It processes up to 1,500 tokens per second per GPU at batch size 64.

Why it matters: Most small models (7-9B) are mediocre at reasoning. Falcon H1R-7B proves that architecture innovation (hybrid SSM + attention) can beat raw parameter count. It’s a direct competitor to Qwen3 8B and Yi-Coder 9B.

Falcon vs other open models

Model	Params	Reasoning	Coding	Multilingual	License
Falcon H1R 7B	7B	✅ 88.1% AIME-24	Good	Good	Apache 2.0
Falcon 2 11B	11B	Good	Good	✅ Strong	Apache 2.0
Qwen3 8B	8B	Good	Good	✅ Strong	Apache 2.0
Yi-Coder 9B	9B	Decent	✅ Strong	Good	Apache 2.0
DeepSeek R1 14B	14B	✅ Best	Good	Good	MIT
Gemma 4 9B	9B	Good	Good	Good	Custom

Falcon H1R-7B’s hybrid architecture gives it a reasoning edge over other 7-9B models. For pure coding, Yi-Coder 9B is still better. For deep reasoning, DeepSeek R1 14B wins but needs more RAM.

The UAE AI ecosystem

Falcon is part of a broader UAE investment in AI sovereignty:

Falcon (TII) — general-purpose open models
Jais (G42/MBZUAI) — Arabic-specialized models
G42 — AI infrastructure and cloud
MBZUAI — AI research university

Across these organizations, the UAE has invested billions in building an independent AI ecosystem. For developers, this means more high-quality open models with permissive licenses.

How to run Falcon locally

# Install Ollama (macOS, via Homebrew)
brew install ollama

# Falcon 2 (11B, general purpose)
ollama pull falcon2

# Falcon 40B (needs 32GB+ RAM)
ollama pull falcon:40b

# Test
ollama run falcon2 "Explain microservices architecture"
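
Once a model is pulled, Ollama also serves it over a local HTTP API, which is handy for scripting. The following is a minimal sketch, assuming the Ollama server is running on its default port (11434) and `falcon2` has already been pulled. `build_payload` and `ask_falcon` are hypothetical helper names introduced here for illustration, and the payload builder does not escape quotes or backslashes in the prompt.

```shell
# Build the JSON body for Ollama's POST /api/generate endpoint.
# Hypothetical helper: it does NOT escape quotes or backslashes in the prompt.
build_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Send a prompt to a locally running Ollama server (assumed default port 11434).
ask_falcon() {
  curl -s http://localhost:11434/api/generate -d "$(build_payload falcon2 "$1")"
}

# Usage (requires the Ollama server to be running):
#   ask_falcon "Explain microservices architecture"
```

With `"stream": false` the server returns a single JSON object rather than a stream of chunks, which is usually easier to handle in a script.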

Hardware requirements

Model	RAM needed	Performance
Falcon H1R 7B	6 GB	~25 tok/s on M2
Falcon 2 11B	8 GB	~20 tok/s on M2
Falcon 40B	32 GB	~10 tok/s on M3 Pro
Falcon 180B	128 GB+	Needs GPU cluster
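
A rough way to sanity-check RAM figures like these is to estimate the size of the weights alone: parameters times bits per weight, divided by 8. The sketch below assumes 4-bit quantized builds, which is an assumption on our part and not something the table states. `weights_mb` is a hypothetical helper name.

```shell
# Approximate size of the model weights in MB.
#   $1 = parameter count in billions
#   $2 = bits per weight (assumption: 4 for a typical quantized build)
weights_mb() {
  echo $(( $1 * 1000 * $2 / 8 ))
}

# Estimate the weight footprint for each Falcon size in the table.
for params in 7 11 40 180; do
  echo "${params}B at 4-bit: ~$(weights_mb "$params" 4) MB of weights"
done
```

Under that assumption the weights come to roughly 3.5 GB, 5.5 GB, 20 GB, and 90 GB. The RAM figures in the table sit above those numbers, which is plausible given that the context cache, the runtime, and the operating system also need memory.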

With coding tools

# Aider
aider --model ollama/falcon2

# Continue.dev - add to config.json
# { "models": [{ "provider": "ollama", "model": "falcon2" }] }

Who should use Falcon

Multilingual projects — Falcon 2 was trained on diverse multilingual data across 11 languages
Reasoning tasks on budget hardware — Falcon H1R-7B at 7B beats models up to 47B including Microsoft Phi-4 Reasoning Plus 14B, Alibaba Qwen3 32B, and NVIDIA Nemotron H 47B
UAE/Middle East deployment — local ecosystem, cultural alignment
Apache 2.0 needed — fully commercial, no restrictions

For coding specifically, Yi-Coder 9B or Qwen3 8B are better choices. Falcon’s strength is reasoning and multilingual capability.

The Falcon H1 architecture explained

What makes Falcon H1R special is its hybrid architecture. Traditional transformers use attention mechanisms whose cost scales quadratically with sequence length, so doubling the context roughly quadruples the attention compute. Falcon H1R combines:

Transformer attention layers — precise token-level reasoning, good at understanding relationships between specific words/tokens.

Mamba (State Space Model) layers — efficient sequential processing with linear scaling. The Mamba component processes sequences in constant memory per token, regardless of length.

The result:

256K context window — 32x larger than standard Falcon 2’s 8K
1,500 tokens/second per GPU at batch size 64 — nearly 2x the throughput of Qwen3-8B
88.1% on AIME-24 — a math benchmark where it beats models with 7x more parameters
Linear memory scaling in the Mamba layers (the 200,000th token costs the same to process as the 1st)

This hybrid approach is similar to what Qwen 3.6 Plus does at a much larger scale (hybrid linear attention + MoE). Falcon H1R proves the concept works at 7B parameters too.
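
To make the quadratic-versus-linear difference concrete, here is a small arithmetic sketch comparing an 8K context with a 256K context. The cost functions are unit-less operation counts chosen for illustration, not measurements of any Falcon model, and a real hybrid model mixes both layer types.

```shell
# Relative cost of full self-attention over n tokens (grows as n^2).
attention_cost() { echo $(( $1 * $1 )); }

# Relative cost of a linear-time layer over n tokens (grows as n).
linear_cost() { echo "$1"; }

short=8192      # 8K context
long=262144     # 256K context

echo "context ratio:   $(( long / short ))x"
echo "attention ratio: $(( $(attention_cost $long) / $(attention_cost $short) ))x"
echo "linear ratio:    $(( $(linear_cost $long) / $(linear_cost $short) ))x"
```

Going from 8K to 256K is a 32x longer context. In this toy model, a pure attention layer costs 1,024x more, while a linear-time layer costs 32x more.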

Falcon’s evolution

Version	Year	Parameters	Key achievement
Falcon 7B/40B	2023	7B/40B	First UAE open model, topped HuggingFace leaderboard
Falcon 180B	2023	180B	Largest open model at the time
Falcon 2 11B	2024	11B	5T tokens, 11 languages, VLM variant
Falcon Mamba 7B	2024	7B	First pure Mamba model from TII
Falcon H1R 7B	2026	7B	Hybrid architecture, beats 47B models
Falcon Perception	2026	600M	Vision model for object detection
Falcon OCR	2026	300M	Text extraction from images

TII has consistently pushed boundaries: first to release a 180B open model, first to release a production Mamba model, and now first to demonstrate hybrid SSM+attention beating models 7x larger.

FAQ

Which Falcon model should I use for coding?

For coding tasks, Falcon H1R-7B is the best choice in the Falcon family due to its strong reasoning capabilities. However, for pure coding quality, Yi-Coder 9B or Qwen3 8B are better specialized alternatives. Falcon’s real strength is reasoning and multilingual tasks rather than code generation specifically.

How does Falcon H1R-7B beat models 7x its size?

The hybrid SSM + attention architecture is the key. Traditional transformers scale quadratically with sequence length, but Falcon H1R combines Mamba (State Space Model) layers for efficient sequential processing with transformer attention layers for precise reasoning. This architectural innovation lets it achieve 88.1% on AIME-24 math benchmarks, beating models up to 47B parameters.

Is Falcon related to Jais?

Both are UAE-funded AI models but from different organizations. Falcon is built by the Technology Innovation Institute (TII) in Abu Dhabi and focuses on general-purpose multilingual AI. Jais is built by G42/MBZUAI and specializes in Arabic. They’re complementary parts of the UAE’s broader AI sovereignty strategy.
