
InclusionAI Ring 1T — The Thinking Model Behind Ling (2026)


InclusionAI does not put thinking and coding in the same model. While competitors like DeepSeek build thinking modes into their generation models, InclusionAI splits the job in two: Ring 1T handles reasoning and planning, and Ling 2.6 handles code generation and execution. Two models, each with a trillion parameters, each optimized for its role.

Ring 1T is the thinking half of this pair. It is a trillion-parameter Mixture-of-Experts model trained specifically for extended reasoning — the kind of step-by-step deliberation that competition math, complex debugging, and architectural planning require. It does not generate code directly. It thinks about problems, breaks them into steps, evaluates trade-offs, and produces structured plans that Ling 2.6 can execute.

This separation is unusual in 2026, where most AI labs are building unified models that both think and act. InclusionAI’s bet is that specialization produces better results than generalization — and the benchmarks suggest they may be right.

For the coding side of the pair, see What Is InclusionAI Ling. For the full Ling 2.6 technical breakdown, see the InclusionAI Ling 2.6 complete guide.

What Ring 1T actually is

Ring 1T is a reasoning-first language model. Its architecture mirrors Ling 2.6 — trillion-parameter MoE with roughly 70B active parameters per token — but its training is fundamentally different.

Where Ling 2.6 was trained with code-specific reinforcement learning to optimize for code generation quality, Ring 1T was trained with reasoning-specific RL to optimize for:

  • Chain-of-thought quality — Producing coherent, step-by-step reasoning traces
  • Self-correction — Detecting and fixing errors in its own reasoning
  • Planning — Breaking complex problems into ordered sub-tasks
  • Evaluation — Assessing multiple approaches and selecting the best one
  • Mathematical reasoning — Solving competition-level math problems

Ring 1T does not try to be a good code generator. It tries to be a good thinker. The code generation is Ling’s job.

Specifications

Spec | Ring 1T
Total parameters | ~1T
Active parameters | ~70B
Architecture | MoE (Transformer)
MoE experts | 128 total, 8 active
Context window | 128K tokens
Training framework | AReaL (RL-based, reasoning-optimized)
Primary function | Reasoning, planning, evaluation
License | Apache 2.0
API availability | InclusionAI API, OpenRouter
Release date | April 2026

The AReaL framework

Both Ring 1T and Ling 2.6 are trained using InclusionAI’s AReaL (Automated Reinforcement Learning) framework. AReaL treats the entire model training process as a reinforcement learning problem, where the model learns to maximize a reward signal rather than simply predicting the next token.

For Ring 1T, the reward signal is reasoning quality:

  1. Correctness — Does the reasoning lead to the right answer?
  2. Coherence — Are the reasoning steps logically connected?
  3. Efficiency — Does the model reach the answer without unnecessary steps?
  4. Self-correction — Does the model detect and fix its own mistakes?
  5. Completeness — Does the reasoning cover all relevant cases?

The AReaL framework uses a learned reward model (itself a large language model) to evaluate Ring’s reasoning traces during training. This reward model was trained on human-annotated reasoning examples — mathematicians, programmers, and domain experts who rated reasoning traces for quality.

This is different from standard RLHF (Reinforcement Learning from Human Feedback), which typically optimizes for human preference on final outputs. AReaL optimizes for the quality of the reasoning process itself, not just the final answer. The theory is that a model that reasons well will produce good answers as a byproduct, while a model that produces good answers without good reasoning will fail on novel problems.
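The five criteria above can be pictured as a composite scoring function. The sketch below is purely illustrative: the criterion names come from this article, but the weighting scheme, the function, and the example scores are assumptions, not InclusionAI's actual reward code.

```python
# Hypothetical sketch of collapsing AReaL's five reasoning criteria into a
# scalar reward. Weights and scores are illustrative assumptions.

CRITERIA = ("correctness", "coherence", "efficiency", "self_correction", "completeness")

def reasoning_reward(scores: dict, weights: dict = None) -> float:
    """Combine per-criterion scores (each in [0, 1]) into one reward value."""
    if weights is None:
        weights = {c: 1.0 for c in CRITERIA}  # equal weighting by default
    total = sum(weights[c] for c in CRITERIA)
    return sum(weights[c] * scores[c] for c in CRITERIA) / total

trace_scores = {
    "correctness": 1.0,      # right final answer
    "coherence": 0.8,        # mostly well-connected steps
    "efficiency": 0.6,       # some redundant steps
    "self_correction": 1.0,  # caught its own arithmetic slip
    "completeness": 0.7,     # missed one edge case
}
print(round(reasoning_reward(trace_scores), 2))  # equal weights -> mean: 0.82
```

In practice a learned reward model produces these per-criterion scores; the point of the sketch is only that the training signal grades the reasoning trace itself, not just the final answer.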

How Ring 1T relates to Ling 2.6

The Ring-Ling pair works like a think-then-execute pipeline:

User request → Ring 1T (think) → Structured plan → Ling 2.6 (execute) → Code output

Step 1: Ring thinks

Ring 1T receives the user’s request and produces a reasoning trace. For a coding task like “refactor this authentication module to support OAuth2,” Ring might produce:

  1. Analyze the current authentication module structure
  2. Identify the OAuth2 flow requirements (authorization code, token exchange, refresh)
  3. Design the new module interface (what functions to add, what to modify, what to remove)
  4. Plan the migration path (backward compatibility, deprecation of old endpoints)
  5. Identify test cases needed (happy path, token expiry, invalid grants, CSRF protection)

This is not code — it is a structured plan that describes what needs to happen and in what order.

Step 2: Ling executes

Ling 2.6 receives Ring’s plan along with the original code context and generates the actual code changes. Because Ling was trained specifically for code generation, it produces clean, idiomatic code that follows the plan’s structure.

Why separate models?

The separation has several advantages:

  • Specialization — Each model is optimized for its specific task. Ring does not waste capacity on code syntax; Ling does not waste capacity on extended reasoning.
  • Independent scaling — You can use Ring for planning and a smaller Ling variant (like Ling Flash) for execution, reducing compute costs.
  • Debugging — When something goes wrong, you can inspect Ring’s reasoning trace separately from Ling’s code output. This makes it easier to identify whether the problem was in the plan or the execution.
  • Flexibility — You can use Ring with non-Ling models, or use Ling without Ring for simple tasks that do not need extended reasoning.

The disadvantage is complexity. Two API calls instead of one. Two models to manage. Higher total latency because the thinking step adds time before code generation begins. For simple coding tasks, the Ring step is unnecessary overhead.

Ring 1T for coding agents

The Ring-Ling separation is particularly valuable for coding agents — automated systems that navigate codebases, fix bugs, and implement features with minimal human intervention.

The agent architecture

A coding agent built on Ring + Ling might work like this:

  1. User submits a bug report — “Login fails when the session cookie expires during a long-running request”
  2. Ring 1T analyzes the problem — Reasons about what could cause this, identifies likely code paths, plans an investigation strategy
  3. Ring produces a search plan — “Look for session validation in middleware, check cookie expiry handling, find the request timeout configuration”
  4. Ling 2.6 executes searches — Generates the actual search queries and reads the results
  5. Ring evaluates findings — Analyzes the code found, identifies the root cause, plans a fix
  6. Ling generates the fix — Writes the code patch based on Ring’s plan
  7. Ring reviews the fix — Checks the patch for correctness, completeness, and potential regressions
  8. Ling generates tests — Writes test cases based on Ring’s review

This alternating think-execute pattern is more robust than a single model trying to do everything. Ring’s reasoning catches issues that a pure code generation model would miss, and Ling’s code generation is cleaner because it follows a well-structured plan.

Comparison with single-model agents

Most coding agents in 2026 use a single model for both reasoning and code generation — Claude Opus 4.7, GPT-5.5, DeepSeek V4 Pro with thinking mode. These models are good at both tasks, but they make trade-offs:

  • Context competition — Reasoning tokens and code tokens compete for the same context window. Extended reasoning reduces the space available for code context.
  • Mode switching — The model must switch between reasoning mode and code generation mode, which can produce inconsistent outputs.
  • Optimization trade-offs — Training for reasoning quality can reduce code generation quality, and vice versa.

Ring + Ling avoids these trade-offs by giving each task its own model with its own context window. Ring can reason for thousands of tokens without reducing Ling’s code context. Each model is optimized for its specific task without compromise.

The cost is latency and complexity. A single-model agent produces output in one pass. Ring + Ling requires at least two passes, with the orchestration layer adding overhead.

Benchmark performance

Ring 1T’s benchmarks focus on reasoning tasks:

Benchmark | Ring 1T | Notes
MATH (competition) | ~90–94 | Near state-of-the-art
GSM8K (8-shot) | ~97–99 | Near ceiling
GPQA (graduate-level) | ~55–60 | Strong for open-source
ARC-Challenge | ~95–97 | Near ceiling
BBH (Big-Bench Hard) | ~88–92 | Strong multi-step reasoning
Planning benchmarks | ~75–80 | Above most open-source models

Ring 1T’s MATH score (~90–94) is significantly higher than Ling 2.6’s (~82–85), demonstrating the value of reasoning-specific training. On tasks that require extended chain-of-thought reasoning, Ring outperforms Ling by a wide margin.

On coding benchmarks (HumanEval, EvalPlus), Ring 1T performs worse than Ling 2.6 — it was not trained for code generation. This is by design. You would not use Ring to generate code directly; you would use it to plan and then hand off to Ling.

When to use Ring 1T

Use Ring when the task requires deliberation

  • Complex debugging — The bug is not obvious from the code. Ring can reason about execution paths, state mutations, and timing issues.
  • Architecture design — Evaluating trade-offs between approaches (microservices vs monolith, SQL vs NoSQL, sync vs async).
  • Migration planning — Planning a framework migration, database migration, or API version upgrade.
  • Code review — Deep review that goes beyond style and syntax to evaluate design decisions and potential failure modes.
  • Competition math — Ring’s ~90–94 MATH score makes it one of the strongest open-source reasoning models.

Skip Ring when the task is straightforward

  • Simple code generation — Writing a function, implementing an endpoint, creating a component. Ling handles these without needing a plan.
  • Code completion — Autocomplete and inline suggestions. Ring’s latency makes it unsuitable for real-time completion.
  • Formatting and style — Linting, formatting, and style fixes do not need reasoning.
  • Boilerplate — Generating standard patterns (CRUD endpoints, form components, test scaffolding).

The rule of thumb: if you can describe the task in one sentence and the solution is a single code block, skip Ring. If the task requires understanding context, evaluating options, or planning multiple steps, use Ring.
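That rule of thumb can be encoded as a routing heuristic in an orchestration layer. The keyword list below is an illustrative assumption, not a production classifier; a real router might use a small model or task-length signals instead.

```python
# Toy router for the rule of thumb above: skip Ring for one-shot tasks,
# invoke it when the request implies multi-step planning. The keyword
# heuristic is an illustrative assumption.

PLANNING_SIGNALS = ("refactor", "migrate", "debug", "architecture", "design", "review")

def needs_ring(task: str) -> bool:
    """Return True if the task likely benefits from a Ring planning pass."""
    lowered = task.lower()
    return any(word in lowered for word in PLANNING_SIGNALS)

print(needs_ring("Write a function that parses ISO-8601 dates"))  # False
print(needs_ring("Refactor the auth module to support OAuth2"))   # True
```

Even a crude router like this captures the economics: most everyday requests go straight to Ling, and the Ring latency cost is paid only where planning is likely to change the outcome.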

Running Ring 1T

API access

Ring 1T is available through InclusionAI’s API and OpenRouter. The API follows the standard chat completion format:

import openai

client = openai.OpenAI(
    base_url="https://api.inclusionai.com/v1",
    api_key="your-api-key"
)

# Step 1: Ring thinks
reasoning = client.chat.completions.create(
    model="ring-1t",
    messages=[{
        "role": "user",
        "content": "Plan a refactoring of this authentication module to support OAuth2: [code context]"
    }]
)

plan = reasoning.choices[0].message.content

# Step 2: Ling executes
code = client.chat.completions.create(
    model="ling-2.6",
    messages=[{
        "role": "system",
        "content": f"Execute this plan:\n{plan}"
    }, {
        "role": "user",
        "content": "[original code context]"
    }]
)

Local deployment

Ring 1T is a trillion-parameter model. It does not run on consumer hardware. You need:

  • A multi-GPU server (8× A100 80GB or equivalent) or a cloud instance with 640+ GB of total GPU memory
  • A specialized inference framework (vLLM, TensorRT-LLM)

For local use, there is no Ring equivalent of Ling Flash — no distilled small reasoning model from InclusionAI (yet). If you need local reasoning, consider other open-source reasoning models or use Ring through the API.
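The memory requirement follows from simple arithmetic. The sketch below estimates weight memory alone at common precisions; it ignores KV cache, activations, and framework overhead, so real requirements are higher. Note that the ~640 GB figure above lines up with roughly 4-bit weights, since full-precision (FP16) weights for a trillion parameters would need about 2 TB on their own.

```python
# Back-of-envelope weight-memory estimate for a ~1T-parameter model.
# Weights only; KV cache and activations add more on top.

def weight_memory_gb(params: float, bits_per_param: int) -> float:
    return params * bits_per_param / 8 / 1e9  # bytes -> decimal GB

PARAMS = 1e12  # ~1T total parameters
for name, bits in [("FP16", 16), ("FP8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weight_memory_gb(PARAMS, bits):,.0f} GB")
# FP16: ~2,000 GB
# FP8: ~1,000 GB
# INT4: ~500 GB
```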

Ring vs other thinking models

Ring 1T vs DeepSeek V4 Pro (thinking mode)

DeepSeek V4 Pro integrates thinking into the same model that generates code. Ring 1T is a separate, specialized reasoning model. DeepSeek’s approach is more convenient (one model, one API call). Ring’s approach is more specialized (better reasoning quality, but requires orchestration).

On MATH benchmarks, Ring 1T (~90–94) outperforms DeepSeek V4 Pro’s thinking mode (~88–92). On practical coding tasks, the difference is smaller because DeepSeek’s integrated approach avoids the context-splitting overhead of Ring + Ling.

Ring 1T vs OpenAI o3

OpenAI’s o3 is a proprietary reasoning model with similar goals — extended chain-of-thought reasoning for complex problems. Ring 1T is open-weight (Apache 2.0), which means you can inspect, modify, and deploy it without restrictions. o3 is API-only with no weight access.

On reasoning benchmarks, o3 generally leads Ring 1T, but the gap has narrowed significantly. For developers who need open-weight reasoning models, Ring 1T is one of the strongest options available.

The future of separated reasoning

InclusionAI’s Ring + Ling approach represents one vision for the future of AI coding agents: specialized models coordinated by an orchestration layer, rather than monolithic models that try to do everything.

This approach has parallels in software engineering — microservices vs monoliths. A specialized reasoning model paired with a specialized code generation model can outperform a single model that does both, just as specialized microservices can outperform a monolithic application for certain workloads.

The trade-off is the same too: more complexity, more latency, more infrastructure to manage. Whether the quality improvement justifies the complexity depends on your use case.

For simple coding tasks, a single good model (Claude, GPT, DeepSeek) is simpler and faster. For complex agentic workflows where reasoning quality directly impacts success rate, Ring + Ling’s specialized approach may produce better results.

For more on how context windows affect these workflows, see AI context window explained.

FAQ

What is the difference between Ring 1T and Ling 2.6?

Ring 1T is a reasoning/thinking model — it produces step-by-step reasoning traces, plans, and evaluations. Ling 2.6 is a code generation model — it produces actual code. Both are trillion-parameter MoE models trained with the AReaL framework, but with different reward signals. Ring was trained to maximize reasoning quality (correctness, coherence, self-correction). Ling was trained to maximize code quality (correctness, idiomaticity, test coverage). They are designed to work together: Ring thinks, Ling executes.

Do I need Ring 1T to use Ling 2.6?

No. Ling 2.6 works perfectly well as a standalone code generation model. For simple coding tasks — writing functions, implementing endpoints, refactoring code — Ling alone is sufficient. Ring adds value for complex tasks that benefit from extended reasoning: debugging subtle bugs, planning large refactoring efforts, evaluating architectural trade-offs. Most developers will use Ling alone for everyday coding and add Ring only for complex tasks.

Can I run Ring 1T locally?

Not on consumer hardware. Ring 1T is a trillion-parameter model requiring 640+ GB of GPU memory even with quantized weights (full-precision FP16 weights alone would need roughly 2 TB). You need a multi-GPU server (8× A100 80GB or equivalent) or a cloud instance. There is no distilled small version of Ring (unlike Ling Flash for Ling 2.6). For local reasoning, consider other open-source reasoning models or use Ring through the InclusionAI API or OpenRouter.

How does Ring 1T compare to DeepSeek V4 Pro’s thinking mode?

Ring 1T scores higher on pure reasoning benchmarks (MATH ~90–94 vs DeepSeek’s ~88–92 in thinking mode) because it is a specialized reasoning model. DeepSeek V4 Pro’s thinking mode is integrated into a general-purpose model, which is more convenient (one model, one API call) but less specialized. For complex reasoning tasks, Ring produces higher-quality reasoning traces. For practical coding workflows where convenience matters, DeepSeek’s integrated approach is simpler to use.

Is the Ring + Ling approach better than a single model?

For complex, multi-step tasks — yes, in terms of quality. The specialized models each excel at their role, and the separation avoids the trade-offs that unified models make between reasoning and generation. For simple tasks — no, the overhead of two API calls and orchestration is not justified. The break-even point depends on task complexity: if the task takes more than a few minutes of human thought to plan, Ring + Ling likely outperforms a single model. If you can describe the task in one sentence, a single model is faster and cheaper.

What is the AReaL framework?

AReaL (Automated Reinforcement Learning) is InclusionAI’s training framework that treats model training as a reinforcement learning problem. Instead of just predicting the next token (standard language model training), AReaL trains models to maximize a reward signal — reasoning quality for Ring, code quality for Ling. The reward signal comes from a learned reward model trained on human-annotated examples. AReaL also handles expert routing in the MoE architecture, training the router to assign tokens to the most relevant experts. This end-to-end RL approach is what differentiates InclusionAI’s models from standard SFT + RLHF pipelines.