Apple's Language Model Protocol (LMP) is the most significant shift in how iOS developers work with AI since Core ML. Announced at WWDC 2026 (Session 339), it's an open protocol that lets any LLM provider (Claude, Gemini, custom models, or Apple's own Foundation Models) plug into a single Swift API. No more juggling five different SDKs. No more rewriting your networking layer every time you switch providers.
If you've been building AI features on iOS, you know the pain. Each provider has its own SDK, its own auth flow, its own streaming format, its own way of handling tool calls. LMP eliminates all of that by defining a standard interface that any model can conform to.
Let's break down what it is, how it works, and how to implement it today.
What Is the Language Model Protocol?
The Language Model Protocol is Apple's answer to a fragmented AI ecosystem. It's a Swift protocol, specifically LanguageModelExecutor, that defines how any language model communicates with your app. Whether the model runs on-device (Apple's Foundation Models), in Apple's Private Cloud Compute, or on a third-party server (Anthropic, Google, OpenAI, or your own infrastructure), the API surface is identical.
Think of it this way: Core ML standardized how you run inference on-device. LMP standardizes how you talk to any language model, anywhere.
The protocol supports:
- Text generation with streaming responses
- Tool calling (function calling) with structured schemas
- Image input for multimodal models
- Structured output with type-safe decoding
- Conversation history management
Apple has committed to open-sourcing the framework later in 2026, which means third-party providers can ship conforming implementations without reverse-engineering anything.
How It Works: The Architecture
The architecture is elegant in its simplicity. Your app talks to LanguageModelExecutor, a protocol that any provider can conform to. Executors come in three kinds; Apple ships the first two, and the third is whatever you or a provider implement:
- On-device executor: runs Apple's Foundation Models directly on the Neural Engine
- PCC executor: routes to Apple's Private Cloud Compute for larger models
- Third-party executor: your custom implementation for any external provider
Here's the key insight: your application code doesn't change regardless of which executor is active. You can swap from an on-device model to Claude to Gemini by changing one line of configuration.
This is fundamentally different from how MCP (Model Context Protocol) works. MCP standardizes how models connect to tools and data sources. LMP standardizes how your app connects to models. They're complementary, so you can use both together. Your app uses LMP to talk to the model, and the model uses MCP to talk to external tools.
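The executor-swapping idea can be sketched without the real framework. The following is a minimal, self-contained illustration only: `SketchExecutor`, `OnDeviceSketch`, `RemoteSketch`, and `SketchSession` are hypothetical stand-ins invented for this example, not FoundationModels types, and the calls are synchronous here purely to keep the sketch short.

```swift
// Stand-in types for illustration only; these are NOT the FoundationModels API.
protocol SketchExecutor {
    var name: String { get }
    func generate(prompt: String) -> String
}

// Pretend on-device model: echoes the prompt tagged with its name.
struct OnDeviceSketch: SketchExecutor {
    let name = "on-device"
    func generate(prompt: String) -> String { "[\(name)] \(prompt)" }
}

// Pretend remote provider: same interface, different backing model.
struct RemoteSketch: SketchExecutor {
    let name: String
    func generate(prompt: String) -> String { "[\(name)] \(prompt)" }
}

// Feature code depends only on the protocol, never on a concrete provider.
struct SketchSession {
    let executor: any SketchExecutor
    func ask(_ prompt: String) -> String { executor.generate(prompt: prompt) }
}

// The single line of configuration: change this value to change providers.
let activeExecutor: any SketchExecutor = OnDeviceSketch()
let sketchSession = SketchSession(executor: activeExecutor)
print(sketchSession.ask("Explain this Swift error"))
```

Replacing `OnDeviceSketch()` with `RemoteSketch(name: "claude")` changes the backing model while `SketchSession.ask` and its callers stay untouched, which is the property the article attributes to LMP.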
Implementing LanguageModelExecutor
Here's what a basic third-party executor looks like. Let's say you want to integrate Claude:

    import Foundation
    import FoundationModels

    // buildRequest, buildStreamRequest, decodeResponse, and parseSSELine
    // are helpers you supply for your provider's wire format.
    struct ClaudeExecutor: LanguageModelExecutor {
        let apiKey: String
        let model: String = "claude-sonnet-4-20250514"

        // One-shot generation: send the request and decode the full response.
        func generate(
            prompt: LanguageModelPrompt,
            options: GenerationOptions
        ) async throws -> LanguageModelResponse {
            let request = buildRequest(from: prompt, options: options)
            let (data, _) = try await URLSession.shared.data(for: request)
            return try decodeResponse(data)
        }

        // Streaming: forward each parsed server-sent event line as a chunk.
        func stream(
            prompt: LanguageModelPrompt,
            options: GenerationOptions
        ) -> AsyncThrowingStream<LanguageModelChunk, Error> {
            AsyncThrowingStream { continuation in
                Task {
                    do {
                        let request = buildStreamRequest(from: prompt, options: options)
                        let (bytes, _) = try await URLSession.shared.bytes(for: request)
                        for try await line in bytes.lines {
                            if let chunk = parseSSELine(line) {
                                continuation.yield(chunk)
                            }
                        }
                        continuation.finish()
                    } catch {
                        // Without this, a network error would leave the stream hanging.
                        continuation.finish(throwing: error)
                    }
                }
            }
        }
    }

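The executor above leans on a `parseSSELine` helper that the article never shows. As a rough sketch of what such a helper might do, the version below handles one case only: a server-sent `data:` line carrying an Anthropic-style `content_block_delta` event with a text delta. `SketchChunk` and `DeltaEvent` are hypothetical stand-ins for this example (the real `LanguageModelChunk` may look quite different), and a production parser would also need to handle tool-use deltas, error events, and end-of-message events.

```swift
import Foundation

// Hypothetical stand-in for the article's chunk type.
struct SketchChunk: Equatable {
    let text: String
}

// Minimal shape of a `content_block_delta` payload; other JSON keys are ignored.
struct DeltaEvent: Decodable {
    struct Delta: Decodable { let text: String? }
    let type: String
    let delta: Delta?
}

/// Returns a chunk for SSE `data:` lines that carry a text delta, otherwise nil.
func parseSSELine(_ line: String) -> SketchChunk? {
    // Only `data:` fields matter here; `event:` lines, comments, and blanks are skipped.
    guard line.hasPrefix("data:") else { return nil }
    let payload = line.dropFirst("data:".count).trimmingCharacters(in: .whitespaces)
    guard let data = payload.data(using: .utf8),
          let event = try? JSONDecoder().decode(DeltaEvent.self, from: data),
          event.type == "content_block_delta",
          let text = event.delta?.text else { return nil }
    return SketchChunk(text: text)
}
```

Returning `nil` for anything unrecognized keeps the streaming loop simple: it yields when there is text and otherwise moves on to the next line.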
The beauty is in the consumption side. Your app code looks like this regardless of the provider:

    let session = LanguageModelSession(executor: ClaudeExecutor(apiKey: key))

    // Simple generation
    let response = try await session.generate("Explain this Swift error")

    // Streaming
    for try await chunk in session.stream("Write a unit test for this function") {
        outputView.append(chunk.text)
    }

Switch to Apple's on-device model? Change one line:

    let session = LanguageModelSession(executor: .onDevice)

Tool Calling With LMP
Tool calling is where LMP really shines. Instead of each provider having its own function-calling format, you define tools once using Swift's type system:

    @Tool
    struct SearchDocumentation {
        @Parameter(description: "The search query")
        var query: String

        @Parameter(description: "Maximum results to return")
        var limit: Int = 5

        func execute() async throws -> [DocumentResult] {
            // Your implementation
            return try await docSearch.search(query, limit: limit)
        }
    }

    let session = LanguageModelSession(
        executor: executor,
        tools: [SearchDocumentation.self]
    )

The @Tool macro generates the JSON schema automatically. The model receives it in whatever format it expects (Claude's tool_use, OpenAI's function calling, Gemini's function declarations), because the executor handles translation. Your tool definition stays the same.
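To make the translation step concrete, here is a hedged sketch of how an executor might map one provider-agnostic tool description onto two real wire formats: Anthropic's Messages API (schema under `input_schema`) and OpenAI's Chat Completions API (schema under `function.parameters`). `ToolSketch` and both functions are hypothetical names for this example; the article does not show what the `@Tool` macro actually emits.

```swift
import Foundation

// Hypothetical provider-agnostic tool description (not the real output of `@Tool`).
struct ToolSketch {
    let name: String
    let description: String
    let parameters: [String: Any]   // a JSON Schema object
}

// Anthropic's Messages API expects the schema under `input_schema`.
func anthropicTool(_ tool: ToolSketch) -> [String: Any] {
    ["name": tool.name,
     "description": tool.description,
     "input_schema": tool.parameters]
}

// OpenAI's Chat Completions API wraps the schema in a `function` object.
func openAITool(_ tool: ToolSketch) -> [String: Any] {
    let function: [String: Any] = ["name": tool.name,
                                   "description": tool.description,
                                   "parameters": tool.parameters]
    return ["type": "function", "function": function]
}

// The same definition feeds both translations.
let searchTool = ToolSketch(
    name: "search_documentation",
    description: "Search the documentation",
    parameters: [
        "type": "object",
        "properties": [
            "query": ["type": "string", "description": "The search query"],
            "limit": ["type": "integer", "description": "Maximum results to return"]
        ],
        "required": ["query"]
    ]
)
```

Either dictionary can be passed to `JSONSerialization.data(withJSONObject:)` to produce the request body fragment for that provider.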
This approach is similar to how AI agents work in other frameworks, where you define capabilities that the model can invoke autonomously, but here the tool's parameters are checked at compile time by Swift's type system.
Which Providers Support It?
As of June 2026, the following providers have shipped or announced LanguageModelExecutor conformances:
| Provider | Status | Models Available |
|---|---|---|
| Apple (on-device) | Shipping | AFM 3B, AFM 7B |
| Apple (PCC) | Shipping | AFM Cloud, AFM Cloud Pro |
| Anthropic | Shipping | Claude Sonnet 4, Opus 4 |
| Google | Beta | Gemini 3.5 Flash, 3.1 Pro |
| OpenAI | Announced | GPT-5.5, GPT-5 |
| Mistral | Beta | Medium 3.5, Large 2 |
| Ollama | Community | Any local model |
The community implementation for Ollama is particularly interesting for developers who want to run models locally on Apple Silicon. You get the same API whether you're hitting Claude's servers or a Llama model running on your Mac's GPU.
LMP vs MCP: Different Problems, Same Ecosystem
This is the question I see most often, so let's clarify. MCP and LMP solve different problems:
- MCP = How models connect to tools and data (model-to-world)
- LMP = How apps connect to models (app-to-model)
In practice, you'll use both. Your iOS app uses LMP to send a prompt to Claude. Claude uses MCP to call your app's search tool. The tool result flows back through MCP to Claude, and Claude's response flows back through LMP to your app.
They compose naturally because Apple designed LMP with MCP in mind. The @Tool macro in LMP can also expose tools via MCP if you want external agents to call them.
Why This Matters for the Ecosystem
Before LMP, building a multi-model architecture on iOS was painful. You'd have Anthropic's SDK for one feature, Google's AI SDK for another, and OpenAI's for a third. Each brought its own dependencies, its own error handling, its own streaming format.
LMP makes using multiple AI models trivial. Route simple queries to the on-device model (free, fast, private). Send complex reasoning tasks to Claude or Gemini. Fall back gracefully when a provider is down. All through one API, one error type, one streaming format.
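The routing-plus-fallback pattern described here can be sketched independently of the framework. Everything below is hypothetical: `RoutableExecutor`, `StubExecutor`, `Router`, and the prompt-length heuristic are invented for illustration, and a real implementation would be asynchronous and would use a better complexity signal than character count.

```swift
// Stand-in protocol for illustration; the real executor API is async and richer.
protocol RoutableExecutor {
    var name: String { get }
    func generate(_ prompt: String) throws -> String
}

struct ProviderDown: Error {}

// Test double that can simulate an outage.
struct StubExecutor: RoutableExecutor {
    let name: String
    var isDown = false
    func generate(_ prompt: String) throws -> String {
        if isDown { throw ProviderDown() }
        return "[\(name)] \(prompt)"
    }
}

struct Router {
    let onDevice: any RoutableExecutor
    let cloud: [any RoutableExecutor]   // tried in the order given
    let maxLocalPromptLength = 200      // hypothetical "simple query" heuristic

    func generate(_ prompt: String) throws -> String {
        // Short prompts try on-device first; longer ones try cloud providers first.
        var candidates: [any RoutableExecutor] = cloud
        if prompt.count <= maxLocalPromptLength {
            candidates.insert(onDevice, at: 0)
        } else {
            candidates.append(onDevice)
        }
        // Fall through the list until one provider answers.
        var lastError: any Error = ProviderDown()
        for executor in candidates {
            do { return try executor.generate(prompt) }
            catch { lastError = error }
        }
        throw lastError
    }
}
```

Because every candidate shares one interface and throws through the same path, the fallback loop needs no provider-specific branches, which is the benefit of a single API and error type.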
For developers evaluating the build vs buy decision for AI features, LMP significantly reduces the "build" cost. You're no longer building provider integrations; you're just picking executors.
Privacy and On-Device First
Apple's approach remains privacy-first. The on-device executor processes everything locally on the Neural Engine. No data leaves the device. For the PCC executor, Apple's Private Cloud Compute guarantees apply: encrypted in transit, processed in secure enclaves, no data retention.
For third-party executors, LMP includes a PrivacyManifest requirement. Each executor must declare what data it sends, where it goes, and how long it's retained. This integrates with iOS's existing privacy labels in the App Store.
If you're building apps that handle sensitive data, this matters. You can use the on-device model for PII-adjacent tasks and only route to cloud models for tasks where you've obtained user consent. GDPR considerations such as consent and data minimization can then be addressed at the architecture level rather than as an afterthought.
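One way to express that consent rule in code is a small gate that runs before an executor is chosen. This is a sketch under assumptions: `ExecutorPrivacySketch` is a hypothetical shape for the three declarations mentioned above (the article does not show the real PrivacyManifest fields), and `chooseRoute` is an invented helper.

```swift
// Hypothetical declaration shape: what is sent, where it goes, how long it is kept.
struct ExecutorPrivacySketch {
    let sendsDataOffDevice: Bool
    let destination: String?      // e.g. a provider hostname; nil if nothing is sent
    let retentionDays: Int?       // nil if nothing is retained
}

enum Route: Equatable { case onDevice, cloud }

/// Sensitive input, or missing consent, keeps the request on the device.
func chooseRoute(containsSensitiveData: Bool,
                 userConsentedToCloud: Bool,
                 cloud: ExecutorPrivacySketch) -> Route {
    // An executor that never sends data off the device needs no consent gate.
    guard cloud.sendsDataOffDevice else { return .cloud }
    if containsSensitiveData || !userConsentedToCloud { return .onDevice }
    return .cloud
}

// Example declaration for an imaginary third-party provider.
let thirdParty = ExecutorPrivacySketch(
    sendsDataOffDevice: true,
    destination: "api.example.com",
    retentionDays: 30
)
```

Keeping the decision in one pure function makes the policy easy to unit test and to audit when privacy requirements change.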
Getting Started Today
To start building with LMP today:
- Update to Xcode 18: LMP is part of the Foundation Models framework
- Target iOS 26+: the protocol requires the latest runtime
- Watch Session 339: Apple's WWDC session covers implementation in detail
- Start with on-device: get your prompts working with the free, local model first
- Add providers incrementally: drop in third-party executors as you need more capability
The simplest path is to build your features against the on-device model, then swap in cloud executors when you need more power. Because the API is identical, this requires zero code changes in your feature logic.
The Bigger Picture
LMP represents Apple's bet that AI model providers will proliferate, not consolidate. By making it trivial to swap between models, Apple ensures that no single provider gains lock-in on their platform. Developers benefit from competition on quality and pricing. Users benefit from apps that can use the best model for each task.
The open-source commitment (planned for later in 2026) means this isn't just an Apple-ecosystem standard. If LMP gains adoption as an open protocol, it could become the standard way any Swift application (macOS, visionOS, even server-side Swift) talks to language models.
For now, if you're building iOS apps with AI features, LMP is the path forward. One API, any model, type-safe tools, privacy-first architecture. The fragmentation era of iOS AI development is over.
Frequently Asked Questions
Is the Language Model Protocol the same as MCP?
No. LMP standardizes how your app talks to models (app-to-model communication). MCP standardizes how models talk to external tools and data sources (model-to-world communication). They're complementary protocols: use LMP to connect your app to any LLM, and MCP to give that LLM access to tools.
Can I use Language Model Protocol with models running locally on my Mac?
Yes. The community has already built LanguageModelExecutor conformances for Ollama, which means any model you can run locally on Apple Silicon (Llama, Qwen, Mistral, etc.) works through the same API. For on-device iOS inference, Apple's built-in executor handles their Foundation Models natively.
Do I need iOS 26 to use LMP?
Yes. The FoundationModels framework and LanguageModelExecutor protocol require iOS 26 or later. If you need to support older iOS versions, you'll need to maintain separate code paths or wait until your minimum deployment target catches up.
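If you do keep an older deployment target, Swift's standard `#available` check is the usual way to split the code paths. The sketch below shows only the branching; `summarize` and the placeholder return values are hypothetical, and the LMP call itself is left as a comment because it depends on the framework.

```swift
// Hypothetical feature entry point; both branch bodies are placeholders.
func summarize(_ text: String) -> String {
    if #available(iOS 26, *) {
        // Newer OS: create and use an LMP-backed session here.
        return "lmp:" + text
    } else {
        // Older OS: keep the previous behavior (direct SDK call, or no AI feature).
        return "legacy:" + text
    }
}
```

Types that wrap the new API can be annotated with `@available(iOS 26, *)` so the compiler enforces the check at every call site.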
Is LMP free to use, or does Apple charge for it?
The protocol itself is free. On-device inference through Apple's Foundation Models is free because it runs on the user's hardware. Apple's Private Cloud Compute (PCC) models are included for users with Apple Intelligence enabled. Third-party models (Claude, Gemini, etc.) require your own API keys, and you pay those providers directly.
When will LMP be open-sourced?
Apple announced at WWDC 2026 that the framework will be open-sourced later in 2026. No exact date has been given. The open-source release will allow the protocol to be used outside Apple platforms, potentially in server-side Swift applications.
How does tool calling work across different providers?
You define tools once using the @Tool macro in Swift. The macro generates a provider-agnostic schema. Each LanguageModelExecutor implementation translates that schema into whatever format its model expects (Claude's tool_use blocks, OpenAI's function calling, etc.). Your tool code is written once and works with every provider.