
Where Does Your Code Go? Data Privacy for AI Coding Tools


Every time you press Tab in Cursor or ask Claude Code to fix a bug, your code travels somewhere. Here’s where it goes, provider by provider.

The data flow

Your code → Internet → Provider's servers → Model inference → Response → Your IDE
                              ↓
                    Logged? Stored? Used for training?

Provider-by-provider breakdown

Anthropic (Claude Code, Claude API)

  • Where: US servers (AWS)
  • Retention: 30 days (API), longer for consumer plans
  • Training: ❌ Not on API data. ⚠️ Consumer data may be used
  • DPA: Available for Team/Enterprise plans

OpenAI (Codex CLI, ChatGPT, API)

  • Where: US servers (Azure)
  • Retention: 30 days (API), longer for consumer plans
  • Training: ❌ Not on API data (since March 2023). ⚠️ ChatGPT data may be used unless opted out
  • DPA: Available for business plans

Google (Gemini CLI, Vertex AI)

  • Where: Configurable (US, EU, Asia)
  • Retention: Configurable
  • Training: ❌ Not on Vertex AI data. ⚠️ Data from the free Gemini tier may be used
  • DPA: Available for Cloud customers

Mistral (Vibe CLI, La Plateforme)

  • Where: EU servers (France)
  • Retention: Per DPA terms
  • Training: ❌ Not on API data
  • DPA: Available, EU-native

Self-hosted (Ollama, vLLM)

  • Where: Your machine/server
  • Retention: You control it
  • Training: ❌ Impossible (the model runs locally)
  • DPA: Not needed

The risk matrix

Scenario | Risk level | Why
Personal project with Cursor | 🟢 Low | No sensitive data
Startup using Claude API | 🟡 Medium | Need DPA, review terms
Enterprise with customer PII in code | 🔴 High | Need DPA + audit + possibly EU hosting
Healthcare/finance codebase | 🔴 High | Regulatory requirements beyond GDPR
Using free ChatGPT for work code | 🔴 High | No DPA, data may be used for training
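
If you want to apply the matrix consistently across several repos, it can be written down as a small lookup. The sketch below is one possible encoding, not an official scoring scheme: the `Scenario` fields and the `risk_level` function are hypothetical names invented for this example, and the rules only mirror the five rows above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical description of how an AI coding tool is being used."""
    work_code: bool = False      # company code rather than a personal project
    consumer_plan: bool = False  # free or consumer tier instead of API/business access
    customer_pii: bool = False   # customer PII appears in the codebase
    regulated: bool = False      # healthcare, finance, or similar


def risk_level(s: Scenario) -> str:
    """Return "low", "medium", or "high", mirroring the rows of the matrix."""
    if s.regulated or s.customer_pii:
        return "high"    # DPA, audit, and possibly stricter hosting needed
    if s.work_code and s.consumer_plan:
        return "high"    # no DPA, data may be used for training
    if s.work_code:
        return "medium"  # API access: still need a DPA and a terms review
    return "low"         # personal project with no sensitive data
```

For example, `risk_level(Scenario(work_code=True, consumer_plan=True))` lands in the same bucket as the "free ChatGPT for work code" row.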

What to do

For personal projects: Use whatever you want. With no sensitive data in the code, the risk is minimal.

For company code:

  1. Use API access (not consumer subscriptions)
  2. Get a DPA from your provider
  3. Consider Mistral for EU data residency
  4. Or self-host for maximum control
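
The checklist above can also be turned into a quick audit of the tools a team already uses. This is a minimal sketch under stated assumptions: `ToolSetup`, its fields, and `checklist_gaps` are made-up names for illustration, and the values you feed in have to come from your own contracts and provider settings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToolSetup:
    """Hypothetical record of how one AI coding tool is set up at a company."""
    name: str
    access: str                       # "api", "consumer", or "self-hosted"
    dpa_signed: bool                  # is a DPA in place with the provider?
    region: str                       # e.g. "us", "eu", "local"
    needs_eu_residency: bool = False  # does your data have to stay in the EU?


def checklist_gaps(setup: ToolSetup) -> List[str]:
    """Return the checklist items this setup does not satisfy yet."""
    if setup.access == "self-hosted":
        return []  # step 4: code never reaches a third-party provider
    gaps = []
    if setup.access != "api":
        gaps.append("use API access instead of a consumer subscription")
    if not setup.dpa_signed:
        gaps.append("get a DPA from the provider")
    if setup.needs_eu_residency and setup.region != "eu":
        gaps.append("pick an EU-hosted provider or self-host")
    return gaps
```

An empty list means the setup passes this particular checklist, nothing more. It says nothing about what the provider's terms actually contain.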

For regulated industries: Self-host with Ollama + Devstral Small or Qwen 3.5. No data leaves your network.
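
To make "no data leaves your network" concrete, here is a rough sketch of sending a prompt to a local Ollama server over its REST API. It assumes Ollama is running on its default address (`localhost:11434`) on the same machine and that a model has already been pulled. The `devstral` tag is an assumption, so substitute whatever `ollama list` shows. The helper names `build_request` and `ask_local_model` are invented for this example, and the loopback-only guard is an extra safety check, not something Ollama requires.

```python
import json
import urllib.request
from urllib.parse import urlparse

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "devstral"  # assumed model tag; replace with one you have pulled

LOOPBACK_HOSTS = ("localhost", "127.0.0.1", "::1")


def build_request(prompt: str, url: str = OLLAMA_URL, model: str = MODEL) -> urllib.request.Request:
    """Build the POST request, refusing any URL that is not on this machine."""
    host = urlparse(url).hostname
    if host not in LOOPBACK_HOSTS:
        raise ValueError(f"refusing to send code to non-local host: {host}")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `ask_local_model("Explain this function: ...")` returns the model's reply as a string. If the model runs on another box inside your network, you would relax the host check to an explicit allowlist of internal hostnames.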
