🤖 AI Tools
· 2 min read

How to Replace GitHub Copilot for Free — Step-by-Step Guide (2026)


GitHub Copilot costs $10–19/month and sends your code to Microsoft's servers. Here's how to replace it with a free, private alternative that runs entirely on your machine. Total setup time: 10 minutes.

What you'll get

  • Inline code completion (tab to accept, just like Copilot)
  • AI chat sidebar for code questions
  • Code explanation, refactoring, and test generation
  • Works offline
  • Your code never leaves your machine
  • $0/month, forever

Requirements

  • VS Code (or JetBrains)
  • 16GB+ RAM (for basic models) or 24GB+ VRAM (for best quality)
  • 10 minutes

Step 1: Install Ollama

# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows: download from ollama.com

Step 2: Download a coding model

Pick based on your hardware:

# 8GB RAM — basic but functional
ollama pull qwen2.5-coder:7b

# 12-16GB VRAM — great autocomplete
ollama pull codestral

# 24GB VRAM — best overall quality
ollama pull qwen2.5-coder:32b
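Not sure which tier fits your machine? A rough back-of-envelope check, assuming the 4-bit quantized weights Ollama ships by default (about 0.5 bytes per parameter) plus some runtime overhead; these are ballpark estimates, not official requirements:

```python
# Rough memory estimate for a locally hosted model.
# Assumes ~4-bit quantized weights (about 0.5 bytes per parameter)
# plus ~20% overhead for the KV cache and runtime.
def approx_memory_gb(params_billion, bytes_per_param=0.5, overhead=1.2):
    return params_billion * bytes_per_param * overhead

for name, size_b in [("7B model", 7), ("22B model (Codestral)", 22), ("32B model", 32)]:
    print(f"{name}: ~{approx_memory_gb(size_b):.1f} GB")
```

If the estimate exceeds your free RAM/VRAM, drop down a tier: a model that spills out of memory will be unusably slow.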

Step 3: Install Continue in VS Code

  1. Open VS Code
  2. Go to Extensions (Ctrl+Shift+X)
  3. Search “Continue”
  4. Install “Continue - Codestral, Claude, and more”
  5. Click the Continue icon in the sidebar

Step 4: Configure Continue

Click the gear icon in Continue and set up your models:

If you have 24GB VRAM (best setup):

{
  "tabAutocompleteModel": {
    "title": "Codestral",
    "provider": "ollama",
    "model": "codestral"
  },
  "models": [
    {
      "title": "Qwen Coder 32B",
      "provider": "ollama",
      "model": "qwen2.5-coder:32b"
    }
  ]
}

If you have 16GB or less:

{
  "tabAutocompleteModel": {
    "title": "Qwen Coder 7B",
    "provider": "ollama",
    "model": "qwen2.5-coder:7b"
  },
  "models": [
    {
      "title": "Qwen Coder 7B",
      "provider": "ollama",
      "model": "qwen2.5-coder:7b"
    }
  ]
}
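If Ollama runs on a different machine (say, a desktop GPU box while you code on a laptop), Continue can be pointed at it. A sketch assuming Continue's `apiBase` option on the model entry; the IP address is a placeholder for your own:

```json
{
  "tabAutocompleteModel": {
    "title": "Codestral (remote)",
    "provider": "ollama",
    "model": "codestral",
    "apiBase": "http://192.168.1.50:11434"
  }
}
```

Note that Ollama only listens on localhost by default; on the GPU machine you would set `OLLAMA_HOST=0.0.0.0` before starting `ollama serve` so other machines on your network can reach it.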

Step 5: Start coding

That's it. Open any code file and start typing. You'll see inline suggestions appear, just like Copilot. Press Tab to accept.

Use the chat sidebar (Ctrl+L) to ask questions about your code, request refactoring, or generate tests.
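Under the hood, Continue talks to Ollama over a plain HTTP API on localhost:11434, which you can also script against directly. A minimal sketch of a request to the `/api/generate` endpoint (shown here just building and printing the JSON body; POST it with any HTTP client):

```python
import json

# Build a request body for Ollama's /api/generate endpoint,
# the same local HTTP API (localhost:11434) that Continue uses.
def build_generate_request(model, prompt):
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response instead of a stream
    }

body = build_generate_request(
    "qwen2.5-coder:32b",
    "Write a Python function that reverses a string.",
)
print(json.dumps(body, indent=2))
# POST this to http://localhost:11434/api/generate with curl or any HTTP client.
```

The response is a JSON object whose `response` field holds the generated text, so the same local models power your editor and any scripts you write.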

How it compares

After a week of using this setup vs Copilot:

  • Autocomplete quality: Codestral is arguably better than Copilot for inline suggestions (95.3% FIM accuracy)
  • Chat quality: Qwen Coder 32B matches GPT-4o on coding benchmarks
  • Speed: Local inference is instant β€” no network latency
  • Context awareness: Copilot is better at understanding your full project. Local models see less context.
  • Multi-file edits: Copilot is better here. Local models work best on single-file tasks.

For 80% of daily coding work — writing functions, fixing bugs, generating boilerplate — the free setup is indistinguishable from Copilot.

Troubleshooting

Suggestions are slow: Your model is too large for your hardware. Drop to a smaller model.

No suggestions appearing: Make sure Ollama is running (ollama serve) and the model is downloaded.

Quality is poor: Upgrade to a larger model if your hardware allows it. The jump from 9B to 32B is significant.

Related: Self-Hosted AI for Enterprise · How to Choose an AI Coding Agent · Git Cheat Sheet