I’ve been building software with AI tools for over two years now. In that time I’ve tried dozens of tools, burned through credits on models that weren’t worth it, and slowly landed on a setup that actually works for how I think and code.
This isn’t a “best tools” listicle. This is what I actually use every day, why I picked each tool, and the prompts I’ve refined through trial and error. If you want the broader landscape, check out my roundup of the best AI coding tools in 2026. This post is just my personal stack.
The Five Tools I Use Daily
Here’s the short version:
- Claude Code — heavy lifting, refactors, multi-file changes
- Cursor — daily editing, quick fixes, inline completions
- Claude.ai — research, writing, thinking through problems
- Ollama — private tasks, local experimentation, offline work
- Gemini — free tier research, long context analysis
Each one fills a specific gap. I tried consolidating down to two or three tools, but I kept running into situations where the wrong tool for the job cost me more time than switching.
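One way to picture that split is as a small decision function. This is only a sketch: the `Task` fields, the `pick_tool` name, and the 200,000-character cutoff for "long" input are illustrative assumptions, not settings from any of these tools.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a piece of work; every field is illustrative."""
    files_touched: int = 1
    is_private: bool = False      # proprietary code or sensitive data
    is_code_change: bool = True   # False for research, writing, design thinking
    input_chars: int = 0          # size of any pasted logs, docs, or dumps


# Assumed cutoff for "too big to paste comfortably"; tune to taste.
LONG_INPUT_CHARS = 200_000


def pick_tool(task: Task) -> str:
    """Map a task to one of the five tools, checking the hardest constraint first."""
    if task.is_private:
        return "Ollama"           # data must not leave the machine
    if task.input_chars > LONG_INPUT_CHARS:
        return "Gemini"           # long-context analysis
    if not task.is_code_change:
        return "Claude.ai"        # research, writing, thinking
    if task.files_touched > 1:
        return "Claude Code"      # refactors and multi-file changes
    return "Cursor"               # quick single-file edits


print(pick_tool(Task(files_touched=6)))                   # Claude Code
print(pick_tool(Task(is_code_change=False)))              # Claude.ai
print(pick_tool(Task(is_private=True, files_touched=3)))  # Ollama
```

The order of the checks is the design choice here: privacy wins over everything else, because a task that cannot leave the machine stays local even when a cloud model would do it better.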
Claude Code: The Heavy Lifter
Claude Code is where I go when the task is bigger than a single file. Refactoring a module, migrating an API, adding a feature that touches six files — that’s Claude Code territory.
I run it in the terminal alongside my editor. The key advantage is that it understands your entire project context. Point it at a codebase and it can reason across files in a way that inline editor tools still struggle with.
I wrote a full guide on how I use Claude Code if you want the deep dive, but here’s my typical workflow:
- I describe the change I want in plain language
- I let it propose the plan before writing code
- I review the diff, iterate if needed, then apply
The prompt template that works best for me:
I need to [describe change]. The relevant files are [list files or describe area].
Before writing any code, outline your plan. Then implement it file by file,
explaining each change. Don't modify tests unless I ask.
That last line matters. Without it, Claude Code will “helpfully” update your test files to match its changes, which defeats the purpose of tests catching regressions.
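If you reuse the template often, it can help to fill it from a small helper so the "don't modify tests" line never gets dropped. A minimal sketch: `build_refactor_prompt` and the file paths are hypothetical names for illustration, and the output is just text to paste into Claude Code.

```python
REFACTOR_TEMPLATE = (
    "I need to {change}. The relevant files are {files}.\n"
    "Before writing any code, outline your plan. Then implement it file by file,\n"
    "explaining each change. Don't modify tests unless I ask."
)


def build_refactor_prompt(change: str, files: list[str]) -> str:
    """Fill the template; reject empty inputs so a vague prompt never gets sent."""
    if not change.strip():
        raise ValueError("describe the change before prompting")
    if not files:
        raise ValueError("list at least one file or area")
    return REFACTOR_TEMPLATE.format(change=change.strip(), files=", ".join(files))


prompt = build_refactor_prompt(
    "replace the hand-rolled retry loop with exponential backoff",
    ["client/http.py", "client/retry.py"],  # hypothetical paths
)
print(prompt)
```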
I pair Claude Code with Claude Opus 4 for complex refactors. For simpler multi-file tasks, Sonnet is faster and cheaper. The model choice matters more than people think — I’ve seen Opus catch architectural issues that Sonnet breezes past.
What I spend: Roughly $60–80/month on Claude Code usage. The Max plan covers most of it. Worth every dollar for the refactoring work alone.
Cursor: The Daily Driver
Cursor is my editor. I switched from VS Code about a year ago and haven’t looked back. The AI features are baked into the editing experience in a way that feels natural rather than bolted on.
I use Cursor for:
- Quick inline edits (Cmd+K is muscle memory at this point)
- Tab completions while writing new code
- Small, single-file changes
- Exploring unfamiliar codebases with chat
For a detailed comparison of when I reach for Cursor vs Claude Code, I wrote a head-to-head breakdown. The short version: Cursor for speed, Claude Code for scope.
My go-to Cursor prompts:
For inline edits:
Refactor this function to use early returns instead of nested ifs
For chat:
Explain what this function does, then suggest how to make it more readable.
Don't change the behavior.
I keep Cursor’s prompts short and specific. Long prompts in an inline editor context tend to produce worse results than concise ones. Save the detailed instructions for Claude Code.
What I spend: $20/month for Cursor Pro. The free tier is usable but the Pro completions are noticeably better.
Claude.ai: Research and Writing
This might surprise people, but I use Claude.ai (the web/app interface) more for writing and research than for coding. When I need to think through an architecture decision, draft documentation, write a blog post, or research a topic, Claude.ai is where I go.
The conversation format is better for iterative thinking than a code editor. I can paste in a design doc, ask questions, refine my understanding, and then take that clarity back to my code tools.
Prompts I use regularly:
For research:
I'm evaluating [technology/approach] for [use case]. Give me the honest trade-offs,
not a sales pitch. What are the failure modes? What do people regret after adopting it?
For writing:
I'm writing about [topic] for an audience of [description]. Here's my rough draft:
[paste draft]. Make it clearer and more direct. Cut anything that doesn't earn its place.
Don't make it sound like AI wrote it.
For architecture decisions:
I'm building [system description]. I'm choosing between [option A] and [option B].
Here are my constraints: [list]. Which would you pick and why?
Push back if my constraints don't make sense.
That “push back” line is important. Without it, Claude tends to validate whatever you’re leaning toward. With it, you get genuinely useful counterarguments.
What I spend: Included in my Anthropic plan, so effectively $0 extra beyond what I pay for Claude Code.
Ollama: Private and Local
I run Ollama on my machine for two reasons: privacy and experimentation.
Some tasks involve proprietary code or sensitive data that I don’t want leaving my machine. Ollama with a good local model handles those. I also use it to test prompts against different models before committing to an API call, and to experiment with new open-source models as they drop.
I have a complete Ollama setup guide that covers installation and model management. Here’s what I actually run:
- Llama 3.3 70B (quantized) for general coding tasks that need to stay local
- CodeGemma for quick completions when I’m offline
- Mistral variants for testing prompt compatibility across models
The quality gap between local and cloud models has shrunk dramatically, but it’s still there. I don’t use Ollama for complex refactors — that’s still Claude Code’s job. But for “rewrite this function,” “generate test data,” or “explain this regex,” local models are fast and free.
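Testing one prompt against several local models can be sketched like this, assuming Ollama is running on its default port (11434) and using its `/api/generate` endpoint with streaming turned off. The helper names are my own for illustration, and any model tag you pass in has to match what `ollama list` shows on your machine.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for one model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compare_models(models: list[str], prompt: str, timeout: float = 300.0) -> dict[str, str]:
    """Send the same prompt to each model in turn and collect the text responses."""
    results = {}
    for model in models:
        with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
            # With "stream": False the server returns one JSON object
            # whose "response" field holds the generated text.
            results[model] = json.loads(resp.read())["response"]
    return results
```

Calling `compare_models(["codegemma", "mistral"], "Explain this regex: ^\\d{3}-\\d{4}$")` returns a dict keyed by model tag, which makes it easy to read the answers side by side. The long default timeout is deliberate, since a large quantized model can take a while to load on the first request.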
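What I spend: $0 in API costs. The real cost is the GPU: I run a machine with 32GB VRAM, which handles 70B models at aggressive quantization levels (at 4-bit the weights alone come to roughly 35GB, so expect some layers to run from system RAM).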
Gemini: Free Tier and Long Context
Gemini fills a specific niche in my workflow: tasks that need massive context windows and tasks where I don’t want to burn Claude credits.
Google’s free tier is generous, and Gemini’s context window is enormous. When I need to analyze a full codebase dump, review a long document, or process a large dataset, Gemini handles it without me worrying about token costs.
I use it for:
- Analyzing entire log files or error dumps
- Reviewing long PRs or documentation
- Quick questions I’d feel silly spending API credits on
- Second opinions when Claude gives me an answer I’m not sure about
What I spend: $0. The free tier covers my usage. I’ve never needed to upgrade.
Monthly Cost Breakdown
Here’s what my AI tooling actually costs:
| Tool | Monthly Cost | What I Get |
|---|---|---|
| Claude Code (Max plan) | ~$60–80 | Heavy refactors, multi-file work |
| Cursor Pro | $20 | Daily editing, completions |
| Claude.ai | Included | Research, writing, thinking |
| Ollama | $0 | Private tasks, experimentation |
| Gemini | $0 | Long context, free tier research |
| Total | ~$80–100/month | |
For a deeper look at how these costs compare across the market, see my AI coding tools pricing breakdown.
Is $100/month a lot? Compared to my pre-AI workflow, I estimate I save 8–12 hours per week. That is roughly 35–52 hours a month, so the subscriptions pay for themselves at any hourly rate above about $3.
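The arithmetic behind that claim, as a quick sketch. The costs and hours come from the table and estimate above; the 52/12 weeks-per-month conversion and the `break_even_rate` helper are my own assumptions for illustration.

```python
WEEKS_PER_MONTH = 52 / 12  # about 4.33

# (low, high) monthly cost in USD for the two paid tools in the table
monthly_cost = {"Claude Code (Max plan)": (60, 80), "Cursor Pro": (20, 20)}

low_total = sum(low for low, _ in monthly_cost.values())
high_total = sum(high for _, high in monthly_cost.values())


def break_even_rate(cost_per_month: float, hours_saved_per_week: float) -> float:
    """Hourly rate at which the time saved exactly pays for the tools."""
    return cost_per_month / (hours_saved_per_week * WEEKS_PER_MONTH)


print(f"Total: ${low_total}-{high_total}/month")
# Best case: lowest cost, most hours saved. Worst case: the reverse.
best = break_even_rate(low_total, 12)
worst = break_even_rate(high_total, 8)
print(f"Break-even: ${best:.2f} to ${worst:.2f}/hour")
```

Even in the worst case (the high end of the cost range with only 8 hours saved per week), the break-even rate stays under $3/hour.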
What I Tried and Dropped
Not everything stuck. Here’s what I used and moved away from:
- GitHub Copilot — Cursor’s completions are better for my workflow, and I didn’t want to pay for both. Copilot is still solid, it just became redundant.
- ChatGPT — I used it heavily in 2024. Claude overtook it for my use cases (coding and technical writing) and I gradually stopped opening it. The reasoning models are impressive but I found Claude more reliable for code.
- Codeium/Windsurf — Tried it for a month. Good product, but Cursor had more momentum and better model access. Switching costs weren’t worth the marginal differences.
- Replit Agent — Interesting for prototyping but I never trusted it for production code. The “build me an app” workflow doesn’t match how I actually develop software.
- Local code completion plugins — Before Ollama got good, I tried several local completion plugins. Most were too slow or too inaccurate to be useful. Ollama with modern models finally made local viable.
Advice If You’re Building Your Own Stack
Don’t try to use one tool for everything. Each tool in my stack is the best at one specific thing. The switching cost between them is minimal — a few seconds — and the quality difference is significant.
Start with two tools: one for editing (Cursor) and one for bigger tasks (Claude Code). Add the others as you hit specific needs. You’ll know when you need local inference or a bigger context window because you’ll feel the friction.
And invest time in your prompts. The difference between a vague prompt and a specific one is often the difference between useful output and garbage. Every prompt template I shared above took dozens of iterations to get right.
The tools will keep changing. The workflow principles won’t: use the right model for the task, be specific about what you want, and always review what AI gives you before shipping it.