I write code for four SaaS products in production, plus side projects, plus the studio site. The bottleneck is never typing speed. It is context. Context across files in one product, across products in the studio, across infrastructure decisions made a year ago that I have to honor today. For 18 months we tried different AI coding tools to manage that context. Today we use the Claude Code coding agent for almost everything, with persistent skills, the 1M-context Opus 4.7 model, and a setup that compounds across products. This is the honest tour, including the tools we left behind.
What we tried first
Three rounds before landing on Claude Code.
Round 1: GitHub Copilot
Copilot is a great autocomplete. It saves keystrokes inside a single file. It does not understand the project. The version we used in 2023 had no real cross-file awareness and no way to feed it our brand voice, our product structure, or our deployment quirks. For tab-completing a function signature, fine. For "rewrite this Postgres query to handle the multi-tenant filter and add the index hint", not the right tool.
We still keep Copilot enabled in some editors as a typing helper. It does not do real work for us anymore.
Round 2: ChatGPT (web)
Pasting code into ChatGPT and pasting answers back. The single worst workflow on this list. Context is lost the moment you leave the browser tab. The model's answers are decoupled from your repo. It hallucinates file paths. It tells you "now run npm install" without knowing that you already did. We do not use ChatGPT for code work anymore. The feedback note on this in our internal memory is unambiguous: do not use ChatGPT for code, regardless of model version.
We do not use it for content production either. The voice generic-ifies fast.
Round 3: Cursor
Cursor was a real step up. Native IDE integration, repository awareness, agent-like flows. We used it heavily for a few months. Two reasons we left:
- Context window limits. When a project crossed a certain size, we hit walls. The agent would forget context within long sessions, especially on tasks that required holding many files in mind at once. The advertised context did not match the practical context after the agent had been running for a while.
- Pricing. Cursor's pricing scaled with usage in ways that were hard to predict. We are a small studio with a tight budget. Predictable cost is part of our operating model.
The combination meant Cursor was great when it worked and frustrating when it did not, with a cost structure we could not plan around.
Why Claude Code worked
We moved to Claude Code (Anthropic's official CLI for Claude) in early 2025. Three things made the difference, in this order:
1. Persistent skills and memory
The Claude Code harness lets us write skills (small named procedures with instructions) that the agent invokes when relevant. We have a skill for shipping a Coolify deploy. A skill for diagnosing a Docker DNS hiccup. A skill for running our Carriva deploy script over SSH. A skill for generating a SEO content brief.
The skills do not just sit there. They are invoked when the agent recognizes the trigger. Combined with a memory file that persists user preferences, project facts, and feedback across conversations, the result is an agent that gets smarter for our specific workflow over time. After months of use, the agent knows we prefer not to use em-dashes, that Carriva deploys differently from Coolify-managed apps, that the lesson-planning monorepo follows a specific multi-tenant pattern, and that we do not store credentials in memory files. For Carriva specifically, where the work involves RAG in regulated industries, the persistent context about pension regulation terminology and audit conventions saves hours per week.
That is the difference between an AI tool and a coding partner. The first answers questions in isolation. The second has been with you long enough to know how you like things done.
2. The 1M-context window
Opus 4.7 with the 1M-context window handles tasks that would have required heroic prompt engineering on smaller-context models. We can drop the entire schema, the relevant route handlers, the brand-voice guide, and the prior 6 articles into one session. The model reasons across all of it without losing the thread.
For a multi-product studio, this is not a luxury. It is a structural advantage. Asking "is this term consistent with how we treat the same concept in the other three products?" requires the agent to actually see all four products. Smaller context windows answer with a generic guess. The 1M-context window answers with a specific reference to a file we wrote 8 months ago.
3. The CLI surface
Claude Code is a CLI. It runs in a terminal. It calls real tools. It executes commands. It edits files. It is not a chat that suggests code; it is an agent that does the work and shows you the diff.
The CLI surface forces a different workflow. You stop copy-pasting. You stop context-switching between editor and chat. You stay in the terminal where the work happens. For us, that is roughly a 30% reduction in friction on engineering tasks.
The right surface for an AI coding tool is not chat. It is the place where the work already happens. For us that is the terminal.
What the workflow actually looks like
A real-world example. We needed to add a new field to the Carriva audit report (a flag indicating whether the verifier rejected one of the findings). Here is the rough flow:
- We open Claude Code in the Carriva repo.
- We say "add a verifier_rejected boolean to the findings table, default false, surfaced in the report UI when true with a small badge."
- The agent runs psql to inspect the schema, reads the relevant route handler, finds the report-rendering component, makes the changes, runs the test suite, and shows us the diff.
- We review. We ask for one tweak (the badge color should match an existing utility class). The agent applies it.
- We commit and push.
Total elapsed time: maybe 12 minutes. Without Claude Code, that is a 45-minute task because of the context-loading overhead. Across a day, the savings compound into hours.
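For readers who want to see the shape of that change, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres so it runs anywhere. Apart from the findings table and the verifier_rejected column named above, everything here is an assumption for illustration: the other columns, the sample rows, the render_finding helper, and the badge text are not the actual Carriva code.

```python
import sqlite3


def add_verifier_rejected(conn: sqlite3.Connection) -> None:
    """Add the verifier_rejected flag, defaulting to false (0 in SQLite)."""
    conn.execute(
        "ALTER TABLE findings "
        "ADD COLUMN verifier_rejected BOOLEAN NOT NULL DEFAULT 0"
    )


def render_finding(title: str, verifier_rejected: bool) -> str:
    """Hypothetical report renderer: append a small badge only when the flag is set."""
    badge = " [rejected by verifier]" if verifier_rejected else ""
    return f"{title}{badge}"


# Hypothetical schema and sample rows, standing in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE findings (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("INSERT INTO findings (title) VALUES ('Sample finding A')")

add_verifier_rejected(conn)

# Rows that existed before the migration pick up the default; new rows can set the flag.
conn.execute(
    "INSERT INTO findings (title, verifier_rejected) VALUES ('Sample finding B', 1)"
)

rows = conn.execute(
    "SELECT title, verifier_rejected FROM findings ORDER BY id"
).fetchall()
report_lines = [render_finding(title, bool(flag)) for title, flag in rows]
```

In Postgres, the equivalent statement would be ALTER TABLE findings ADD COLUMN verifier_rejected boolean NOT NULL DEFAULT false. The point of the sketch is the default: existing findings stay unflagged, so the badge only appears where the verifier actually rejected something.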
Where Claude Code is not the right tool
To be fair:
- Tab completion. It is not Copilot. For typing inside a single file, Copilot or even built-in autocomplete is faster.
- Heavy refactors that span 50 files. Sometimes the right tool is a Codemod or jscodeshift. Claude Code can drive these but it is not always the most efficient tool to write the actual transform.
- Pure thinking. When I am genuinely stuck on a design decision and I need to write down my thoughts, sometimes a notebook and a coffee is the right tool. The agent is good at running things. It is less good at being silent.
We use the agent for roughly 80% of engineering work. The other 20% remains old-school.
The skill library that compounds
Our skill library is one of the most valuable assets we have built. Examples:
- coolify: ship an app to the Coolify-managed homelab and diagnose the recurring AdGuard DNS failures.
- deploy-carriva: run the Carriva-specific SSH deploy script.
- searchfit-seo:create-content: generate SEO content with our brand voice and product context.
- fewer-permission-prompts: scan transcripts for routine read-only commands and add them to a project allowlist.
- sentry:sentry-workflow: triage and fix production issues with Sentry context.
Each skill is small. The compounding is large. After 6 months of use, the agent has skills covering most of our recurring operational shapes. New work fits into existing skills more often than not.
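To make "small" concrete, here is a rough sketch of the idea behind a skill like fewer-permission-prompts: look at the commands that were run, keep the read-only ones that recur, and propose them for an allowlist. This is not the skill's actual implementation. The set of read-only prefixes, the min_count threshold, and the transcript format (a flat list of command strings) are all assumptions.

```python
from collections import Counter
from typing import Optional

# Assumption: prefixes we treat as read-only. A real list would be curated by hand.
READ_ONLY_PREFIXES = ("ls", "cat", "git status", "git diff", "git log")


def read_only_prefix(command: str) -> Optional[str]:
    """Return the matching read-only prefix, or None if the command is not on the list."""
    stripped = command.strip()
    for prefix in READ_ONLY_PREFIXES:
        if stripped == prefix or stripped.startswith(prefix + " "):
            return prefix
    return None


def suggest_allowlist(commands, min_count=2):
    """Return read-only prefixes seen at least min_count times, most frequent first."""
    counts = Counter(
        prefix for prefix in map(read_only_prefix, commands) if prefix is not None
    )
    return [prefix for prefix, n in counts.most_common() if n >= min_count]


# Hypothetical commands pulled from past sessions.
transcript_commands = [
    "git status",
    "ls -la src",
    "git status",
    "rm -rf build",
    "ls",
    "cat README.md",
]
suggested = suggest_allowlist(transcript_commands)
```

The prefix match is deliberately naive. A version used for real would also need to reject pipes, redirects, and chained commands, since "cat file | sh" starts with a read-only prefix and is anything but read-only.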
This is structurally similar to the moats we discussed in our piece on vertical AI SaaS: the value is in the accumulated, specific knowledge, not in the underlying model.
Why we use it for content too
A note for the non-engineers reading this: we use Claude Code for content production too, not just code. The same workflow that handles code also handles articles. We have a content skill that knows the brand voice, the prior articles, and the product context. We feed it a topic spec and it produces a draft.
The economics of this are documented in our AI content marketing cost piece. The point here is that the same tool covers two-thirds of the studio's daily work. That consolidation is itself a productivity gain.
The non-obvious benefits
Three benefits I did not anticipate when we made the switch:
1. Onboarding new contractors is faster
A short-term contractor lands on a project. We point them at the skill library and the memory. They have a 30-minute orientation and they are productive. The institutional knowledge of the studio is partially encoded in skills, not just in our heads.
2. The same agent helps with cross-product work
A lot of our work touches multiple products. "Update the privacy policy across all four products" is a real task we do. Claude Code's persistent context lets us do it once. The agent knows our four products, knows the policy template, and applies the change consistently. The mechanics of running four SaaS in parallel lean heavily on this kind of cross-product agent work.
3. We catch our own mistakes faster
The agent reading our code is, in effect, a permanent code review buddy. We have caught bugs in our own work because the agent flagged them while doing something tangentially related. Not always. Often enough to matter.
What about cost?
Claude Code is not free. We pay for the model usage. Our monthly LLM cost across all studio work (engineering + content + research) comes to a few hundred euros. The trade is wildly favorable: a few hundred euros for what is effectively a force-multiplier on a one-person engineering team.
For comparison, hiring a junior engineer at the same productivity gain would be 4,000 EUR per month plus benefits, plus the management overhead. We are not anti-hire. We are realistic about the math at our scale.
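The comparison as back-of-the-envelope arithmetic. The 4,000 EUR salary is the figure quoted above. The 300 EUR monthly LLM bill is an assumed stand-in for "a few hundred euros", not our actual number, and benefits and management overhead are left out, so the real gap is larger.

```python
# Figure quoted in the text.
JUNIOR_ENGINEER_EUR_PER_MONTH = 4_000

# Assumption: "a few hundred euros" is taken as 300 EUR for illustration only.
ASSUMED_LLM_EUR_PER_MONTH = 300

monthly_difference = JUNIOR_ENGINEER_EUR_PER_MONTH - ASSUMED_LLM_EUR_PER_MONTH
cost_ratio = JUNIOR_ENGINEER_EUR_PER_MONTH / ASSUMED_LLM_EUR_PER_MONTH
annual_difference = monthly_difference * 12

print(f"Monthly difference: {monthly_difference} EUR")
print(f"Salary is about {cost_ratio:.1f}x the assumed LLM bill")
print(f"Annual difference: {annual_difference} EUR")
```

Swap in your own monthly bill and local salary figure. The ratio moves, but at small-studio scale it tends to stay above one by a wide margin.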
Mistakes we still make
We are not perfect with this tool. Two recurring mistakes:
- Over-trusting the agent on highly novel tasks. When we ask it to do something genuinely new (a new pattern, a new framework version), it gets it 80% right and we spend time on the last 20%. We have learned to slow down on novel tasks and review more carefully.
- Letting the skill library go stale. Skills accumulate. If we do not prune the ones we no longer use, the agent gets noisy. We do a quarterly skill cleanup.
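The quarterly cleanup can be partly mechanical. Here is a sketch, assuming we keep our own record of when each skill was last invoked. That record, the 90-day cutoff, and the dates below are hypothetical, and nothing here relies on Claude Code exposing such a log itself.

```python
from datetime import date, timedelta


def stale_skills(last_used: dict, today: date, max_age_days: int = 90) -> list:
    """Return names of skills not used within max_age_days, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    stale = [
        (used_on, name) for name, used_on in last_used.items() if used_on < cutoff
    ]
    return [name for used_on, name in sorted(stale)]


# Hypothetical usage log: the skill names come from this article, the dates are made up.
last_used = {
    "coolify": date(2026, 1, 10),
    "deploy-carriva": date(2026, 3, 2),
    "fewer-permission-prompts": date(2025, 11, 20),
}
candidates = stale_skills(last_used, today=date(2026, 3, 15))
```

The output is a list of candidates for review, not a delete list. A skill that runs once a year for a tax filing is still worth keeping.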
What we would tell another founder
Three things:
- If you are using ChatGPT for code, stop. It is the worst tool for engineering work in 2026. You will save hours per week by switching.
- If you are happy with Cursor, fine. It is a real tool. We left for specific reasons (context limits, pricing) that may or may not apply to you.
- If you have not tried Claude Code, try it for two weeks on a real project. The first few days are awkward. The compounding kicks in by week 2. After a month, it is hard to go back.
What is next
For us, more skills, less custom code. Anything we do twice should be a skill. Anything we explain twice to a contractor should be in the memory. The marginal hour spent encoding our process pays off many times.
The Claude Code coding agent is not a magic productivity hack. It is a power tool. Like every power tool, it works best when you take 20 minutes to learn it before you use it on the real job. Spend the 20 minutes. The next 20 hours you save will pay you back. We have been at this for a year and the trajectory is still up. We will write the follow-up post in 12 months when we know what year two looks like.



