Host Coder to run your AI Orchestrator

By 👤 DANIEL SAMSON, 🤖 CO-AUTHORED-BY: CLAUDE OPUS 4.6 <NOREPLY@ANTHROPIC.COM> · 2026-04-12

#ai #coding agents #devops #self-hosting

I wanted to know how far I could push autonomous coding agents if I gave them real infrastructure instead of my laptop. The answer turned into a small system I call the SDLC Orchestrator: open an issue in Gitea, and Claude Code agents take it through the whole software lifecycle — analyse, architect, develop, review, test, deploy — each stage running in its own throwaway Coder workspace on my own cluster.

The shape of it

It's a webhook-driven loop. Gitea fires a webhook, a tiny orchestrator turns that into a job, and Coder spins up an ephemeral workspace running Claude Code to do exactly one stage of work:

Gitea sends a webhook (issue opened, label added, PR review submitted).
The orchestrator creates a Coder workspace via the Coder API, parameterised with the issue number, the task type, and the tokens it needs.
The workspace clones the repo, reads the matching slash command, and runs Claude Code headless against the issue.
Claude does the work and posts its results straight back to the Gitea issue or PR.
When the run finishes, the workspace calls back to the orchestrator, which deletes the workspace via the Coder API. Nothing persists.
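As a rough sketch of the create and delete steps above, here is how the two Coder API requests might be built. The endpoint paths follow Coder's public REST API as I understand it (workspaces are deleted by posting a build with a "delete" transition). The config fields, the template parameter names (task_type, issue_number, repo) and the shape of the secrets object are all assumptions for illustration, not the orchestrator's actual code.

```typescript
// Sketch only: builds the HTTP requests and leaves sending them to the caller.
interface Task {
  stage: string; // e.g. "develop"
  repo: string;  // e.g. "shop-api"
  issue: number;
}

interface CoderConfig {
  baseUrl: string;        // hypothetical, e.g. "https://coder.example.internal"
  sessionToken: string;   // Coder API token
  organizationId: string;
  owner: string;          // Coder user that owns the agent workspaces
  templateId: string;
}

interface HttpRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

function coderHeaders(cfg: CoderConfig): Record<string, string> {
  return {
    "Coder-Session-Token": cfg.sessionToken,
    "Content-Type": "application/json",
  };
}

// Step 2: one workspace per task, parameterised with what the agent needs.
// Parameter names are hypothetical and must match the Coder template.
function createWorkspaceRequest(
  cfg: CoderConfig,
  task: Task,
  secrets: Record<string, string>,
): HttpRequest {
  const params = [
    { name: "task_type", value: task.stage },
    { name: "issue_number", value: String(task.issue) },
    { name: "repo", value: task.repo },
  ];
  for (const key of Object.keys(secrets)) {
    params.push({ name: key, value: secrets[key] ?? "" });
  }
  return {
    method: "POST",
    url: `${cfg.baseUrl}/api/v2/organizations/${cfg.organizationId}/members/${cfg.owner}/workspaces`,
    headers: coderHeaders(cfg),
    body: JSON.stringify({
      name: `${task.stage}-${task.repo}-${task.issue}`,
      template_id: cfg.templateId,
      rich_parameter_values: params,
    }),
  };
}

// Step 5: Coder deletes a workspace by running a build with transition "delete".
function deleteWorkspaceRequest(cfg: CoderConfig, workspaceId: string): HttpRequest {
  return {
    method: "POST",
    url: `${cfg.baseUrl}/api/v2/workspaces/${workspaceId}/builds`,
    headers: coderHeaders(cfg),
    body: JSON.stringify({ transition: "delete" }),
  };
}
```

Either request can then be sent with fetch(req.url, req) or any other HTTP client. Keeping the builders pure makes them easy to unit-test without a running Coder instance.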

Why host Coder for this

Coder gives me on-demand, templated, fully isolated development environments on my own hardware. That's exactly what an agent needs and exactly what you don't want running on your laptop. Each task gets a clean workspace built from a known image (Node, Claude Code, Playwright already baked in), does its thing, and is destroyed. The environments are reproducible and disposable, they run in parallel, and the code never leaves my infrastructure.

The SDLC, as triggers

Each stage is just a Gitea event mapped to a Claude Code slash command, and — crucially — to a bot account:

Analyse — issue opened → the analyst reads it and posts clarifying questions and acceptance criteria.
Architect — label ready-for-architecture → produces a design doc and spec.
Develop — label ready-for-development → implements and opens a pull request.
Review — PR opened → reviews the diff and approves or requests changes.
Rework — changes requested → addresses the feedback.
Test — review approved → writes a test report and extra tests.
Deploy — label ready-for-deployment → runs migrations and deployment.

The slash commands live in .claude/commands/ in each repo, so the agent's "job description" for every stage is version-controlled alongside the code it operates on. Adding a new repo to the pipeline is mostly dropping those command files in and wiring up the webhooks.
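The trigger table above can be sketched as a small lookup. The normalised PipelineEvent type is hypothetical, and translating Gitea's raw webhook payloads into it is assumed to happen upstream. The one-file-per-stage naming under .claude/commands/ and the "/stage issue" prompt format are likewise illustrative conventions, not something the pipeline is known to use.

```typescript
type Stage = "analyse" | "architect" | "develop" | "review" | "rework" | "test" | "deploy";

// Hypothetical normalised events, derived from Gitea webhooks elsewhere.
type PipelineEvent =
  | { kind: "issue_opened" }
  | { kind: "label_added"; label: string }
  | { kind: "pr_opened" }
  | { kind: "review_submitted"; verdict: "approved" | "changes_requested" };

// Labels that move an issue to the next stage.
const LABEL_STAGES = new Map<string, Stage>([
  ["ready-for-architecture", "architect"],
  ["ready-for-development", "develop"],
  ["ready-for-deployment", "deploy"],
]);

// Returns the stage an event should trigger, or undefined if it triggers nothing.
function stageFor(event: PipelineEvent): Stage | undefined {
  switch (event.kind) {
    case "issue_opened":
      return "analyse";
    case "label_added":
      return LABEL_STAGES.get(event.label);
    case "pr_opened":
      return "review";
    case "review_submitted":
      return event.verdict === "approved" ? "test" : "rework";
  }
}

// Assumed convention: one command file per stage, named after the stage.
function commandFile(stage: Stage): string {
  return `.claude/commands/${stage}.md`;
}

// Assumed prompt format: the slash command followed by the issue number.
function headlessPrompt(stage: Stage, issue: number): string {
  return `/${stage} ${issue}`;
}
```

Events that map to no stage (an unrelated label, say) return undefined, so the webhook handler can acknowledge them and do nothing.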

Two bots, because the author shouldn't grade their own homework

There are two Gitea bot accounts: claude-dev does the writing (analyse, develop, rework, deploy) and claude-review does the reviewing and testing. Keeping them separate means the review stage is a genuinely independent pass, not the same agent rubber-stamping its own pull request.
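One way to make that separation hard to break by accident is to pin each stage to a bot in a single table and pick the Gitea token from it. This is a minimal sketch under assumptions: the function names are made up, and the text above does not say which account runs the architect stage, so assigning it to claude-dev here is a guess.

```typescript
type Stage = "analyse" | "architect" | "develop" | "review" | "rework" | "test" | "deploy";
type Bot = "claude-dev" | "claude-review";

const BOT_FOR_STAGE: Record<Stage, Bot> = {
  analyse: "claude-dev",
  architect: "claude-dev", // assumption: not stated in the post
  develop: "claude-dev",
  rework: "claude-dev",
  deploy: "claude-dev",
  review: "claude-review",
  test: "claude-review",
};

// Picks the Gitea token the workspace should post with for a given stage.
function giteaTokenFor(stage: Stage, tokens: Record<Bot, string>): string {
  return tokens[BOT_FOR_STAGE[stage]];
}

// Guard: refuse to run a stage on a PR that the same bot authored.
// Only meaningful for review and test, where independence matters.
function assertIndependent(stage: Stage, prAuthor: string): void {
  const bot = BOT_FOR_STAGE[stage];
  if ((stage === "review" || stage === "test") && bot === prAuthor) {
    throw new Error(`${bot} would be reviewing its own pull request`);
  }
}
```

Because the record is typed as Record<Stage, Bot>, adding a new stage without deciding which bot runs it becomes a compile error.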

The token gotcha

The workspaces authenticate with an OAuth token for my Claude Max subscription, generated by claude setup-token, rather than an API key. That way the whole pipeline runs on the subscription I already pay for instead of burning pay-per-use credits. The trap that cost me an afternoon: if ANTHROPIC_API_KEY is set anywhere in the workspace, it silently takes priority over the OAuth token and you're back on metered billing. Don't set it.
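A cheap guard is to scrub the environment before the agent process is launched. The sketch below assumes the OAuth token is passed in the CLAUDE_CODE_OAUTH_TOKEN variable, which is my understanding of how setup-token output is consumed, and the function name is hypothetical. It fails fast when the token is missing and drops ANTHROPIC_API_KEY if anything upstream set it.

```typescript
type Env = Record<string, string | undefined>;

// Returns a copy of the environment that is safe to hand to the agent process.
function agentEnv(env: Env): Record<string, string> {
  // Assumed variable name for the subscription OAuth token.
  const token = env["CLAUDE_CODE_OAUTH_TOKEN"];
  if (!token) {
    throw new Error("CLAUDE_CODE_OAUTH_TOKEN is not set; generate one with `claude setup-token`");
  }
  const clean: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    const value = env[key];
    // Drop the API key so it cannot take priority over the OAuth token.
    if (key === "ANTHROPIC_API_KEY" || value === undefined) continue;
    clean[key] = value;
  }
  return clean;
}
```

In a Node process this would be called as agentEnv(process.env) and the result passed as the env option when spawning Claude Code.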

Making it survive reality

Webhooks lie, networks drop, and clones fail, so most of the work was resilience rather than the happy path:

Tasks are keyed by {stage}-{repo}-{issue} so duplicate webhooks are dropped.
On boot the orchestrator asks Coder which workspaces are still running and re-adopts them into its queue.
The workspace startup script uses a trap so the completion callback always fires — even if the clone itself failed — and the orchestrator can clean up.
The Coder template has a one-hour auto-stop as a backstop for any callback that never arrives.
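The first two items can be sketched with an in-memory registry. This is illustrative only: it assumes the Coder workspace name is the task key itself, so a running workspace can be parsed back into a task on boot, and the TaskRegistry class is hypothetical. With BullMQ, the same key could be passed as the jobId option to queue.add, which ignores a job whose id already exists.

```typescript
type Stage = "analyse" | "architect" | "develop" | "review" | "rework" | "test" | "deploy";
const STAGES: readonly Stage[] = ["analyse", "architect", "develop", "review", "rework", "test", "deploy"];

interface Task {
  stage: Stage;
  repo: string;
  issue: number;
}

function taskKey(task: Task): string {
  return `${task.stage}-${task.repo}-${task.issue}`;
}

// Inverse of taskKey. Repo names may contain hyphens, so the stage is the
// first segment, the issue is the last, and the repo is everything between.
function parseTaskKey(key: string): Task | undefined {
  const parts = key.split("-");
  if (parts.length < 3) return undefined;
  const stage = parts[0] as Stage;
  const last = parts[parts.length - 1] ?? "";
  const repo = parts.slice(1, -1).join("-");
  if (STAGES.indexOf(stage) === -1 || !/^\d+$/.test(last) || repo === "") return undefined;
  return { stage, repo, issue: Number(last) };
}

class TaskRegistry {
  private active = new Map<string, Task>();

  // Returns false when the task is already in flight (duplicate webhook).
  tryAdd(task: Task): boolean {
    const key = taskKey(task);
    if (this.active.has(key)) return false;
    this.active.set(key, task);
    return true;
  }

  // Called from the completion callback once the workspace is deleted.
  complete(key: string): void {
    this.active.delete(key);
  }

  // On boot: re-adopt running workspaces whose names parse as task keys.
  // Returns only the tasks that were not already tracked.
  adopt(workspaceNames: string[]): Task[] {
    const adopted: Task[] = [];
    for (const name of workspaceNames) {
      const task = parseTaskKey(name);
      if (task && this.tryAdd(task)) adopted.push(task);
    }
    return adopted;
  }
}
```

Workspaces whose names do not parse as a task key are left alone, so manually created workspaces on the same Coder deployment are never adopted or deleted by mistake.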

The stack is tiny

The orchestrator itself is unglamorous and that's the point: Hono for the webhook routes, BullMQ on Redis for the job queue, the Coder and Gitea REST APIs, deployed to k3s via Fleet GitOps. A few hundred lines of TypeScript standing between "I opened an issue" and "there's a reviewed, tested pull request waiting for me".

Is it overkill for a one-person operation? Completely. But watching a Gitea issue quietly grow clarifying questions, then a design, then a PR, then a review — all on hardware sitting in my own rack — is the most fun I've had with coding agents yet.