Test Coverage Booster
License: MIT · Downloads: 1
A single Claude Code agent that finds the least-tested modules in your repo, writes focused unit tests for them one module at a time, and verifies the suite stays green after every batch, stopping once it hits your coverage target.
Topology
Disclosures
Everything below runs on your machine or inside the sandbox when you use this workflow. Publishing is blocked if these declarations do not match the actual code.
Host hooks
Commands executed on YOUR host machine by Sandcastle lifecycle hooks.
None declared.
Sandbox hooks
Commands executed inside the sandbox container.
npm install
Network access
None. The agent reads and writes only the local repository inside the sandbox; the sandbox hook runs `npm install` against your declared package registry.
Shell expansion
No shell-expansion blocks in prompt files.
Files
Diff vs the stock Sandcastle 0.12.0 template Dockerfile. Lines prefixed with + were added by the author; lines prefixed with - were removed from stock.
+# Sandbox image for the Test Coverage Booster workflow.
+# Node 22 + git + the Claude Code CLI, running as a non-root `agent` user.
+# Add your project's toolchain (python, go, ...) here if your suite needs it.
 FROM node:22-bookworm
-# System dependencies.
 RUN apt-get update && apt-get install -y --no-install-recommends \
     git \
     curl \
     jq \
     ca-certificates \
     && rm -rf /var/lib/apt/lists/*
 # Claude Code CLI (the agent runtime).
 RUN npm install -g @anthropic-ai/claude-code
 # Non-root agent user. `sandcastle docker build-image` aligns AGENT_UID/GID to
 # the host user via --build-arg to avoid permission errors on bind mounts.
 ARG AGENT_UID=1000
 ARG AGENT_GID=1000
 RUN groupadd --gid ${AGENT_GID} agent \
     && useradd --uid ${AGENT_UID} --gid ${AGENT_GID} --create-home --shell /bin/bash agent
 USER agent
 WORKDIR /workspace
Full Dockerfile
# Sandbox image for the Test Coverage Booster workflow.
# Node 22 + git + the Claude Code CLI, running as a non-root `agent` user.
# Add your project's toolchain (python, go, ...) here if your suite needs it.
FROM node:22-bookworm
RUN apt-get update && apt-get install -y --no-install-recommends \
    git \
    curl \
    jq \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*
# Claude Code CLI (the agent runtime).
RUN npm install -g @anthropic-ai/claude-code
# Non-root agent user. `sandcastle docker build-image` aligns AGENT_UID/GID to
# the host user via --build-arg to avoid permission errors on bind mounts.
ARG AGENT_UID=1000
ARG AGENT_GID=1000
RUN groupadd --gid ${AGENT_GID} agent \
    && useradd --uid ${AGENT_UID} --gid ${AGENT_GID} --create-home --shell /bin/bash agent
USER agent
WORKDIR /workspace
# Auth for the Claude Code agent.
# Run `claude setup-token` on your host to generate a token, then paste it here
# in your local .sandcastle/.env (never commit the real .env).
CLAUDE_CODE_OAUTH_TOKEN=
import { createSandbox, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

// A warm-sandbox coverage loop. We create one sandbox, install dependencies
// once, then alternate between an agent pass (write tests for the weakest
// module) and a hard `npm test` gate. Keeping the container warm across passes
// means dependencies and build artifacts are installed exactly once.
const IMAGE = "sandcastle:test-coverage-booster";
const MAX_ROUNDS = 6;

await using sandbox = await createSandbox({
  branch: "agent/test-coverage",
  sandbox: docker({ imageName: IMAGE }),
  hooks: {
    sandbox: {
      onSandboxReady: [{ command: "npm install", timeoutMs: 300_000 }],
    },
  },
});

for (let round = 1; round <= MAX_ROUNDS; round++) {
  const pass = await sandbox.run({
    name: `coverage-round-${round}`,
    agent: claudeCode("claude-opus-4-8", { effort: "high" }),
    promptFile: ".sandcastle/prompt.md",
    completionSignal: "<promise>TARGET_MET</promise>",
  });

  // The agent signals it has reached the coverage target: stop early.
  if (pass.completionSignal) {
    console.log(`Coverage target reached in round ${round}.`);
    break;
  }

  // Hard gate: the suite must stay green before the next round starts.
  // A non-zero exit code is returned (not thrown), so we can fail loudly.
  const tests = await sandbox.exec("npm test");
  if (tests.exitCode !== 0) {
    throw new Error(
      `Round ${round} left the suite red, aborting:\n${tests.stdout}\n${tests.stderr}`,
    );
  }
  console.log(`Round ${round}: added tests, suite still green.`);
}
Improve test coverage, one module at a time
You are an autonomous test engineer working inside this repository. Your goal is to raise unit-test coverage toward 85% line coverage without changing product behaviour.
Each time you run:
- Run the project's coverage report (e.g. `npm test -- --coverage`, `pytest --cov`, `go test -cover`, or whatever this repo uses; infer it from the config).
- Identify the single least-covered source module that carries real logic (skip generated files, type-only files, and trivial re-exports).
- Write focused, meaningful unit tests for that module:
  - Cover the main happy path, the important branches, and at least one edge case.
  - Follow the repository's existing test framework, file layout, and naming.
  - Do not modify the code under test. If a function is untestable as written, leave a `// TODO(coverage):` note in the test file instead of refactoring.
- Run the tests you just wrote and make sure they pass.
- Commit with a message like `test: cover <module>`.
Keep each run tightly scoped to one module so every commit is easy to review.
When the overall line coverage is at or above 85% (or no meaningful untested
module remains), output the exact line <promise>TARGET_MET</promise> and stop.
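The selection step in the prompt can be sketched in a few lines. This is only an illustration, assuming the repo's coverage tool writes an Istanbul-style `coverage-summary.json` (the `json-summary` reporter). The names `leastCoveredModule`, `targetMet`, and `SKIP_PATTERNS` are hypothetical and are not part of this workflow; the agent is free to infer the weakest module any other way.

```typescript
// Shape of one entry in an Istanbul-style coverage-summary.json (assumed input format).
interface Metric {
  total: number;
  covered: number;
  skipped: number;
  pct: number;
}
interface FileCoverage {
  lines: Metric;
}
// Keys are file paths, plus the aggregate key "total".
type CoverageSummary = Record<string, FileCoverage>;

// Hypothetical heuristics standing in for "generated files, type-only files,
// and trivial re-exports". Real repos will need their own rules.
const SKIP_PATTERNS: RegExp[] = [/\.d\.ts$/, /\.generated\./, /(^|\/)index\.[jt]s$/];

// Returns the path with the lowest line coverage, or undefined if none qualifies.
function leastCoveredModule(summary: CoverageSummary): string | undefined {
  let worst: string | undefined;
  let worstPct = Infinity;
  for (const [file, cov] of Object.entries(summary)) {
    if (file === "total") continue; // aggregate row, not a module
    if (cov.lines.total === 0) continue; // nothing executable to cover
    if (SKIP_PATTERNS.some((p) => p.test(file))) continue;
    if (cov.lines.pct < worstPct) {
      worstPct = cov.lines.pct;
      worst = file;
    }
  }
  return worst;
}

// True when overall line coverage is at or above the target (85 by default,
// matching the prompt above).
function targetMet(summary: CoverageSummary, target = 85): boolean {
  return summary.total !== undefined && summary.total.lines.pct >= target;
}
```

When `targetMet` is false and `leastCoveredModule` returns a path, that path is the one module the round should focus on.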
README
Test Coverage Booster
Point a single Claude Code agent at a repository with thin test coverage and let it grind the number up — safely, one module at a time, never touching the code under test.
What it does
Raising coverage is boring, high-value work that rarely gets prioritised. This
workflow automates the grind: on each pass the agent runs your coverage report,
finds the single least-covered module that carries real logic, writes focused
unit tests for its happy path, key branches, and an edge case, then commits. It
repeats until it reaches an 85% line-coverage target (or runs out of meaningful
untested code), at which point it emits <promise>TARGET_MET</promise> and stops.
How it works
main.ts creates one warm Docker sandbox with createSandbox() and installs
dependencies exactly once via an onSandboxReady hook. It then loops for up to
six rounds. Every round is an agent pass followed by a hard npm test gate run
through sandbox.exec() — if a round ever leaves the suite red, the run aborts
loudly instead of stacking broken tests. Because the container stays warm between
rounds, npm install and build artifacts are paid for once, not per round.
The topology is a tight loop: install → write tests → verify → back to write.
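The control flow described above can be modelled without Sandcastle at all. The sketch below is a hedged illustration, not the workflow's code: `coverageLoop`, `Steps`, and `Outcome` are hypothetical names, the agent pass and the test gate are injected as plain async functions, and a red suite is reported as a return value where main.ts throws. As in main.ts, the completion check comes before the gate, so the round that emits the signal ends the loop without a further test run.

```typescript
// Possible results of one full run of the loop.
type Outcome =
  | { status: "target-met"; round: number }
  | { status: "suite-red"; round: number }
  | { status: "rounds-exhausted"; round: number };

interface Steps {
  // Hypothetical stand-in for the agent pass. Resolves true when the agent
  // emitted the completion signal.
  agentPass(round: number): Promise<boolean>;
  // Hypothetical stand-in for the `npm test` gate. Resolves the exit code.
  testGate(round: number): Promise<number>;
}

// Alternate agent pass and test gate for at most maxRounds rounds.
async function coverageLoop(steps: Steps, maxRounds = 6): Promise<Outcome> {
  for (let round = 1; round <= maxRounds; round++) {
    // Stop early once the agent reports the coverage target is met.
    if (await steps.agentPass(round)) {
      return { status: "target-met", round };
    }
    // Hard gate: a non-zero exit code ends the run immediately.
    if ((await steps.testGate(round)) !== 0) {
      return { status: "suite-red", round };
    }
  }
  return { status: "rounds-exhausted", round: maxRounds };
}
```

Injecting the two steps keeps the loop testable on its own: a stub that returns a non-zero exit code in round 2 should yield `suite-red` with `round: 2`, and no later agent pass should run.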
Requirements
Set CLAUDE_CODE_OAUTH_TOKEN in .sandcastle/.env (run claude setup-token on
your host). Your repo should expose a working npm test that reports coverage;
adjust the prompt if your stack uses pytest --cov, go test -cover, etc. Build
the image once with npx @ai-hero/sandcastle docker build-image, then run it with
npx tsx .sandcastle/main.ts.
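A forgotten token is an easy way to waste a run, so a small preflight check may be worth having. The sketch below is an optional illustration and not part of the workflow: `parseEnv` and `hasClaudeToken` are hypothetical helpers that handle only simple `KEY=VALUE` lines, and they take the file contents as a string so that reading `.sandcastle/.env` is left to the caller.

```typescript
// Parse simple KEY=VALUE lines. Blank lines and lines starting with # are ignored.
// This is a deliberately small subset of what dotenv-style files allow.
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    vars.set(key, value);
  }
  return vars;
}

// True when CLAUDE_CODE_OAUTH_TOKEN is present and non-empty.
function hasClaudeToken(envText: string): boolean {
  return (parseEnv(envText).get("CLAUDE_CODE_OAUTH_TOKEN") ?? "") !== "";
}
```

Calling `hasClaudeToken` on the contents of the unedited template, where the variable is present but empty, returns false.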