Claude Implement, Codex Review

MIT license · 0 downloads

v1.0.0

A mixed-provider pipeline: a Claude Code agent implements a task, then a Codex reviewer critiques and fixes it on the same warm branch, looping until it approves.

#review-pipeline #mixed-provider #codex #claude-code #docker

Topology

Implement (Claude Code)

implementer

Review (Codex)

reviewer

Disclosures — declared side-effect surface

Everything below runs on your machine or inside the sandbox when you use this workflow. Publishing is blocked if these declarations do not match the actual code.

Host hooks

Commands executed on YOUR host machine by Sandcastle lifecycle hooks.

None declared.

Sandbox hooks

Commands executed inside the sandbox container.

npm install

Network access

None beyond the model provider APIs. Both agents operate only on the local repository inside the sandbox.

Shell expansion

No shell-expansion blocks in prompt files.

Files

Diff vs the stock Sandcastle 0.12.0 template Dockerfile: lines prefixed with + were added by the author, lines prefixed with - were removed from stock.

+# Sandbox image for the Claude Implement, Codex Review workflow.
+# Installs both agent CLIs — Claude Code and Codex — on the stock Sandcastle base.
 FROM node:22-bookworm
  
 # System dependencies.
 RUN apt-get update && apt-get install -y --no-install-recommends \
       git \
       curl \
       jq \
       ca-certificates \
  && rm -rf /var/lib/apt/lists/*
  
-# Claude Code CLI (the agent runtime).
-RUN npm install -g @anthropic-ai/claude-code
+# Agent CLIs: Claude Code (implementer) and Codex (reviewer).
+RUN npm install -g @anthropic-ai/claude-code @openai/codex
  
 # Non-root agent user. `sandcastle docker build-image` aligns AGENT_UID/GID to
 # the host user via --build-arg to avoid permission errors on bind mounts.
 ARG AGENT_UID=1000
 ARG AGENT_GID=1000
 RUN groupadd --gid ${AGENT_GID} agent \
  && useradd --uid ${AGENT_UID} --gid ${AGENT_GID} --create-home --shell /bin/bash agent
  
 USER agent
 WORKDIR /workspace

Full Dockerfile

# Sandbox image for the Claude Implement, Codex Review workflow.
# Installs both agent CLIs — Claude Code and Codex — on the stock Sandcastle base.
FROM node:22-bookworm

# System dependencies.
RUN apt-get update && apt-get install -y --no-install-recommends \
      git \
      curl \
      jq \
      ca-certificates \
 && rm -rf /var/lib/apt/lists/*

# Agent CLIs: Claude Code (implementer) and Codex (reviewer).
RUN npm install -g @anthropic-ai/claude-code @openai/codex

# Non-root agent user. `sandcastle docker build-image` aligns AGENT_UID/GID to
# the host user via --build-arg to avoid permission errors on bind mounts.
ARG AGENT_UID=1000
ARG AGENT_GID=1000
RUN groupadd --gid ${AGENT_GID} agent \
 && useradd --uid ${AGENT_UID} --gid ${AGENT_GID} --create-home --shell /bin/bash agent

USER agent
WORKDIR /workspace

# Auth for the Codex reviewer (always required).
OPENAI_API_KEY=

# Auth for the Claude Code implementer.
# Provide exactly ONE of the following two. Run `claude setup-token` for the
# OAuth token, or use a standard Anthropic API key instead.
CLAUDE_CODE_OAUTH_TOKEN=
ANTHROPIC_API_KEY=

Implement the task

You are the implementer. Read TASK.md in the repository root — it describes a single, well-scoped change to make in this codebase.

Implement the change with small, focused commits. Add or update tests to cover the new behaviour, and keep the existing test suite green.

Do not review your own work adversarially — a separate reviewer agent will do that next. Just make the change correct and well-tested.

When the implementation is complete and the tests pass, output the exact line <promise>COMPLETE</promise> and stop.

import { createSandbox, claudeCode, codex } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

// One warm sandbox shared by both agents. `await using` tears it down on exit;
// if the branch has uncommitted changes the worktree is preserved on disk.
await using sandbox = await createSandbox({
  branch: "agent/codex-review",
  sandbox: docker({ imageName: "sandcastle:codex-reviewer" }),
  hooks: {
    sandbox: {
      onSandboxReady: [{ command: "npm install", timeoutMs: 300_000 }],
    },
  },
});

// 1. Implement with Claude Code.
const implementation = await sandbox.run({
  agent: claudeCode("claude-sonnet-4-6"),
  promptFile: ".sandcastle/implement-prompt.md",
  maxIterations: 5,
});

console.log(`Implementer made ${implementation.commits.length} commit(s).`);

// 2. Review with Codex on the same branch and container, looping until it
//    approves the change.
const review = await sandbox.run({
  agent: codex("gpt-5.4", { effort: "high" }),
  promptFile: ".sandcastle/review-prompt.md",
  maxIterations: 3,
  completionSignal: "<promise>APPROVED</promise>",
});

console.log(`Reviewer finished with signal: ${review.completionSignal ?? "none"}`);

Review and fix the implementation

You are an adversarial reviewer working on the same branch the implementer just finished. Read TASK.md for the original intent, then review every change made against it.

Look hard for missing edge cases, weak or missing tests, regressions, and security issues. When you find a problem, fix it directly on this branch with a focused commit and re-run the tests.

When you are satisfied the change is correct, complete, and well-tested, output the exact line <promise>APPROVED</promise> and stop.

README

Claude Implement, Codex Review

A two-agent, mixed-provider pipeline that shows how to combine Anthropic and OpenAI models in a single Sandcastle workflow. A Claude Code (Sonnet) agent implements the task described in TASK.md, following the instructions in implement-prompt.md. A Codex (gpt-5.4) reviewer then critiques the change and fixes any problems on the same warm branch inside the same container, looping for up to three iterations until it emits <promise>APPROVED</promise>.
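The review loop can be pictured as a capped retry loop that stops as soon as the agent prints its completion signal. The sketch below is illustrative only: `AgentStep`, `hasSignalLine`, `runUntilSignal`, and `fakeReviewer` are hypothetical names, and the matching rule (a whole output line equal to the signal) is an assumption, not necessarily how Sandcastle detects the signal internally.

```typescript
// Hypothetical stand-in for one agent iteration: returns the agent's text output.
type AgentStep = (iteration: number) => Promise<string>;

interface LoopResult {
  iterationsRun: number;
  // The matched signal, or undefined if the iteration cap was reached first.
  completionSignal: string | undefined;
}

// Assumed matching rule: some line of the output equals the signal exactly.
function hasSignalLine(output: string, signal: string): boolean {
  return output.split(/\r?\n/).some((line) => line.trim() === signal);
}

// Run `step` until the signal appears or `maxIterations` is exhausted.
async function runUntilSignal(
  step: AgentStep,
  maxIterations: number,
  signal: string,
): Promise<LoopResult> {
  for (let i = 1; i <= maxIterations; i++) {
    const output = await step(i);
    if (hasSignalLine(output, signal)) {
      return { iterationsRun: i, completionSignal: signal };
    }
  }
  return { iterationsRun: maxIterations, completionSignal: undefined };
}

// A fake reviewer that fixes something on pass 1 and approves on pass 2.
const APPROVED = "<promise>APPROVED</promise>";
const fakeReviewer: AgentStep = async (i) =>
  i < 2 ? "Found a missing edge case; fixing." : `Tests pass.\n${APPROVED}`;
```

With `maxIterations: 3`, a reviewer that never prints the signal ends the loop with `completionSignal` undefined, which corresponds to the "none" branch in the final log line of main.ts.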

Why `createSandbox()`

Both agents run inside one long-lived sandbox created with createSandbox(). The container and its installed dependencies persist between the implement and review steps, so the reviewer sees exactly the state the implementer left behind without paying container start-up costs twice. The await using binding tears the sandbox down automatically when the script exits.
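For readers new to `await using` (TypeScript 5.2+ explicit resource management): at scope exit the runtime calls the bound object's `[Symbol.asyncDispose]()` method, even if an earlier statement threw. A rough try/finally equivalent is sketched below. `FakeSandbox`, `createFakeSandbox`, and `pipeline` are hypothetical stand-ins for illustration, not Sandcastle types.

```typescript
// Hypothetical stand-in for a sandbox handle; it records what happened to it.
interface FakeSandbox {
  log: string[];
  run(step: string): Promise<void>;
  close(): Promise<void>;
}

function createFakeSandbox(): FakeSandbox {
  const log: string[] = ["start container"];
  return {
    log,
    async run(step: string): Promise<void> {
      log.push(`run ${step}`);
    },
    async close(): Promise<void> {
      log.push("tear down");
    },
  };
}

// Both steps share one sandbox; cleanup runs once, whether or not a step throws.
async function pipeline(sandbox: FakeSandbox): Promise<void> {
  try {
    await sandbox.run("implement");
    await sandbox.run("review");
  } finally {
    await sandbox.close();
  }
}
```

The point of the shape is that "start container" appears once in the log for two `run` calls, which is the start-up saving the paragraph above describes.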

Environment variables

This workflow needs credentials for both providers:

The Codex reviewer requires OPENAI_API_KEY (always).
The Claude Code implementer requires one of CLAUDE_CODE_OAUTH_TOKEN (from claude setup-token) or ANTHROPIC_API_KEY. Both are listed as optional in the manifest because either one satisfies the implementer — set exactly one.
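The credential rules above are easy to get wrong, so a small preflight check before starting the sandbox may help. This is a sketch under assumptions: `checkCredentials` is a hypothetical helper that is not part of the workflow, and it treats blank values as unset.

```typescript
type Env = Record<string, string | undefined>;

// Returns a list of problems; an empty list means the credentials look usable.
function checkCredentials(env: Env): string[] {
  const isSet = (name: string): boolean => (env[name] ?? "").trim() !== "";
  const problems: string[] = [];

  // The Codex reviewer always needs an OpenAI key.
  if (!isSet("OPENAI_API_KEY")) {
    problems.push("OPENAI_API_KEY is required for the Codex reviewer.");
  }

  // The Claude Code implementer needs exactly one of these two.
  const claudeAuth = ["CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_API_KEY"].filter(isSet);
  if (claudeAuth.length === 0) {
    problems.push("Set CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_API_KEY for the implementer.");
  } else if (claudeAuth.length === 2) {
    problems.push("Set only one of CLAUDE_CODE_OAUTH_TOKEN and ANTHROPIC_API_KEY.");
  }

  return problems;
}
```

In a Node script this could be called as `checkCredentials(process.env)` at the top of main.ts, exiting early if the returned list is non-empty.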

Requirements

Fill in .sandcastle/.env, build the image with npx @ai-hero/sandcastle docker build-image, then run npx tsx .sandcastle/main.ts. No network access beyond the model APIs is required — the agents only touch the local repository.

Install

npx runcastle add castellan-demo/codex-reviewer

Writes files into your repo and prints instructions. Nothing executes at install time; the preview shows every disclosure first.

Compatibility

Sandcastle: >=0.12.0 <0.13.0
Package: @ai-hero/sandcastle
Entrypoint: .sandcastle/main.ts
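The range `>=0.12.0 <0.13.0` admits any 0.12.x release and nothing else. A minimal sketch of that check follows; `parseVersion`, `compareVersions`, and `isCompatible` are hypothetical helpers, and pre-release tags such as `0.12.0-beta.1` are deliberately rejected here rather than handled with full semver rules.

```typescript
type Version = readonly [number, number, number];

// Parse plain "major.minor.patch"; anything else (including pre-release tags) is rejected.
function parseVersion(v: string): Version | undefined {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v.trim());
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : undefined;
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions([a1, a2, a3]: Version, [b1, b2, b3]: Version): number {
  return a1 - b1 || a2 - b2 || a3 - b3;
}

// True when the installed version satisfies >=0.12.0 <0.13.0.
function isCompatible(installed: string): boolean {
  const v = parseVersion(installed);
  if (v === undefined) return false;
  return compareVersions(v, [0, 12, 0]) >= 0 && compareVersions(v, [0, 13, 0]) < 0;
}
```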

Sandbox providers

docker

Environment variables

Variable	Required	Description
CLAUDE_CODE_OAUTH_TOKEN	optional	Auth for the Claude Code implementer. Provide this (run `claude setup-token`) OR ANTHROPIC_API_KEY — one of the two is required.
ANTHROPIC_API_KEY	optional	Alternative auth for the Claude Code implementer. Provide this OR CLAUDE_CODE_OAUTH_TOKEN — one of the two is required.
OPENAI_API_KEY	required	Auth for the Codex reviewer.

Post-install notes

Set OPENAI_API_KEY plus one of CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_API_KEY in `.sandcastle/.env`, then run `npx @ai-hero/sandcastle docker build-image` and `npx tsx .sandcastle/main.ts`.

Versions

v1.0.0 · Jul 4, 2026 · bbbbbbb

Source: castellan-demo/workflows
