Deep Research Report

License: MIT. Downloads: 0.

v1.0.0

Fans out one searcher per sub-question to fetch public web sources and take notes. An Opus synthesizer then writes a fully cited report, and a Codex fact-checker adversarially verifies every claim against the notes, looping until the report holds up.

#research #fan-out #citations #mixed-provider #codex

Topology

Search (×N): searcher → Synthesize report: synthesizer → Fact-check (Codex): fact-checker

Disclosures

Disclosures — declared side-effect surface

Everything below runs on your machine or inside the sandbox when you use this workflow. If these declarations do not match the actual code, the workflow cannot be published.

Host hooks

Commands executed on YOUR host machine by Sandcastle lifecycle hooks.

cp .env.example .env

Sandbox hooks

Commands executed inside the sandbox container.

None declared.

Network access

Searchers fetch public web pages via curl.

Shell expansion

Prompt files contain !`command` blocks — the agent CLI executes these commands at prompt-load time. They are highlighted amber in the prompt files below.
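Because these blocks run at prompt-load time, it can help to list them before a first run. The sketch below is one way to do that, assuming the blocks follow the !`command` form described above and never contain a backtick of their own. `findShellExpansions` is a hypothetical helper, not part of Sandcastle or either agent CLI.

```typescript
// Hypothetical helper: list every !`command` block found in a prompt's text.
// Assumes a command never contains a backtick of its own.
function findShellExpansions(promptText: string): string[] {
  const pattern = /!`([^`]+)`/g;
  const commands: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(promptText)) !== null) {
    commands.push(match[1].trim());
  }
  return commands;
}

// Made-up prompt text for illustration only.
const samplePrompt = [
  "A starting source",
  "!`curl -sL --max-time 30 https://example.com`",
  "Plain `inline code` is not a shell expansion.",
].join("\n");

console.log(findShellExpansions(samplePrompt));
```

Reading each prompt file and passing its text through a helper like this gives a reviewable list of commands that would run on load.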

Files

Diff vs the stock Sandcastle 0.12.0 template Dockerfile: lines prefixed with + (green) were added by the author, and lines prefixed with - (red) were removed from stock.

+# Sandbox image for the Deep Research Report workflow.
+# Node 22 + git + curl (searchers fetch public web pages) + both agent CLIs —
+# Claude Code (searchers + synthesizer) and Codex (fact-checker) — running as a
+# non-root `agent` user.
 FROM node:22-bookworm
  
-# System dependencies.
 RUN apt-get update && apt-get install -y --no-install-recommends \
       git \
       curl \
       jq \
       ca-certificates \
  && rm -rf /var/lib/apt/lists/*
  
-# Claude Code CLI (the agent runtime).
-RUN npm install -g @anthropic-ai/claude-code
+# Agent CLIs: Claude Code (searchers + synthesizer) and Codex (fact-checker).
+RUN npm install -g @anthropic-ai/claude-code @openai/codex
  
 # Non-root agent user. `sandcastle docker build-image` aligns AGENT_UID/GID to
 # the host user via --build-arg to avoid permission errors on bind mounts.
 ARG AGENT_UID=1000
 ARG AGENT_GID=1000
 RUN groupadd --gid ${AGENT_GID} agent \
  && useradd --uid ${AGENT_UID} --gid ${AGENT_GID} --create-home --shell /bin/bash agent
  
 USER agent
 WORKDIR /workspace

Full Dockerfile:

# Sandbox image for the Deep Research Report workflow.
# Node 22 + git + curl (searchers fetch public web pages) + both agent CLIs —
# Claude Code (searchers + synthesizer) and Codex (fact-checker) — running as a
# non-root `agent` user.
FROM node:22-bookworm

RUN apt-get update && apt-get install -y --no-install-recommends \
      git \
      curl \
      jq \
      ca-certificates \
 && rm -rf /var/lib/apt/lists/*

# Agent CLIs: Claude Code (searchers + synthesizer) and Codex (fact-checker).
RUN npm install -g @anthropic-ai/claude-code @openai/codex

# Non-root agent user. `sandcastle docker build-image` aligns AGENT_UID/GID to
# the host user via --build-arg to avoid permission errors on bind mounts.
ARG AGENT_UID=1000
ARG AGENT_GID=1000
RUN groupadd --gid ${AGENT_GID} agent \
 && useradd --uid ${AGENT_UID} --gid ${AGENT_GID} --create-home --shell /bin/bash agent

USER agent
WORKDIR /workspace

# Auth for the Claude Code searchers and synthesizer.
# Run `claude setup-token` on your host to generate a token, then paste it here
# in your local .sandcastle/.env (never commit the real .env).
CLAUDE_CODE_OAUTH_TOKEN=

# Auth for the Codex fact-checker. Paste your OpenAI API key here in your local
# .sandcastle/.env (never commit the real .env).
OPENAI_API_KEY=

import { run, claudeCode, codex } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

const IMAGE = "sandcastle:deep-research-report";

// Edit these to your real research question and its sub-questions. Each
// sub-question fans out into its own searcher on its own branch, so add or
// remove freely. The synthesizer reads every committed notes file afterward.
const RESEARCH_QUESTION =
  "What are the durable competitive advantages of vertically integrated EV makers?";
const SUBQUESTIONS = [
  "How do battery manufacturing and supply chains create cost advantages?",
  "What role does proprietary charging infrastructure play in retention?",
  "How does software and over-the-air updating differentiate the product?",
];

// 1. Search — fan out one Sonnet searcher per sub-question, each on its own
//    branch. Distinct branches make the parallel fan-out safe. The host hook
//    seeds .env into every worktree. Each searcher fetches sources with `curl`
//    (see search-prompt.md) and writes research/notes-<i>.md.
const searches = await Promise.all(
  SUBQUESTIONS.map((subquestion, i) =>
    run({
      name: `search-q-${i + 1}`,
      agent: claudeCode("claude-sonnet-4-6"),
      sandbox: docker({ imageName: IMAGE }),
      promptFile: ".sandcastle/search-prompt.md",
      promptArgs: {
        RESEARCH_QUESTION,
        SUBQUESTION: subquestion,
        NOTES_INDEX: String(i + 1),
      },
      branchStrategy: { type: "branch", branch: `research/q-${i + 1}` },
      maxIterations: 3,
      hooks: {
        host: {
          onWorktreeReady: [{ command: "cp .env.example .env" }],
        },
      },
    }),
  ),
);

console.log(`Completed ${searches.length} searcher(s).`);

// 2. Synthesize — a single Opus synthesizer reads every committed notes file and
//    writes the cited report. (In a real pipeline you would merge the research/*
//    branches to HEAD first; here the synthesizer runs on HEAD and the prompt
//    points it at the research/ directory the searchers produced.)
const report = await run({
  name: "synthesize-report",
  agent: claudeCode("claude-opus-4-8", { effort: "high" }),
  sandbox: docker({ imageName: IMAGE }),
  promptFile: ".sandcastle/synthesize-prompt.md",
  promptArgs: { RESEARCH_QUESTION },
  maxIterations: 1,
  hooks: {
    host: {
      onWorktreeReady: [{ command: "cp .env.example .env" }],
    },
  },
});

console.log(`Report written in ${report.commits.length} commit(s).`);

// 3. Fact-check — a Codex fact-checker adversarially verifies every claim in the
//    report against the notes, flagging or fixing unsupported statements, and
//    loops until it is satisfied.
const verify = await run({
  name: "fact-check",
  agent: codex("gpt-5.4", { effort: "high" }),
  sandbox: docker({ imageName: IMAGE }),
  promptFile: ".sandcastle/verify-prompt.md",
  maxIterations: 4,
  completionSignal: "<promise>VERIFIED</promise>",
  hooks: {
    host: {
      onWorktreeReady: [{ command: "cp .env.example .env" }],
    },
  },
});

console.log(`Fact-checker finished with signal: ${verify.completionSignal ?? "none"}`);
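The comment in step 2 notes that a real pipeline would merge the research/* branches into HEAD before synthesis. A minimal sketch of that step follows, assuming the branch names match the `research/q-<i>` pattern used in the fan-out above and that a plain `git merge --no-edit` is acceptable. `mergeResearchBranches` and the injected `runGit` callback are hypothetical names, not Sandcastle APIs.

```typescript
// Hypothetical helper, not part of Sandcastle: merge each searcher branch
// into the current branch so the synthesizer can see every notes file.
type GitRunner = (args: string[]) => void;

function researchBranches(count: number): string[] {
  // Mirrors the `research/q-${i + 1}` naming used by the searchers.
  const branches: string[] = [];
  for (let i = 1; i <= count; i++) {
    branches.push(`research/q-${i}`);
  }
  return branches;
}

function mergeResearchBranches(count: number, runGit: GitRunner): string[] {
  const merged: string[] = [];
  for (const branch of researchBranches(count)) {
    // --no-edit accepts the default merge message without opening an editor.
    runGit(["merge", "--no-edit", branch]);
    merged.push(branch);
  }
  return merged;
}

// Dry run: record the git arguments instead of executing them.
const planned: string[][] = [];
mergeResearchBranches(3, (args) => {
  planned.push(args);
});
console.log(planned.map((args) => `git ${args.join(" ")}`).join("\n"));
```

To run the merges for real, one option is to pass a callback that wraps Node's `execFileSync("git", args, { stdio: "inherit" })` from `node:child_process`. Each searcher is asked to write a distinct `research/notes-<i>.md`, so the merges would not be expected to conflict on the notes files, though that depends on the searchers touching nothing else.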

Research searcher: sub-question {{NOTES_INDEX}}

You are a research analyst working on one slice of a larger question.

Overall research question: {{RESEARCH_QUESTION}}

Your assigned sub-question: {{SUBQUESTION}}

A starting source

The line below fetches a public web page to seed your research. Treat it as one input, not the whole answer — read it, then fetch additional sources yourself with the Bash tool (for example curl -sL --max-time 30 <url>) to corroborate and broaden your findings.

!`curl -sL --max-time 30 "https://en.wikipedia.org/wiki/Special:Search?search=$(echo '{{SUBQUESTION}}' | tr ' ' '+')&go=Go"`

Your task

Investigate {{SUBQUESTION}} and write your findings to research/notes-{{NOTES_INDEX}}.md with these sections:

Sub-question — restate what you were asked to investigate.
Key findings — 4–8 bullet points, each a specific, checkable claim.
Evidence — for every finding, cite the source URL and quote the exact sentence or figure that supports it. If you could not find support for a claim, mark it [UNVERIFIED] rather than dropping it.
Open questions — what you could not confirm and why.

Fetch multiple sources. Prefer primary and reputable sources. Never invent a URL or a quotation — if a fetch fails or returns little usable content, say so explicitly. Ground every finding in something you actually retrieved.

Commit research/notes-{{NOTES_INDEX}}.md with a message like research: notes for sub-question {{NOTES_INDEX}}, then output <promise>COMPLETE</promise>.
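The seed command in the prompt above turns spaces into `+` with `tr`, which leaves other reserved characters such as `&` untouched. If the placeholder is substituted textually, an apostrophe in the sub-question would also end the single-quoted shell string early. If the seed URL were built in TypeScript instead, the standard `URL` and `URLSearchParams` APIs would encode the whole query. This is only a sketch under that assumption: `wikipediaSearchUrl` is a hypothetical helper, and the workflow itself does not do this.

```typescript
// Hypothetical helper: build the Wikipedia search URL with full query encoding.
function wikipediaSearchUrl(subquestion: string): string {
  const url = new URL("https://en.wikipedia.org/wiki/Special:Search");
  // searchParams.set percent-encodes reserved characters in the value.
  url.searchParams.set("search", subquestion);
  url.searchParams.set("go", "Go");
  return url.toString();
}

console.log(
  wikipediaSearchUrl("How does software & OTA updating differentiate the product?"),
);
```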

Synthesize the research report

You are a senior research editor. Read every notes file in the research/ directory (research/notes-*.md) and synthesize them into one coherent, well-cited report.

Overall research question: {{RESEARCH_QUESTION}}

Write research/REPORT.md with these sections:

Executive summary — the answer to the research question in 3–5 sentences, with the confidence level you have in it.
Findings by theme — organize the strongest findings across all notes files into themes (not one section per sub-question). Every claim must carry an inline citation to a source URL that appears in the notes.
Where sources agree / disagree — call out corroboration and any contradictions between the notes files.
Confidence & gaps — what is well-supported, what rests on a single source, and what remains unverified.
Sources — a deduplicated list of every URL cited, grouped by sub-question.

Only use claims and URLs that appear in the notes files. Do not introduce new facts, and do not invent citations. If the notes flagged something [UNVERIFIED], keep that flag in the report. If a sub-question's notes are thin or missing, note the gap rather than filling it with speculation.

Commit research/REPORT.md with the message research: synthesized report, then output <promise>COMPLETE</promise>.

Adversarial fact-check

You are a skeptical fact-checker. Your job is to verify that every claim in research/REPORT.md is actually supported by the evidence in the research/ notes files (research/notes-*.md) — and to fix the report where it is not.

Procedure

1. Read research/REPORT.md and all research/notes-*.md files.
2. For each substantive claim in the report, find the citation and confirm the cited notes file genuinely contains a quote or figure that supports it.
3. Flag any claim that is:
   - unsupported by any notes file, or
   - stronger than its evidence (overstated certainty), or
   - citing a URL that does not appear in the notes.
4. Fix the report: soften overstated claims, add [UNSUPPORTED] inline where a claim cannot be backed, correct or remove bogus citations, and preserve every [UNVERIFIED] flag the searchers raised.
5. Append a short Fact-check log section to research/REPORT.md listing what you changed and why.

Do not invent new supporting evidence and do not add new claims. Your only sources of truth are the notes files that already exist.

Commit your changes to research/REPORT.md with a message like research: fact-check pass. When the report is fully consistent with the evidence and no unsupported claims remain unflagged, output <promise>VERIFIED</promise>.
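One of the checks in the prompt above, a citation whose URL does not appear in the notes, is mechanical enough to sketch in code. The example assumes URLs can be picked out with a simple pattern and compared as exact strings. `extractUrls` and `urlsMissingFromNotes` are hypothetical helpers, not something the fact-checker prompt relies on.

```typescript
// Hypothetical helper: pull http(s) URLs out of free text.
function extractUrls(text: string): string[] {
  const matches = text.match(/https?:\/\/[^\s)\]>"']+/g) ?? [];
  // Drop sentence punctuation that often trails a URL in prose.
  return matches.map((url) => url.replace(/[.,;:]+$/, ""));
}

// Hypothetical helper: URLs cited in the report that no notes file contains.
function urlsMissingFromNotes(report: string, notes: string[]): string[] {
  const known = new Set<string>();
  for (const note of notes) {
    for (const url of extractUrls(note)) known.add(url);
  }
  const missing: string[] = [];
  for (const url of extractUrls(report)) {
    if (!known.has(url) && missing.indexOf(url) === -1) missing.push(url);
  }
  return missing;
}

// Made-up report and notes text for illustration only.
const sampleReport = "Cells are cheaper (https://a.example/cells). Retention rose (https://b.example/retention).";
const sampleNotes = ["Evidence: https://a.example/cells says costs fell."];
console.log(urlsMissingFromNotes(sampleReport, sampleNotes));
```

Exact string comparison is a deliberate simplification: two URLs that differ only by a trailing slash or a tracking parameter would be reported as a mismatch.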

README

Deep Research Report

Turn one big question into a cited, fact-checked report — without opening fifty browser tabs. This workflow fans out a searcher per sub-question, synthesizes the notes into a single report, then runs an adversarial fact-checker that verifies every claim against the evidence before you trust a word of it.

What it does

You give it a research question and a handful of sub-questions. For each sub-question, a Claude Code searcher fetches public web sources, extracts specific checkable claims, and records each one with the source URL and an exact supporting quote in research/notes-<i>.md. Anything it can't confirm is flagged [UNVERIFIED] rather than dropped.

A Claude Code synthesizer then reads every notes file and writes research/REPORT.md: an executive summary, findings organized by theme with inline citations, where sources agree and disagree, and an explicit confidence-and-gaps section. Finally a Codex fact-checker adversarially audits the report against the notes — softening overstated claims, flagging unsupported ones, and killing bogus citations — looping until the report holds up, then emitting <promise>VERIFIED</promise>.
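The fact-check loop described above is bounded by `maxIterations` in main.ts and ends early once the agent emits the completion signal. The sketch below illustrates that control flow only. It is not Sandcastle's implementation, and `runUntilSignal` and the `step` callback are hypothetical stand-ins for one agent iteration.

```typescript
type LoopResult = { iterations: number; verified: boolean };

// Hypothetical loop: call `step` until its output contains `signal`
// or the iteration budget is used up.
async function runUntilSignal(
  step: (iteration: number) => Promise<string>,
  signal: string,
  maxIterations: number,
): Promise<LoopResult> {
  for (let i = 1; i <= maxIterations; i++) {
    const output = await step(i);
    if (output.includes(signal)) {
      return { iterations: i, verified: true };
    }
  }
  return { iterations: maxIterations, verified: false };
}

// Fake agent that only declares success on its third pass.
const fakeStep = async (i: number): Promise<string> =>
  i < 3 ? "softened two overstated claims" : "all clear <promise>VERIFIED</promise>";

runUntilSignal(fakeStep, "<promise>VERIFIED</promise>", 4).then((result) => {
  console.log(result);
});
```

A run that exhausts its budget without the signal comes back with `verified: false`, which is the case worth checking for before trusting the report.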

How it works

main.ts fans out the searchers with Promise.all, each on its own research/q-<i> branch so the parallel runs never collide. Each searcher seeds itself with a !`curl` block in the prompt — this is why the manifest discloses network access and shell expansion. The synthesizer then runs once on HEAD, and the Codex fact-checker loops on its own output. The topology is a fan-out into a verify loop (searcher ×N → synthesizer → fact-checker↺).
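`Promise.all` rejects as soon as any one searcher fails. If partial results are acceptable, `Promise.allSettled` is one alternative. The sketch below shows the pattern with a stand-in task rather than Sandcastle's `run`. `fanOut` and `fakeSearch` are hypothetical names.

```typescript
type FanOutResult<T> = { fulfilled: T[]; failures: string[] };

// Hypothetical helper: run one task per input in parallel and keep
// successes and failures separate instead of failing the whole batch.
async function fanOut<I, T>(
  inputs: I[],
  task: (input: I, index: number) => Promise<T>,
): Promise<FanOutResult<T>> {
  const settled = await Promise.allSettled(inputs.map((input, i) => task(input, i)));
  const result: FanOutResult<T> = { fulfilled: [], failures: [] };
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") {
      result.fulfilled.push(outcome.value);
    } else {
      result.failures.push(`input ${i + 1}: ${String(outcome.reason)}`);
    }
  });
  return result;
}

// Stand-in for one searcher run; the second one fails on purpose.
const fakeSearch = async (question: string, i: number): Promise<string> => {
  if (i === 1) throw new Error("fetch timed out");
  return `research/notes-${i + 1}.md for "${question}"`;
};

fanOut(["batteries", "charging", "software"], fakeSearch).then((r) => console.log(r));
```

Whether to continue with a missing sub-question is a judgment call. The synthesizer prompt already asks for gaps to be noted rather than filled, which fits the partial-results approach.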

Requirements

Set CLAUDE_CODE_OAUTH_TOKEN (run claude setup-token) and OPENAI_API_KEY in .sandcastle/.env. Edit RESEARCH_QUESTION and the SUBQUESTIONS list at the top of .sandcastle/main.ts. Build the image once with npx @ai-hero/sandcastle docker build-image, then run npx tsx .sandcastle/main.ts. Everything the searchers fetch is public web content; nothing is sent anywhere but the models.

Install

npx runcastle add runcastle/[email protected]

Writes files into your repo and prints instructions. Nothing executes at install time; the preview shows every disclosure first.

Compatibility

Sandcastle: >=0.12.0 <0.13.0
Package: @ai-hero/sandcastle
Entrypoint: .sandcastle/main.ts

Sandbox providers

docker

Environment variables

| Variable | Required | Description |
| --- | --- | --- |
| CLAUDE_CODE_OAUTH_TOKEN | required | Auth for the Claude Code searchers and synthesizer. Run `claude setup-token` on your host. |
| OPENAI_API_KEY | required | Auth for the Codex fact-checker. |
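Both variables are required, so a quick check before the first run can catch an empty value early. A minimal sketch, assuming the values have already been loaded into an object such as `process.env`. `missingEnvVars` is a hypothetical helper and not part of Sandcastle.

```typescript
const REQUIRED_VARS = ["CLAUDE_CODE_OAUTH_TOKEN", "OPENAI_API_KEY"] as const;

// Hypothetical helper: names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with a placeholder token and an empty API key.
console.log(missingEnvVars({ CLAUDE_CODE_OAUTH_TOKEN: "placeholder", OPENAI_API_KEY: "" }));
```

Treating a blank value as missing matters here because the shipped template leaves both assignments empty.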

Post-install notes

Edit RESEARCH_QUESTION and the SUBQUESTIONS list at the top of `.sandcastle/main.ts`, set CLAUDE_CODE_OAUTH_TOKEN and OPENAI_API_KEY in `.sandcastle/.env`, then run `npx @ai-hero/sandcastle docker build-image` and `npx tsx .sandcastle/main.ts`. The report lands at `research/REPORT.md`.

Versions

v1.0.1
v1.0.0 · Jul 4, 2026 · ccccccc · viewing

Source: runcastle/workflows