Compound engineering for the plan and review loop

5 min read · ai · tooling · code-review

The bottleneck has moved to review

A coding agent produces a plan in seconds and a diff in minutes. Reading them takes the rest of the session. Generation is cheap now; deciding what’s worth shipping is not.

The problem sharpens when two agents run at once: Claude Code in one terminal, Codex in another, both touching the same repo. Within minutes one is building on top of the other’s uncommitted changes. A plan comes back as eighty bullet points to read in a terminal. A large diff lands, and a margin note that should have been raised before the build now costs a re-plan.

Three tools target the plan-and-review loop. Each wraps a primitive that already exists (git worktrees, the agent’s own plan output, the brainstorming conversation) and stays out of the judgement itself. Plenty of agentic tooling automates that part with auto-approve and auto-merge. These go the other way. They cut the friction around reading and reviewing so the human pass happens on every change, not just the changes you remember to look at.

Parallel branches with worktrunk

Two agents in the same checkout collide fast. Git worktrees solve it: one branch per directory, all sharing one object store. worktrunk (binary wt) is a thin Rust CLI that makes the lifecycle ergonomic enough to actually use:

wt switch -c add-search   # create branch + worktree, cd into it
wt list                   # show every active worktree
wt merge main             # merge current branch into main, then clean up its worktree
wt remove                 # remove worktree, delete branch if merged

In vanilla git the same flow is git worktree add ../foo feature/foo && cd ../foo && ... followed by manual cleanup of the worktree, the branch, and the directory. wt collapses each step into one verb. Because a child process cannot change its parent shell’s directory, wt relies on shell integration to actually move you into the new worktree.
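For comparison, the vanilla lifecycle can be spelled out end to end. This is a minimal sketch using only stock git commands in a throwaway repository; the branch name add-search and the sibling directory repo-add-search are illustrative choices, not anything worktrunk requires:

```shell
#!/bin/sh
set -eu

# Throwaway repository so the sketch touches nothing real.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# Create: a new branch plus a sibling directory checked out on it.
git worktree add -b add-search ../repo-add-search

# List: one line per worktree, the main checkout included.
git worktree list
worktree_count=$(git worktree list --porcelain | grep -c '^worktree ')

# Clean up: the worktree and the branch are removed in separate steps.
git worktree remove ../repo-add-search
git branch -d add-search
remaining=$(git worktree list --porcelain | grep -c '^worktree ')
echo "worktrees before cleanup: $worktree_count, after: $remaining"
```

Nothing here is hard, but every step is a separate command to remember, and none of them changes the directory of the shell you are typing in.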

Each Claude Code or Codex session gets its own worktree, with its own installed dependencies, its own dev server port, its own dirty state:

$ wt list
* main           /Users/me/repo
  add-search     /Users/me/repo-add-search       (claude-code)
  fix-tokens     /Users/me/repo-fix-tokens       (codex)
  prep-release   /Users/me/repo-prep-release
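Separate dev server ports have to come from somewhere. One hypothetical way to get a stable port per worktree is to derive it from the branch name. The helper port_for and the 3000-3999 range below are assumptions for illustration, not a worktrunk feature:

```shell
#!/bin/sh
set -eu

# port_for BRANCH: map a branch name to a port in 3000-3999.
# Hypothetical helper, not part of wt. The mapping is stable for a given
# name, but two different branches can still land on the same port.
port_for() {
    crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( 3000 + crc % 1000 ))
}

for branch in add-search fix-tokens prep-release; do
    printf '%-14s %s\n' "$branch" "$(port_for "$branch")"
done
```

Because the port depends only on the name, restarting a session in the same worktree reuses the same port without any shared state between agents.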

When a branch merges, wt remove deletes the worktree and the branch in one step. The discipline is to keep wt list empty of merged branches. Stale worktrees pile up fast and bring back the collisions the worktrees were meant to prevent.
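Checking for stale worktrees can also be scripted with plain git. The sketch below is an assumption about how such a check might look, not something wt ships: the function merged_worktrees is hypothetical, and it treats a branch as merged when its tip is already contained in the base branch.

```shell
#!/bin/sh
set -eu

# merged_worktrees BASE: print "branch<TAB>path" for every worktree whose
# branch is already contained in BASE. Hypothetical helper on plain git.
merged_worktrees() {
    base=$1
    tab=$(printf '\t')
    git worktree list --porcelain | awk '
        /^worktree /            { path = substr($0, 10) }
        /^branch refs\/heads\// { print substr($0, 19) "\t" path }
    ' | while IFS=$tab read -r branch path; do
        if [ "$branch" != "$base" ] &&
           git merge-base --is-ancestor "$branch" "$base"; then
            printf '%s\t%s\n' "$branch" "$path"
        fi
    done
}

# Demo in a throwaway repository: one merged branch, one with unmerged work.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"
git worktree add -b add-search ../repo-add-search
git worktree add -b fix-tokens ../repo-fix-tokens
git -C ../repo-fix-tokens -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "work in progress"

# Only add-search is reported; fix-tokens still carries unmerged work.
stale=$(merged_worktrees main)
printf '%s\n' "$stale"
```

One caveat of this approach: a branch that was just created and has no commits of its own also counts as merged, so the output is a list of candidates to review rather than a list to delete blindly.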

Spec-to-plan with superpowers

A plan is only as good as the spec it came from. A bullet list assembled from a one-line prompt isn’t a spec.

superpowers ships a methodology rather than a tool. The skills auto-trigger when you start describing a feature — you don’t invoke them by name. Four matter for this loop:

  • brainstorming — runs before any creative work. Teases a spec out of the conversation in chunks short enough to read, instead of jumping to code.
  • writing-plans — turns the signed-off spec into an implementation plan structured for TDD, with each step narrow enough that a junior could follow it.
  • executing-plans — runs the plan in a separate session with review checkpoints between steps.
  • subagent-driven-development — fans independent steps out to subagents so the main session keeps its context clean.

A writing-plans output looks something like this:

## Plan: Add full-text search to blog

1. Add `fuse.js` dependency
2. Create `SearchIndex.astro` that builds a JSON index at build time
3. Create `SearchBox.svelte` — input field, debounced query, result list
4. Wire `SearchBox` into the header layout
5. Add test: build succeeds, index contains all non-draft posts

Each step is narrow enough to review in isolation and small enough that a wrong step costs one revision, not a re-plan.

Plan review with plannotator

A wrong abstraction, a missed edge case, intent read backwards: these show up in the plan first. They’re cheap to fix there, and expensive once the 600-line diff lands.

plannotator renders a Claude Code plan as a local webpage you can annotate in the margin, then ships your feedback back to the agent. It installs as a plugin, so the surface is slash commands:

/plannotator-review     # annotate a PR diff
/plannotator-annotate   # annotate a plan markdown
/plannotator-last       # annotate the last rendered assistant message
/plannotator-archive    # browse saved plan decisions

A hook picks up plans automatically when the agent enters plan mode — no manual export. Margin comments come back as a follow-up prompt in the same session, so a note like “split this step into a separate PR” reaches the agent without you retyping anything. The archive keeps every annotated plan around, which turns the review pass into something you can revisit when a decision later looks wrong.

How they compose

The output of each tool is the input to the next.

You describe a feature. brainstorming asks three rounds of questions and produces a signed-off spec. writing-plans turns that spec into a five-step plan. plannotator opens the plan in a browser; you annotate two steps, and the corrections ship back as a follow-up prompt. The agent revises the plan in the same session.

wt switch -c add-search creates an isolated worktree. The agent works through the revised plan with checkpoints between steps. A second agent can run on a different branch in the same repo without colliding. When the diff lands, plannotator reopens the same annotation surface for the PR — the review pass that started on the plan continues on the code.

What changes

Specs get written. The agent doesn’t jump to code; it draws out intent first. The plan reflects the spec, not the agent’s first guess.

Plans get read. The cheapest artifact takes the heaviest review pass. The diff arrives smaller, with fewer surprises.

Branches stay isolated. Two agents can work in parallel without building on top of each other’s dirty state.

Judgement stays in the loop. Auto-approve removes the human from review entirely. These tools cut the friction around review so it actually happens.