Running Plans

Monitor agent sessions, handle failures, merge results, and manage worktrees.

Monitoring with tmux

By default, workbench runs agents in tmux sessions for live monitoring. You can attach to any agent session to watch it work:

tmux attach -t wb-task-1-implementor
tmux attach -t wb-task-2-tester

Sessions are named wb-task-<N>-<role> where role is one of implementor, tester, reviewer, fixer, or merger.
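The naming pattern above can be sketched as a small shell helper. This is a minimal illustration, not part of workbench: the function name wb_session_name is hypothetical, and the only facts taken from this guide are the wb-task-<N>-<role> pattern and the five role names.

```shell
# Hypothetical helper (not shipped with workbench): build a session name
# that follows the documented wb-task-<N>-<role> pattern.
wb_session_name() {
  task_number=$1
  role=$2
  # Accept only the roles listed in this guide.
  case "$role" in
    implementor|tester|reviewer|fixer|merger) ;;
    *) echo "unknown role: $role" >&2; return 1 ;;
  esac
  # Reject empty or non-numeric task numbers.
  case "$task_number" in
    ''|*[!0-9]*) echo "task number must be numeric: $task_number" >&2; return 1 ;;
  esac
  printf 'wb-task-%s-%s\n' "$task_number" "$role"
}

wb_session_name 2 tester   # prints wb-task-2-tester
```

With the helper defined, attaching could look like tmux attach -t "$(wb_session_name 2 tester)".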

To run without tmux, use the --no-tmux flag:

wb run plan.md --no-tmux

This runs agents as subprocesses instead. Output is still captured, but you can't attach to individual sessions.

Selective Execution

Plans don't have to run end-to-end. Workbench can target individual waves, ranges, or specific tasks. Most of these commands accept a plan name (the folder slug under .workbench/) in place of a path — wb run myplan resolves to .workbench/myplan/plan.md.
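The name-to-path rule above can be sketched in shell. The helper name resolve_plan is hypothetical, and treating any existing file as a path (before falling back to the name lookup) is an assumption; workbench's actual resolution order is not described in this guide.

```shell
# Hypothetical helper: map a plan argument to a plan file path.
# Assumption: an argument that is an existing file is used as-is;
# anything else is treated as a plan name under .workbench/.
resolve_plan() {
  arg=$1
  if [ -f "$arg" ]; then
    printf '%s\n' "$arg"
  else
    printf '.workbench/%s/plan.md\n' "$arg"
  fi
}

resolve_plan myplan
```

When no file named myplan exists in the current directory, this prints .workbench/myplan/plan.md, matching the example in the text.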

Wave subsets

# Run only wave 2
wb run myplan -w 2
 
# Run a contiguous range of waves
wb run myplan --start-wave 2 --end-wave 4
 
# Start from wave 3 and run through the end
wb run myplan --start-wave 3

Task subsets

# Single task by id
wb run myplan --task task-2
 
# Multiple tasks (flag repeats)
wb run myplan --task task-1 --task task-3
 
# Match by slug (lowercased task title)
wb run myplan --task add-jwt-auth

--task accepts task IDs or slugs and is repeatable. Only the named tasks run; the rest of the plan is skipped.
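To make the slug form concrete, here is a minimal sketch of how a title might map to a slug. The guide only says a slug is the lowercased task title; replacing runs of non-alphanumeric characters with a single hyphen is an assumption inferred from the add-jwt-auth example, and task_slug is a hypothetical name.

```shell
# Hypothetical helper: derive a slug from a task title.
# Assumption: lowercase the title, turn each run of non-alphanumeric
# characters into one hyphen, and trim hyphens at either end.
task_slug() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9][^a-z0-9]*/-/g' -e 's/^-//' -e 's/-$//'
}

task_slug "Add JWT auth"   # prints add-jwt-auth
```

If workbench's own slug rules differ (for example in how it truncates long titles), prefer the task ID, which is unambiguous.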

Skip completed tasks

wb run myplan -b workbench-1 --only-incomplete

--only-incomplete reads the session's status file and skips any task that already finished. It requires -b to specify which session branch to continue.

See CLI Reference for the full list of flags that wb run accepts.

Handling Failures

If some tasks fail, you have several options:

Resume a session

The easiest way to resume an interrupted or partially-failed session:

wb resume workbench-1

This looks up the session in .workbench/<plan>/status.yaml (the legacy .workbench/status-*.yaml location is still read), finds the original plan, and re-runs every task that is not both done and merged, including tasks that failed and tasks that never started. Each task resumes from its last successfully completed pipeline stage rather than starting the full pipeline over.
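The selection rule (re-run everything that is not both done and merged) can be illustrated with a small filter. The input format here is invented for the example, one task per line as "<task-id> <status> <merged|unmerged>"; it is not the real status.yaml layout, which this guide does not show.

```shell
# Hypothetical filter: print the IDs of tasks a resume would re-run.
# Input (invented format, NOT status.yaml): "<task-id> <status> <merged|unmerged>"
tasks_to_resume() {
  awk '!($2 == "done" && $3 == "merged") { print $1 }'
}

printf '%s\n' \
  'task-1 done merged' \
  'task-2 done unmerged' \
  'task-3 failed unmerged' \
  'task-4 pending unmerged' \
  | tasks_to_resume
# prints task-2, task-3, and task-4, one per line
```

Only task-1 is skipped, because it is the only task that is both done and merged.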

For finer control (wave ranges, directive overrides, selective tasks), pair -b <session> with wb run flags directly:

# Same effect as wb resume, but lets you override flags
wb run plan.md -b workbench-1 --only-incomplete
 
# Continue an existing session from a specific wave
wb run plan.md -b workbench-1 --start-wave 3
 
# Retry only the tasks that crashed
wb run plan.md -b workbench-1 --retry-failed

Retry failed tasks

# Auto-retry tasks that crashed (not those that exhausted fix retries)
wb run plan.md --retry-failed
 
# Stop immediately if any task in a wave fails
wb run plan.md --fail-fast
 
# Combine: retry crashes, then stop if still failing
wb run plan.md --retry-failed --fail-fast

--retry-failed distinguishes between transient failures (agent crash, timeout) and deliberate failures (exhausted all fix cycles). Only transient failures are retried.
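The transient versus deliberate distinction amounts to a simple classification. The sketch below uses made-up failure labels (crash, timeout, fix-cycles-exhausted); the values workbench actually records are not shown in this guide.

```shell
# Hypothetical classifier: succeed (exit 0) when a failure reason
# would be eligible for --retry-failed. The labels are illustrative only.
should_retry() {
  case "$1" in
    crash|timeout) return 0 ;;          # transient: retried
    fix-cycles-exhausted) return 1 ;;   # deliberate: not retried
    *) return 1 ;;                      # unknown reasons: do not retry
  esac
}

for reason in crash timeout fix-cycles-exhausted; do
  if should_retry "$reason"; then
    echo "$reason: retry"
  else
    echo "$reason: skip"
  fi
done
```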

When retrying, workbench resumes each task from its last successfully completed pipeline stage rather than starting the full pipeline over. For example, if a task finished implement and test but crashed during review, the retry picks up at review and skips the stages already done. This reuses the existing worktree and branch from the prior run. If the worktree no longer exists (e.g., after wb clean), workbench falls back to a clean start.

For TDD tasks, workbench reuses the existing branch only when both TDD phases (write-failing-tests and implement) completed in the prior run. A partial implement stage may have left broken commits, so workbench falls back to a clean start to give the TDD implementor a reliable baseline.
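The stage-resume behavior can be sketched as a lookup over an ordered stage list. The three stage names come from the example above; the full pipeline may contain other stages, and resume_stage is a hypothetical helper rather than workbench's implementation.

```shell
# Assumed stage order, taken from the example in the text.
STAGES="implement test review"

# Hypothetical helper: print the stage a retry would start from, given the
# last completed stage. An empty argument means nothing completed, so the
# first stage is printed. Prints nothing when all stages completed or the
# stage name is not in STAGES.
resume_stage() {
  last_done=$1
  take_next=0
  if [ -z "$last_done" ]; then
    take_next=1
  fi
  for stage in $STAGES; do
    if [ "$take_next" -eq 1 ]; then
      printf '%s\n' "$stage"
      return 0
    fi
    if [ "$stage" = "$last_done" ]; then
      take_next=1
    fi
  done
  return 0
}

resume_stage test   # prints review
```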

--only-incomplete reads the plan's status file to determine which tasks already completed. It requires -b to specify the session branch to resume.

Re-run specific tasks

To replay only failing or specific tasks in an existing session, combine -b with --task (see Selective Execution above):

wb run plan.md -b workbench-1 --task task-2
wb run plan.md -b workbench-1 --task task-1 --task task-3

Merging Results

When a plan completes, all changes are on a session branch (e.g. workbench-1). This branch contains the merged results of every task across all waves.

# Merge into your current branch
git merge workbench-1
 
# Or merge into main
git checkout main
git merge workbench-1

Merging interrupted sessions

If a run was interrupted or some merges failed due to conflicts, use wb merge to attempt merging without re-running pipelines:

wb merge -b workbench-1

This scans the session's status file (.workbench/<plan>/status.yaml), finds tasks with status=done that haven't been merged yet, and attempts each merge. Conflicts are handled by a merge resolver agent. Branches that were already merged manually are detected and skipped.

You can specify which agent handles conflict resolution, or run the merge without tmux:

wb merge -b workbench-1 --agent antigravity
wb merge -b workbench-1 --no-tmux

Pushing to origin

Use --push on either wb run or wb merge to push the session branch to origin after all merges complete:

wb run plan.md --push
wb merge -b workbench-1 --push

This sets upstream tracking so subsequent git push commands work without additional flags.

Branching Strategy

When you run wb run plan.md, workbench creates this branch structure:

main (or --base branch)
 └── workbench-N (or --name)         ← session branch
      ├── wb/task-1-short-title       ← worktree branch for task 1
      └── wb/task-2-another-task      ← worktree branch for task 2

Each task gets its own branch and worktree. After a wave completes, successful task branches are merged into the session branch. The next wave branches from the updated session branch.

Session branches are created without upstream tracking by default. Use --push to set upstream tracking when pushing to origin.

Flag	Session branch	Base	Source	Use case
(default)	`workbench-N`	`main`	`origin/main` (fetched)	Start from latest remote
`--name my-feature`	`my-feature`	`main`	`origin/main` (fetched)	Named session branch
`--local`	`workbench-N`	`main`	local `main`	Build on unpushed local work
`--base <branch>`	`workbench-N`	`<branch>`	`origin/<branch>` (fetched)	Branch from a specific remote branch
`--base <branch> --local`	`workbench-N`	`<branch>`	local `<branch>`	Branch from a local feature branch
`-b my-session`	`my-session`	(existing)	(existing)	Resume a previous session
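The Base and Source columns of the table follow one rule: the source is origin/<base> unless --local is given. A minimal sketch of that rule, with a hypothetical helper name and develop as a made-up branch name:

```shell
# Hypothetical helper: which ref a new session branch starts from,
# following the Base and Source columns of the table above.
session_source_ref() {
  base=${1:-main}      # value of --base, defaulting to main
  use_local=${2:-no}   # "yes" when --local is passed
  if [ "$use_local" = "yes" ]; then
    printf '%s\n' "$base"
  else
    printf 'origin/%s\n' "$base"
  fi
}

session_source_ref               # prints origin/main
session_source_ref main yes      # prints main
session_source_ref develop       # prints origin/develop
```

The -b row is not covered here because resuming reuses an existing session branch rather than creating one.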

Final Review and PR Creation

After a session's branches are merged, you can run a whole-branch review that reads the plan, extracts requirements, and evaluates the session branch against them. On VERDICT: PASS, workbench opens a GitHub PR automatically.

See Review & Pull Requests for the dedicated guide to the requirements → review → PR pipeline.

Trigger from `wb run`

wb run plan.md --final-review
wb run plan.md --final-review --push       # push branch and open PR on PASS
wb run plan.md --final-review --skip-pr    # run review but never open a PR

Trigger from `wb merge`

wb merge -b workbench-1 --review
wb merge -b workbench-1 --review --pr-title "Add JWT auth"

Run standalone

wb final-review workbench-1               # run review for a completed session
wb review workbench-1                     # alias
wb final-review workbench-1 --skip-pr     # review without opening a PR

How it works

1. Summarizer: reads the plan and extracts a structured requirements digest (requirements.md).
2. Branch reviewer + fixer loop: the reviewer diffs the session branch against the digest and emits a verdict. On VERDICT: FAIL, a branch fixer applies the findings and the reviewer re-evaluates. This repeats until the reviewer emits VERDICT: PASS or --max-retries (default: 2) is exhausted.
3. PR writer: on VERDICT: PASS, the review branch is merged back into the session branch, then a PR writer agent authors the PR title and body from the actual diff, commit log, and plan context.
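The reviewer and fixer loop described above can be sketched with stub functions. This is an illustration of the control flow only: run_reviewer and run_fixer are stand-ins for the real agents, and counting --max-retries as the number of fixer rounds is an assumption, since the guide does not say exactly what is counted.

```shell
attempt=0   # reviews run so far (used only by the stub reviewer)

# Stub reviewer: sets $verdict. Fails the first review, passes afterwards.
run_reviewer() {
  attempt=$((attempt + 1))
  if [ "$attempt" -ge 2 ]; then
    verdict="VERDICT: PASS"
  else
    verdict="VERDICT: FAIL"
  fi
}

# Stub fixer: a real fixer would apply the reviewer's findings.
run_fixer() {
  fixes_applied=$((fixes_applied + 1))
}

# Review, then alternate fix and re-review until PASS or the retry
# budget (assumed to count fixer rounds) is used up.
review_loop() {
  max_retries=${1:-2}
  fixes_applied=0
  run_reviewer
  while [ "$verdict" != "VERDICT: PASS" ] && [ "$fixes_applied" -lt "$max_retries" ]; do
    run_fixer
    run_reviewer
  done
  [ "$verdict" = "VERDICT: PASS" ]
}

if review_loop 2; then
  echo "PASS after $fixes_applied fix(es)"   # prints PASS after 1 fix(es)
else
  echo "FAIL"
fi
```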

Supply --pr-body-file to skip the PR writer and use a static body instead.

Artifacts are saved to .workbench/<plan-id>/wrap-up/<session>/ (requirements.md, review.md, pr-body.md). wb status shows the latest verdict per session and the PR URL (or review path on fail).

PR override options

Flag	Description
`--pr-title TEXT`	Override the PR title (default: plan H1, then plan id)
`--pr-body-file PATH`	Use this file's content as the PR body (skips PR writer agent)
`--pr-base BRANCH`	Override the PR base branch (default: session's recorded base)
`--skip-pr`	Skip PR creation even on PASS verdict
`--pr-writer-directive TEXT`	Override the PR writer agent's instructions
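The --pr-title default (plan H1, then plan id) can be sketched as follows. This assumes the plan is a Markdown file whose H1 is a line starting with "# "; default_pr_title is a hypothetical helper, not a workbench command.

```shell
# Hypothetical helper: pick a default PR title.
# Assumption: the plan H1 is the first line that starts with "# ".
default_pr_title() {
  plan_file=$1
  plan_id=$2
  title=""
  if [ -f "$plan_file" ]; then
    # Strip the leading "# " from H1 lines and keep the first match.
    title=$(sed -n 's/^# \{1,\}//p' "$plan_file" | head -n 1)
  fi
  if [ -n "$title" ]; then
    printf '%s\n' "$title"
  else
    # No H1 (or no plan file): fall back to the plan id.
    printf '%s\n' "$plan_id"
  fi
}

default_pr_title .workbench/myplan/plan.md myplan
```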

gh must be installed and authenticated for PR creation. If it isn't available, the review still runs and a copy-pasteable gh pr create command is printed instead.
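To check for gh yourself before a run, something like the following works. The fallback command printed here is only a guess at what a copy-pasteable command could look like; the exact text workbench prints is not shown in this guide. The flags used (--base, --head, --title, --body-file) are standard gh pr create options, and the branch names, title, and body file are example values.

```shell
# Hypothetical helper: print a gh pr create command for manual use.
# Titles containing single quotes would need extra escaping.
pr_fallback_command() {
  printf "gh pr create --base '%s' --head '%s' --title '%s' --body-file '%s'\n" \
    "$1" "$2" "$3" "$4"
}

if command -v gh >/dev/null 2>&1; then
  echo "gh is installed"
else
  pr_fallback_command main workbench-1 "Add JWT auth" pr-body.md
fi
```

Being installed is not the same as being authenticated; gh auth status reports whether a login is configured.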

Open a PR without running a review

Use wb pull-request to create a PR directly for a completed session, without running the requirements review. The PR writer agent still authors the title and body from the diff:

wb pull-request myfeature
wb pull-request myfeature -b workbench-3    # if plan has multiple sessions

This is useful when the session branch is already reviewed and merged, or when you want to open a draft PR before running a formal review.

Cleanup

Workbench creates isolated git worktrees for each task. These are usually cleaned up automatically, but if any remain:

wb clean

This removes all workbench-created worktrees and wb/ branches. To also stop any running agent sessions:

wb stop --cleanup