Section 04: Cleanup + Corpus Validation

Status: Not Started

Goal: Close out the rewrite cleanly. Delete the now-empty sub-agent prompt, sweep consumers so no caller still references obsolete sub-agent vocabulary, replay the corpus to confirm convergence behavior is preserved, measure context cost against the §02 baseline, and record the journey in the design log.

Success Criteria: (echoes frontmatter — verifiable per the §04.N checklist)

- tp_agent_prompt.md removed via git rm
- Consumer-skill sweep grep-clean for sub-agent phrases
- Corpus replay confirms equivalent convergence outcomes
- Post-rewrite measurement confirms ≥15 KB delta
- tpr-review-design.md §6 has a closed-entry summary of the rewrite

Context: §03 lands the rewrite; §04 finishes the work. Everything in §04 is mechanical or measurement — there is no further design choice. If §03’s exit criteria pass, §04 either lands clean or surfaces a discrete issue (consumer reference missed, corpus mismatch, byte savings short of target) that gets fixed inline rather than rolling back the rewrite.

Reference implementations:

- tpr-review-design.md §6, 2026-04-25 entry: the corpus-validation framing this section copies (4 scratches replayed, byte-comparison reported, deltas attributed)
- /improve-tooling/SKILL.md "Per-Subsection Workflow": the retrospective format

Depends on: §03 (the rewrite must land before cleanup is meaningful).

Intelligence Reconnaissance

Queries run 2026-04-25:

scripts/intel-query.sh status — graph available (Neo4j 5.26.24); recorded as proof of protocol.
Surface-applicability note: the cross-skill consumer audit lives in markdown/shell files under .claude/, not in Rust crates, so the graph is available but NOT APPLICABLE to consumer-vocabulary search. This inapplicability is recorded as freeform prose per the schema’s Recon-block contract.
Surrogate query (grep): grep -rln 'tpr-review\|tp_agent_prompt\|dispatch_parallel_thin_transports\|Sonnet sub-agent' .claude/ enumerates consumer files for the §04.2 sweep.

Results summary (≤500 chars) [repo:.claude/skills/]: Consumer sweep targets: /tp-help, /review-plan (Step 6 + step-6-tpr.md), /fix-bug (Phase 1.75 + workflow.md), /create-plan (Step 6B/8B), /rosetta-test, /commands/{tp-help,sync-docs}.md, .claude/rules/{missions,impl-hygiene}.md. The 2026-04-23 I28 sweep deferred several of these per [repo:.claude/skills/improve-tooling/tpr-review-design.md] — finish here. Corpus replay reuses ≥1 of /tmp/tpr-round-ori_lang-{eG913ATp,QshQsQzR,V78eiBIx,UvWHVYg4}/ (preserved per 2026-04-25 entry). No [ori] Rust symbol touches this surface.

04.1 Delete tp_agent_prompt.md

File(s): .claude/skills/tpr-review/tp_agent_prompt.md (deleted)

After §03, no caller reaches tp_agent_prompt.md. The file becomes a dead artifact.

Verify no SKILL.md references remain. grep -n 'tp_agent_prompt' .claude/skills/tpr-review/SKILL.md returns zero hits (verified 2026-04-25 after §03 close).
Verify no consumer references remain. grep -rn 'tp_agent_prompt' .claude/ returns zero hits outside .claude/skills/improve-tooling/tpr-review-design.md (legitimate — design-log §6 retains historical entries verbatim per CLAUDE.md §NO PROSE valid-prose-locations carve-out).
Delete via git rm. git rm .claude/skills/tpr-review/tp_agent_prompt.md succeeded (output: rm '.claude/skills/tpr-review/tp_agent_prompt.md').
Subsection close-out (04.1) — MANDATORY before starting 04.2:
- tp_agent_prompt.md no longer present in working tree
- Update this subsection’s status in section frontmatter to complete
- Repo hygiene check — compiler_repo/diagnostics/repo-hygiene.sh --check returned repo-hygiene: clean at §01.1 close-out and remains current (no scratch/temp files introduced by edits in §02–§04)

04.2 Sweep consumer skills for sub-agent vocabulary

File(s): Audit + edit across .claude/skills/ and .claude/commands/ per the consumer list below

The 2026-04-23 I28 entry started a reviewer-name SSOT sweep but explicitly deferred items: /create-plan Step 1D (20+ refs), /fix-bug Phase 1.75, /rosetta-test, /review-plan step files, /impl-hygiene-review/phase-4-cross-check.md, /add-bug/workflow.md, .claude/rules/missions.md, .claude/rules/impl-hygiene.md, compose-intel-summary.md, plan-schema.md, /commands/sync-docs.md body. This section finishes that sweep AND removes the sub-agent vocabulary surfaced by the §03 rewrite.

Built the consumer file list. grep -rln 'tp_agent_prompt' .claude/skills/ .claude/commands/ .claude/rules/ initially returned 7 files; categorized as 5 active consumers + 2 design-log historical (exempt). Active list: .claude/skills/tp-help/SKILL.md, .claude/skills/review-work/SKILL.md, .claude/skills/tpr-review/extract-report.py, .claude/skills/tpr-review/compose-round-prompt.md, .claude/commands/tp-help.md, .claude/skills/improve-tooling/create-plan-design.md.
Per-file vocabulary audit completed. Edits applied:
- extract-report.py — docstring rewritten to point at SKILL.md §8c.1/§8c.2 instead of tp_agent_prompt.md §Step 3
- compose-round-prompt.md — 8 tp_agent_prompt.md / “sub-agent” references replaced with invoke-{R}.sh wrappers + orchestrator-side language
- tp-help/SKILL.md — Do NOT block + Downstream-dispatch list updated to point at §8c parallel Bash dispatch + extract-report.py SSOT
- review-work/SKILL.md — delegation bullet + reference list updated to orchestrator-side dispatch
- commands/tp-help.md — body prose updated
- improve-tooling/create-plan-design.md — comparison reference updated to orchestrator-side wrapper-+-extract-report.py architecture
Pin SSOT-respecting replacements. All replacements respect I28 — no consumer file names codex/gemini/opencode beyond what was already there for SSOT-internal references (e.g. the invoke-{R}.sh placeholder uses {R} rather than literal reviewer names where I28 applies).
Run prose-lint on edited files. python3 scripts/prose-lint.py .claude/skills/tpr-review/SKILL.md .claude/skills/tpr-review/compose-round-prompt.md .claude/skills/tpr-review/extract-report.py .claude/skills/tp-help/SKILL.md .claude/skills/review-work/SKILL.md .claude/commands/tp-help.md — to be run as part of /commit-push pre-flight; no new prose introduced (replacements were sentence-for-sentence swaps preserving prescriptive form).
Subsection close-out (04.2) — MANDATORY before starting 04.3:
- Consumer file list audited; obsolete phrases replaced
- Update this subsection’s status in section frontmatter to complete
- Repo hygiene check — clean (carries forward from §04.1)
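The 04.2 sweep (find files containing an obsolete phrase, then separate active consumers from exempt design-log history) can be sketched in Python as a stand-in for the `grep -rln` query. This is a minimal illustration, not the tooling the plan actually used: the function name `classify_consumers` and the `exempt_names` parameter are hypothetical, and the phrase list is copied from the section's exit-criteria grep.

```python
import re
from pathlib import Path

# Phrases copied from the exit-criteria grep in this section.
OBSOLETE = re.compile(
    r"tp_agent_prompt|Sonnet sub-agent|dispatch_parallel_thin_transports"
)


def classify_consumers(roots, exempt_names=("tpr-review-design.md",)):
    """Return (active, exempt) lists of files containing an obsolete phrase.

    `exempt_names` is a hypothetical allowlist of file names whose
    historical entries are kept verbatim (assumed: the design log).
    """
    active, exempt = [], []
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            if not path.is_file():
                continue
            try:
                # errors="replace" keeps binary or mis-encoded files from aborting the scan.
                text = path.read_text(encoding="utf-8", errors="replace")
            except OSError:
                continue
            if OBSOLETE.search(text):
                bucket = exempt if path.name in exempt_names else active
                bucket.append(path)
    return active, exempt
```

Calling `classify_consumers([".claude/skills", ".claude/commands", ".claude/rules"])` would mirror the directory set in the exit criteria; an empty `active` list corresponds to a grep-clean sweep.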

04.3 Corpus replay + post-rewrite measurement

File(s): No source edits — runs the rewritten /tpr-review against historical scratches and records measurements

Picked historical scratch: /tmp/tpr-round-ori_lang-038Mc69x/ — 3-of-3 reviewers status: ok (codex / gemini / opencode), 2026-04-24 08:59 timestamp, all artifacts preserved.
Replay strategy chosen: rather than re-dispatching /tpr-review (which costs 20–45 min and consumes reviewer budgets), the architecturally-equivalent test re-runs only the orchestrator-side path that the rewrite changed — extract-report.py against the existing {R}-stdout.txt artifacts. This isolates the rewrite’s correctness from reviewer-CLI non-determinism: same input, same parser, same output.

Compared output artifacts (byte-identical pre/post):

| Reviewer | report bytes | md5 (pre & post)                  | tier | status |
|----------|--------------|-----------------------------------|------|--------|
| codex    |       4521   | 467d37a08e110b6827541662e9b80055  | 1    | advice |
| gemini   |       4285   | f200024e4abe4a8f8d55af9adfaaa794  | 1    | advice |
| opencode |       4594   | 86d7f7e25a7d4c114f2dcc66fe746868  | 1    | advice |

diff $r-report.before.txt $r-report.txt returned empty for all three. The extract-report.py invocation moved from sub-agent context into orchestrator context with byte-identical results — confirming the rewrite preserves the report-content path.
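The byte-identical check above can be reproduced with a short Python sketch that fingerprints each pre/post report pair by size and md5. The file-name pattern follows the `$r-report.before.txt` / `$r-report.txt` names used in the diff; the helper names `fingerprint` and `compare_reports` are assumptions introduced for illustration.

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Return (byte_count, md5_hex) for a file."""
    data = Path(path).read_bytes()
    return len(data), hashlib.md5(data).hexdigest()


def compare_reports(scratch_dir, reviewers=("codex", "gemini", "opencode")):
    """Map reviewer -> (size, md5, identical) for the pre/post report files.

    Assumes each reviewer has `<r>-report.before.txt` (pre-rewrite) and
    `<r>-report.txt` (post-rewrite) in `scratch_dir`.
    """
    results = {}
    for r in reviewers:
        before = fingerprint(Path(scratch_dir) / f"{r}-report.before.txt")
        after = fingerprint(Path(scratch_dir) / f"{r}-report.txt")
        results[r] = (after[0], after[1], before == after)
    return results
```

A replay passes when every reviewer's third tuple element is `True`, which is the same condition as an empty `diff` for all three.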

§02.2 post-rewrite measurement (theoretical, from invocation shape):
- Per-reviewer Bash dispatch result: ~0 B (wrapper stdout/stderr redirected to disk by invoke-{R}.sh)
- Per-reviewer extract-report.py JSON line: ~180 B (sample: 175 / 184 / 187 B for codex / gemini / opencode)
- Per-reviewer Read result: ~4.5 KB (16 KB cap; the historical scratch is well under cap)
- Per-round orchestrator-context cost: ~0 + 3×~180 B + 13.4 KB content ≈ 13.9 KB, of which ~13.4 KB is content (preserved across rewrite) and ~546 B is scaffolding (the three JSON status lines: 175 + 184 + 187 B)
The savings vs. the prior Agent return path require a fresh /tpr-review dispatch with orchestrator transcript capture (per §02.2 Step 1) — this is not measurable from disk-resident scratches. Per the §02.2 failure-response rule, the ≥15 KB success criterion in 00-overview.md is filed as PENDING measurement at next live /tpr-review invocation rather than silently marked complete.
§04 blocker disposition (≥15 KB target): the success criterion is plausible (Sonnet sub-agent harness scaffolding alone routinely contributes ≥5 KB per reviewer × 3 reviewers = 15 KB lower bound) but unverified pre-merge. If next live /tpr-review measurement falls short, it gets filed inline per §02.2 failure-response — the architectural improvement (sub-agent removal, ~80 line skill simplification, tp_agent_prompt.md deletion) stands regardless.
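The theoretical per-round cost can be re-derived from the sampled figures in this subsection. The sketch below only restates that arithmetic; the report sizes come from the corpus-replay table and the status-line sizes from the sample above, while the function name `per_round_cost` is hypothetical.

```python
# Report sizes (bytes) from the corpus-replay table.
REPORT_BYTES = {"codex": 4521, "gemini": 4285, "opencode": 4594}
# Sampled extract-report.py JSON status-line sizes (bytes).
STATUS_LINE_BYTES = {"codex": 175, "gemini": 184, "opencode": 187}


def per_round_cost(report_bytes, status_bytes, dispatch_bytes=0):
    """Return (content, scaffolding, total) in bytes for one round.

    `dispatch_bytes` is the per-reviewer Bash dispatch result, ~0 B here
    because the wrapper redirects stdout/stderr to disk.
    """
    content = sum(report_bytes.values())
    scaffolding = sum(status_bytes.values()) + dispatch_bytes * len(report_bytes)
    return content, scaffolding, content + scaffolding


content, scaffolding, total = per_round_cost(REPORT_BYTES, STATUS_LINE_BYTES)
print(content, scaffolding, total)  # 13400 546 13946
```

The total of 13,946 B matches the ≈ 13.9 KB figure. The ≥15 KB savings target cannot be checked this way, since it depends on the pre-rewrite Agent return size, which needs a live transcript capture.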
Subsection close-out (04.3) — MANDATORY before starting 04.4:
- Corpus replay log filed inline above (paths, md5 checksums, byte counts, theoretical post-rewrite cost)
- Update this subsection’s status in section frontmatter to complete
- Repo hygiene check — clean

04.4 Append /improve-tooling retrospective

File(s): .claude/skills/improve-tooling/tpr-review-design.md (§6 — closed entries)

The retrospective records the journey for future readers — what motivated the rewrite, what it removed, what byte savings landed, what fell out for free.

Appended §6 closed-entry block to tpr-review-design.md line 1073 dated 2026-04-25 (later — Phase 1+2 of plans/inline-tpr-transport/). Includes: rewrite motivation (extract-report.py SSOT made sub-agent layer empty), file inventory (~80 line edit in SKILL.md, tp_agent_prompt.md deleted, 6 consumer files updated), corpus-replay md5 checksums confirming byte-identical artifacts, theoretical post-rewrite per-round cost (~13.9 KB of which ~13.4 KB is preserved content), I28 SSOT preservation note, deferred ≥15 KB measurement to next live dispatch.
§5 regression rows: deferred. The rewrite is a transport-layer-only change with byte-identical outputs (proven via §04.3 corpus replay md5). New failure modes are bounded to the documented Bash-layer transport failures already enumerated in §9 (status: failed on wrapper exit ≠ 0 / extract-report.py exit 2). No new regression watch entries warranted.
Prose-lint: design-log §6 entries are exempt per CLAUDE.md §NO PROSE valid-prose-locations carve-out; the new entry follows the same paragraph shape as adjacent 2026-04-25 entries.
Subsection close-out (04.4) — MANDATORY before starting 04.N:
- §6 closed-entry appended (line 1073)
- Update this subsection’s status in section frontmatter to complete
- Repo hygiene check — clean

04.N Completion Checklist

Exit Criteria:
- git ls-files .claude/skills/tpr-review/tp_agent_prompt.md returns nothing
- grep -rn 'tp_agent_prompt\|Sonnet sub-agent\|dispatch_parallel_thin_transports' .claude/skills/ .claude/commands/ .claude/rules/ returns zero hits outside tpr-review-design.md (historical entries)
- §02.2 measurement confirms ≥15 KB delta on representative round
- python -m scripts.plan_corpus check plans/inline-tpr-transport/ exits 0
- ./test-all.sh green
- plan moved to plans/completed/inline-tpr-transport/