Section 03: Findings Report & Write-Back
Status: In Progress
Goal: Design the findings report format and implement the write-back mechanism that auto-fixes safe issues and flags issues requiring human decision. Connect the output to /continue-roadmap so cross-plan conflicts surface during active roadmap work.
Success Criteria:
- Safety taxonomy (`SafetyClass`, `ClassifiedFinding`, `WriteBackContext`) defined and tested
- Findings report format defined and implemented (JSON + markdown + console)
- Frontmatter text patcher operates on raw text (regex), never PyYAML dump/reload
- Auto-fix engine handles safe issues without human intervention, with concurrent-session guards
- Manual-review issues are flagged with clear context and recommended actions
- Integration with `/continue-roadmap` surfaces findings during active work
Context: Sections 01 and 02 produce raw findings (schema violations, DAG conflicts, priority inversions). This section turns those findings into actionable output: a structured report for review, an auto-fix engine for safe corrections, and integration with the existing /continue-roadmap workflow so findings surface at the right time. The distinction between auto-fixable and manual-review issues is critical — auto-fixing frontmatter field renames is safe; auto-resolving goal conflicts between plans is not.
Depends on: Section 02 (DAG Builder) — the report format depends on the classifier output structure.
Architectural decisions:
- PyYAML is read-only. PyYAML `safe_load` is used to PARSE frontmatter. It is NEVER used to WRITE frontmatter back. PyYAML `dump` destroys YAML comments (which are DAG signal per Section 02's HTML_COMMENT_CONVENTION and YAML_COMMENT source kinds), reorders keys, strips trailing whitespace, normalizes quoting style, and flattens multi-line strings. All frontmatter writes go through the targeted text patcher (03.4), which operates on the raw text slice between the `---` fences using line-level regex replacements. This is the ONLY safe write path. See also: `roadmap_scan.py:344` (`yaml.safe_load` for read); the same constraint applies there.
- Safety taxonomy lives here, not in `plan_corpus`. `SafetyClass`, `ClassifiedFinding`, and `classify_safety` are write-back policy. The `plan_corpus` library produces factual `Finding` records (no policy); this section consumes those findings and classifies them for write-back. `plan_corpus` must never import from this module.
- Concurrent-session safety is mandatory. The user runs parallel agent sessions with uncommitted work (see MEMORY.md `feedback_never_destructive_git.md`). Any read-modify-write on plan files must: (a) record a preimage hash at scan time, (b) re-read and hash-compare before write, (c) write to a temp file and `os.replace` atomically. If the file changed between scan and write, refuse to apply and log the conflict.
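The three-step guard in the last decision can be sketched as follows. This is a minimal illustration, not the planned implementation: `sha256_of` and `guarded_write` are hypothetical names, and the temp-file naming simply follows the `path.with_suffix('.tmp')` convention mentioned in 03.4.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the file's current bytes; taken at scan time this is the preimage."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def guarded_write(path: Path, new_text: str, preimage_hash: str) -> bool:
    """Write new_text only if the file is unchanged since the scan.

    Returns True when the write happened, False when it was refused.
    (Hypothetical helper; the plan's real entry point is apply_patch in 03.4.)
    """
    # (b) re-read and hash-compare immediately before writing
    if sha256_of(path) != preimage_hash:
        return False  # another session changed the file; the caller logs the conflict

    # (c) write to a sibling temp file, then swap it in with os.replace
    tmp_path = path.with_suffix(".tmp")
    with open(tmp_path, "w", encoding="utf-8", newline="") as handle:
        handle.write(new_text)
    os.replace(tmp_path, path)
    return True
```

Note that a small window remains between the hash comparison and `os.replace`; the guard narrows the race with concurrent sessions rather than eliminating it.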
03.1 Safety Taxonomy & Data Types
File(s): New module in the verify-roadmap skill (e.g. scripts/verify_roadmap/safety.py or inline in the skill’s write-back logic)
Define the safety taxonomy data types that the report format (03.2) and auto-fix engine (03.3) both consume. This subsection exists to break the circular dependency identified by tp-help: the report format needs ClassifiedFinding to serialize, and the auto-fix engine needs SafetyClass to gate writes — both need the types before either can be implemented.
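As a rough sketch of the shapes involved, the types described in this subsection could look like the following. The `Finding` class here is only a stand-in with a hypothetical subset of fields (the real one is imported from `plan_corpus`), and the enum values are taken from the JSON strings named in 03.2.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from pathlib import Path


@dataclass(frozen=True)
class Finding:
    """Stand-in for plan_corpus.Finding (hypothetical subset of its fields)."""
    id: str
    category: str
    subtype: str


class SafetyClass(Enum):
    SafeFix = "safe_fix"                # applied automatically (with backup + log)
    ExposureReview = "exposure_review"  # surfaced for human review, never auto-applied


@dataclass(frozen=True)
class ClassifiedFinding:
    finding: Finding
    safety_class: SafetyClass
    rationale: str
    resolved_by_sibling: str | None = None  # Finding.id of the fix that resolved this one


@dataclass(frozen=True)
class WriteBackContext:
    has_recent_commits: dict[Path, bool]  # plan directory -> git activity signal


@dataclass(frozen=True)
class PreimageRecord:
    path: Path
    content_hash: str      # sha256 hex digest of the file bytes at scan time
    scan_timestamp: float
```

Keeping `safety_class` off `Finding` itself is what lets `plan_corpus` stay policy-free: only the wrapper carries write-back decisions.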
- Define `SafetyClass(Enum)`: `SafeFix | ExposureReview`, the auto-fix gating tag:
  - `SafeFix` findings are applied automatically (with backup + log)
  - `ExposureReview` findings are surfaced for human review (never auto-applied)
- Define `ClassifiedFinding` dataclass:
  - Fields: `finding: Finding`, `safety_class: SafetyClass`, `rationale: str`
  - Wraps a plain `Finding` (imported from `plan_corpus`; NO `safety_class` on the `Finding` itself per 01.3)
  - Section 03 produces `ClassifiedFinding` records; Sections 01/02 never do
- Define `WriteBackContext` dataclass:
  - Field: `has_recent_commits: dict[Path, bool]`, which maps plan directories to a git activity signal
  - The CLI front-end populates this by running `git log --since=14d -- plans/<name>/` at the edge
  - `plan_corpus` stays pure: grep-verify it contains no `subprocess` or `git` calls
  - `--quick` mode optimization (blind spot #10): `WriteBackContext` construction requires one `git log` subprocess call per plan (O(N) overall). `--quick` mode runs only read-only DAG checks (BLOCKED, DEAD_REFERENCE), which do not need git signals. `--quick` MUST bypass `WriteBackContext` population entirely by passing `context=None` to the report generator. `classify_safety` in `--quick` mode skips classification and marks all findings as ExposureReview (report-only, no auto-fix). This is a correctness optimization, not just performance: `--quick` is a pre-check, not a write-back trigger.
- Define `PreimageRecord` dataclass (concurrent-session guard):
  - Fields: `path: Path`, `content_hash: str`, `scan_timestamp: float`
  - `content_hash` is `hashlib.sha256(path.read_bytes()).hexdigest()`
  - Captured at scan time for every file that might be modified
  - Used by the text patcher (03.4) to detect concurrent modifications before write
- Implement `classify_safety(finding: Finding, context: WriteBackContext | None, frontmatter_data: dict | None = None) -> ClassifiedFinding`:
  - Signature note (TPR-03-001-gemini): the `frontmatter_data` parameter carries the parsed frontmatter dict for the finding's source file. This allows `classify_safety` to inspect sibling fields (e.g., checking whether both `plan:` and `name:` exist for the collision guard) WITHOUT performing I/O, because the dict is pre-parsed by `plan_corpus.parser` at scan time. The function remains pure: `(finding, context, dict) -> ClassifiedFinding`.
  - When `context is None` (`--quick` mode): return `ClassifiedFinding(finding, ExposureReview, "quick mode — no write-back classification")`
  - When `context` is provided: dispatch on `finding.category` + `finding.subtype`:
SCHEMA_VIOLATION subtypes — SafeFix:
- Field rename `plan:` -> `name:`. NOTE: `OverviewSchema` canonically uses `plan:` (see `schemas.py:89-91`); `PlanIndexSchema` canonically uses `name:` (see `schemas.py:39`). Renaming `plan:` to `name:` is ONLY valid on files where the schema expects `name:` but the file has `plan:` instead (i.e., the file is a `PlanIndexSchema` file misusing `plan:`). SafeFix ONLY when:
  - The target file's schema class is `PlanIndexSchema` (the schema that requires `name:`) AND the file has `plan:` instead of `name:`
  - NOT valid on `OverviewSchema` files: those canonically use `plan:` as a required field; renaming it to `name:` would violate the schema
  - Collision guard (blind spot #3): if the file already has BOTH a `plan:` key AND a `name:` key with DIFFERENT values, this is ExposureReview (human must decide which value to keep). The check uses the `frontmatter_data` parameter: `"plan" in frontmatter_data and "name" in frontmatter_data and frontmatter_data["plan"] != frontmatter_data["name"]`. No I/O is needed because the dict is pre-parsed
  - If `plan:` exists and `name:` does not (on a PlanIndexSchema file), SafeFix: rename key preserving value byte-for-byte
  - If both exist with identical values, SafeFix: remove the `plan:` key (redundant)
  - Paired-finding deduplication (TPR-03-002-gemini): When `plan:` is used instead of `name:`, `plan_corpus.schema` emits TWO findings: `UNKNOWN_FIELD: plan` AND `MISSING_REQUIRED_FIELD: name`. The rename SafeFix resolves BOTH. The auto-fix dispatcher (03.3) must deduplicate these: when a `plan:` → `name:` rename is applied, mark the paired `MISSING_REQUIRED_FIELD: name` finding as resolved-by-sibling (do NOT surface it as a separate ExposureReview). Add a `resolved_by_sibling: Finding.id | None` field to `ClassifiedFinding` for this case.
- Removing `reroute: false`: SafeFix (default-equivalent value)
- Adding missing `reviewed: false` default: SafeFix ONLY for `PlanSectionSchema` and `RoadmapSectionSchema`, where `reviewed: bool` is a REQUIRED field with no default (see `schemas.py:62,76`). Workflow behavior guard (blind spot #7): for `PlanIndexSchema`, where `reviewed: bool | None = None` is OPTIONAL, auto-inserting `reviewed: false` is ExposureReview because it triggers the `/continue-roadmap` Step 1.7 unreviewed-plan gate (see `SKILL.md:205-218`). The absence of the field means "no review state" (None), which does NOT trigger the gate; `false` actively triggers it. This is a semantic change, not normalization.
- Adding missing `third_party_review: {status: none, updated: null}`: SafeFix where the field is required by schema (`PlanSectionSchema`, `FixBugSchema`)
SCHEMA_VIOLATION subtypes — ExposureReview:
- `MISSING_REQUIRED_FIELD` when the missing field needs semantic inference from body content (e.g. missing frontmatter entirely; reconstructing canonical frontmatter from body content is semantic inference, not normalization)
STATUS_CONTRADICTION subtypes:
- `PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED`: SafeFix IFF `context.has_recent_commits[plan_dir] == False` (no activity, which supports status=queued); else ExposureReview (recent commits suggest the plan IS actively being worked on but sections are stale; needs human)
- `FM_DECLARED_VS_BODY_DERIVED`: ALWAYS ExposureReview (blind spot #4). The normalizer (`normalizer.py:155-159`) returns `derived="complete"` when `has_complete_marker is True` even when `unchecked > 0` (aspirational COMPLETE marker with remaining work). Auto-fixing status to `complete` based on this derivation is WRONG: it would mark plans as complete when they have unchecked checkboxes. The normalizer intentionally returns "complete" to trigger the `FM_DECLARED_VS_BODY_DERIVED` finding; the finding itself is the signal that human review is needed, not that auto-fix should proceed. The auto-fix engine MUST NOT override the ExposureReview classification for this subtype.
- `PLAN_COMPLETE_WITH_OPEN_SECTIONS`: ExposureReview (semantic decision: complete open sections or downgrade plan status)
- All other `STATUS_CONTRADICTION` subtypes: ExposureReview by default (conservative)
DEAD_REFERENCE subtypes — SafeFix (frontmatter only):
- `PLAN_DIRECTORY_NOT_FOUND` / `SECTION_FILE_NOT_FOUND` / `CROSS_PLAN_NAME_NOT_FOUND` when the dead reference is in a `depends_on` frontmatter list entry (mechanical removal from a YAML list). Prose body references are ALWAYS ExposureReview (human-authored replacement may be needed)
- `SPEC_FILE_NOT_FOUND`: ExposureReview (NOT SafeFix). The `spec:` field lives on `RoadmapSectionSchema` (`schemas.py:81`) and references spec file paths. A dead spec reference may indicate a spec file was renamed or reorganized, so the correct target needs human determination. Unlike `depends_on` entries, where removal is mechanical, a missing spec file may need a replacement path, not deletion.
- Audit trail guard (blind spot #8): the dead-reference removal audit trail goes to `build/verify-roadmap/fixes-applied.json`, NOT into inline HTML comments. An inline `<!-- Removed dead reference to plans/X/ -->` comment would be re-scanned by Section 02's HTML_COMMENT_CONVENTION parser and could produce false positive MISSING_DEPENDENCY findings in future runs (see 03.3). The `fixes-applied.json` log is the audit trail.
All other categories:
- `PARSE_ERROR`, `DAG_CONFLICT`, `ITEM_VERIFICATION`, `GAP`: ExposureReview by default (conservative; never auto-applied). The default branch MUST record the rationale `"no SafeFix rule declared for <category>/<subtype>"`
- Each `ClassifiedFinding` carries a `rationale` string explaining why it got its class
- Pure function of `(finding, context, frontmatter_data)`: no I/O inside `classify_safety` itself
- Tests (TDD, write before implementation):
  - Matrix: every `(FindingCategory, FindingSubtype)` pair in `types.py:_CATEGORY_SUBTYPES` must have a test case asserting its safety classification
  - Semantic pins: `FM_DECLARED_VS_BODY_DERIVED` -> ExposureReview (pin: revert to SafeFix -> test fails)
  - Semantic pins: `PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED` with `has_recent_commits=True` -> ExposureReview
  - Semantic pins: `PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED` with `has_recent_commits=False` -> SafeFix
  - Negative pins: `classify_safety` with `context=None` MUST return ExposureReview for every finding
  - Collision guard pin: `plan:` -> `name:` rename when both keys exist with different values -> ExposureReview
  - Collision guard pin: `plan:` -> `name:` rename when both keys exist with same values -> SafeFix (remove `plan:`)
  - Workflow behavior pin: `reviewed: false` insertion on PlanIndexSchema -> ExposureReview
  - Workflow behavior pin: `reviewed: false` insertion on PlanSectionSchema -> SafeFix
- Subsection close-out (03.1), MANDATORY before starting 03.2:
  - All tasks above are `[x]` and types + classify_safety tested
  - Update this subsection's `status` in section frontmatter to `complete`
  - Run `/improve-tooling` retrospectively on THIS subsection
  - Run `/sync-claude` on THIS subsection: check whether changes invalidated any CLAUDE.md, `.claude/rules/*.md`, or `canon.md` claims. If no changes, document briefly. Fix any drift NOW.
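The dispatch rules of 03.1 can be illustrated with a reduced sketch that covers only three rules plus the conservative default. Everything here is an assumption made for illustration: the stand-in `Finding` carries hypothetical `plan_dir`, `target`, and `schema_class` helper fields, categories are plain strings rather than the `plan_corpus` enums, and the rationale strings are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class SafetyClass(Enum):
    SafeFix = "safe_fix"
    ExposureReview = "exposure_review"


@dataclass(frozen=True)
class Finding:
    # Hypothetical stand-in; the real Finding lives in plan_corpus.
    category: str
    subtype: str
    plan_dir: Path
    target: str = ""        # assumed: the offending field name
    schema_class: str = ""  # assumed: e.g. "PlanIndexSchema"


@dataclass(frozen=True)
class ClassifiedFinding:
    finding: Finding
    safety_class: SafetyClass
    rationale: str


@dataclass(frozen=True)
class WriteBackContext:
    has_recent_commits: dict[Path, bool]


def classify_safety(finding, context, frontmatter_data=None) -> ClassifiedFinding:
    """Pure function: no I/O, only the three inputs decide the class."""
    review, safe = SafetyClass.ExposureReview, SafetyClass.SafeFix
    if context is None:  # --quick mode: report-only
        return ClassifiedFinding(finding, review, "quick mode - no write-back classification")

    key = (finding.category, finding.subtype)
    fm = frontmatter_data or {}

    if key == ("STATUS_CONTRADICTION", "FM_DECLARED_VS_BODY_DERIVED"):
        return ClassifiedFinding(finding, review, "derived status may be aspirational")

    if key == ("STATUS_CONTRADICTION", "PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED"):
        # A missing signal is treated as "recent activity" to stay conservative.
        if context.has_recent_commits.get(finding.plan_dir, True):
            return ClassifiedFinding(finding, review, "recent commits; needs a human")
        return ClassifiedFinding(finding, safe, "no recent commits; status: queued")

    if key == ("SCHEMA_VIOLATION", "UNKNOWN_FIELD") and finding.target == "plan":
        if finding.schema_class != "PlanIndexSchema":
            return ClassifiedFinding(finding, review, "rename only applies to PlanIndexSchema")
        if "name" in fm and fm.get("plan") != fm["name"]:
            return ClassifiedFinding(finding, review, "plan: and name: disagree")
        return ClassifiedFinding(finding, safe, "rename plan: or drop it as redundant")

    return ClassifiedFinding(
        finding, review, f"no SafeFix rule declared for {finding.category}/{finding.subtype}"
    )
```

The ordering matters only in that the `context is None` check comes first, which is what makes the "quick mode returns ExposureReview for every finding" pin hold regardless of category.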
03.2 Report Format
File(s): Report generation integrated into the verify-roadmap skill pipeline
Design and implement the findings report format. The report must be both human-readable (markdown) and machine-parseable (JSON) for downstream tool integration. This subsection CONSUMES the types defined in 03.1.
- Import the finding data model from `plan_corpus` (01.3 SSOT; do NOT redefine here):
  - `Finding` = `{id, category, subtype, severity, source, source_line, source_column, target, target_line, description, recommended_fix, evidence, dependency_chain, source_kind}`
  - `FindingCategory` and `FindingSubtype` enums are imported (see Section 01.3 for the complete taxonomy)
  - `Finding.to_json()` / `Finding.to_markdown()` are used as-is; Section 03 only wraps them into a report
- Import `ClassifiedFinding` and `SafetyClass` from 03.1 (local to this section's module; NOT from `plan_corpus`). The report serializes `ClassifiedFinding` records: each entry includes the finding data PLUS the safety classification, rationale, and sibling resolution state.
- Implement JSON report output:
  - Array of `ClassifiedFinding` objects: each has `finding` (the `Finding.to_json()` dict), `safety_class` (`"safe_fix"` or `"exposure_review"`), `rationale` (string), `resolved_by_sibling` (Finding.id string or null; non-null when this finding was resolved as a side-effect of fixing a paired finding, e.g., `MISSING_REQUIRED_FIELD: name` resolved by the `plan:` → `name:` rename)
  - Written to `build/verify-roadmap/findings.json` (build directory, not committed)
  - Include metadata header: timestamp, corpus size, classifier versions, mode (`--full`/`--quick`)
  - When mode is `--quick`, omit `safety_class` and `rationale` fields (classification was not performed)
- Implement markdown report output:
  - Grouped by severity (critical first, then high, medium, low)
  - Within each severity, grouped by safety class (ExposureReview first, then SafeFix)
  - Within each group, sorted by classifier type
  - Each finding shows: type badge, source -> target, description, recommended fix, safety classification
  - Summary table at top: count by type and severity, count by safety class
  - Written to `build/verify-roadmap/findings.md` (build directory, not committed)
- Implement console summary output:
  - One-line-per-finding format for terminal display
  - Color-coded by severity (if terminal supports it)
  - SafeFix findings marked with `[auto]` prefix; ExposureReview with `[review]`; unapplied fixes marked with `[UNAPPLIED]` (concurrent-modification refusal from `PatchResult(applied=False)`)
  - Exit code reflects findings: 0 = clean, 1 = findings present, 2 = critical findings
- Unapplied-fix report surface (TPR-03-003-codex / TPR-03-002-gemini): The report format must surface `PatchResult(applied=False)` results from the auto-fix engine as a distinct group in both JSON and markdown output. In JSON: add an `unapplied_fixes` array alongside the main findings array. In markdown: add an "Unapplied Fixes" section after the main findings, grouped by reason (concurrent modification, malformed file, etc.). These are NOT dropped: they represent intended work that could not safely complete.
- Tests (TDD):
  - Round-trip test: `ClassifiedFinding` -> JSON -> parse -> verify all fields preserved
  - Markdown grouping test: verify severity ordering, safety class ordering
  - Exit code test: 0 for empty findings, 1 for low/medium, 2 for critical
  - `--quick` mode test: verify JSON output omits safety_class/rationale
  - Unapplied-fix surface test: verify that `PatchResult(applied=False)` entries appear in the `unapplied_fixes` group in both JSON and markdown reports (not silently dropped)
- Subsection close-out (03.2), MANDATORY before starting 03.3:
  - All tasks above are `[x]` and report generates correctly on current corpus
  - Update this subsection's `status` in section frontmatter to `complete`
  - Run `/improve-tooling` retrospectively on THIS subsection
  - Run `/sync-claude` on THIS subsection: check whether changes invalidated any CLAUDE.md, `.claude/rules/*.md`, or `canon.md` claims. If no changes, document briefly. Fix any drift NOW.
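One possible shape for the JSON report and the exit-code rule from 03.2 is sketched below. The top-level key names `metadata` and `findings` are assumptions (the section only fixes `unapplied_fixes` and the per-entry fields), entries are plain dicts standing in for serialized `ClassifiedFinding` records, and the classifier-versions metadata is left out for brevity.

```python
import json
from datetime import datetime, timezone


def build_report(entries, mode, corpus_size, unapplied=()):
    """Assemble the JSON-serializable report dict (illustrative layout only)."""
    quick = mode == "--quick"
    findings = []
    for entry in entries:
        row = {
            "finding": entry["finding"],
            "resolved_by_sibling": entry.get("resolved_by_sibling"),
        }
        if not quick:
            # --quick performs no classification, so these fields are omitted.
            row["safety_class"] = entry["safety_class"]
            row["rationale"] = entry["rationale"]
        findings.append(row)
    return {
        "metadata": {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "corpus_size": corpus_size,
            "mode": mode,
        },
        "findings": findings,
        "unapplied_fixes": list(unapplied),  # never silently dropped
    }


def exit_code(findings):
    """0 = clean, 1 = findings present, 2 = critical findings."""
    severities = {row["finding"]["severity"] for row in findings}
    if not severities:
        return 0
    return 2 if "critical" in severities else 1


def to_json(report) -> str:
    return json.dumps(report, indent=2, sort_keys=True)
```

Because every value in the report is a plain dict, list, string, number, or null, the round-trip test reduces to comparing `json.loads(to_json(report))` with the original dict.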
03.3 Auto-Fix Engine
File(s): Auto-fix logic integrated into verification pipeline
Implement automatic fixes for findings classified as SafeFix by 03.1’s classify_safety. Safety criterion: a fix is auto-fixable if it cannot change plan semantics — only metadata normalization.
- Implement auto-fix dispatcher:
  - Input: list of `ClassifiedFinding` records
  - Filter to `safety_class == SafeFix` only
  - For each SafeFix finding, dispatch to the appropriate fix handler based on `finding.category` + `finding.subtype`
  - All fixes go through the text patcher (03.4); the auto-fix engine NEVER writes files directly
- Implement auto-fix for SCHEMA_VIOLATION SafeFix findings:
  - Field rename `plan:` -> `name:` (via text patcher: regex replace `^plan:` with `name:` in frontmatter slice; preserving value byte-for-byte)
  - Field removal: `reroute: false` -> remove entire line from frontmatter slice
  - Default field insertion: add `reviewed: false` via text patcher (insert line in frontmatter slice), only for PlanSectionSchema/RoadmapSectionSchema files (see 03.1 workflow behavior guard)
  - Default field insertion: add `third_party_review:` block, only for schemas where required
- Implement auto-fix for STATUS_CONTRADICTION SafeFix findings:
  - `PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED` (when classified SafeFix by 03.1): change `status: active` to `status: queued` in frontmatter via text patcher
  - NOTE: `FM_DECLARED_VS_BODY_DERIVED` is NEVER SafeFix (see 03.1). The auto-fix engine MUST assert that no `FM_DECLARED_VS_BODY_DERIVED` finding reaches the SafeFix dispatch; this is a defense-in-depth invariant. If it fires, the classifier has a bug.
  - `parallel: true` guard (from Section 01.2): `parallel: true` is a VALID canonical `PlanIndexSchema` field. Auto-fix MUST NOT remove it. Verify no fix handler touches fields outside its explicit scope.
- Implement auto-fix for DEAD_REFERENCE SafeFix findings:
  - Remove dead `depends_on` entries from frontmatter list via text patcher
  - Audit trail in `fixes-applied.json` only (blind spot #8): do NOT add inline HTML comments like `<!-- Removed dead reference to plans/X/ -->`. Section 02's HTML_COMMENT_CONVENTION parser scans for `blocked-by`, `unblocks`, `supersedes`, `resolves` patterns in HTML comments. While a "Removed dead reference" comment does not match those verbs today, any future verb expansion or fuzzy matching would produce false positive MISSING_DEPENDENCY findings. The `fixes-applied.json` log is the permanent audit trail.
  - Do NOT auto-remove references from prose body text (might need human-authored replacement)
- Implement safe-fix guards:
  - All auto-fixes create a backup of the original file in `build/verify-roadmap/backups/`
  - All auto-fixes are logged to `build/verify-roadmap/fixes-applied.json` with: finding ID, file path, fix type, before/after snippet, timestamp
  - `--dry-run` flag shows what would be fixed without modifying files
  - `--no-auto-fix` flag disables auto-fixing entirely (report-only mode)
  - Defense-in-depth: auto-fix engine MUST reject any finding that is not `SafeFix`. This is a hard assert, not a silent skip. If an `ExposureReview` finding leaks into the auto-fix path, it is a classifier bug and must fail loudly.
  - Concurrent-modification propagation (TPR-03-003-codex / TPR-03-002-gemini): when `apply_patch` returns `PatchResult(applied=False)` (preimage hash mismatch from concurrent session), the auto-fix dispatcher MUST convert the original `SafeFix` finding into an `ExposureReview` finding with the failure reason appended to the rationale (e.g., `"SafeFix reverted to ExposureReview: file modified by concurrent session"`) and append it to the final report as an unapplied fix. The report format (03.2) must surface these as a distinct "unapplied fixes" group: they represent work the tool intended to do but could not safely complete. They MUST NOT be silently dropped.
- Define manual-review flagging for non-auto-fixable findings:
  - CONFLICT findings: always manual; requires human decision on which plan's goals take precedence
  - SUPERSEDED findings: always manual; requires acknowledgment that a reroute claim is stale or completion of the reroute. §02 handoff note (TPR-03-005-codex): Section 02 defines a git-aware SUPERSEDED specialization with two structural cases (`section-02-dag-builder.md:251-252`). `classify_safety` deliberately routes ALL SUPERSEDED findings to ExposureReview (never SafeFix) because SUPERSEDED resolution is inherently semantic: the user must decide whether the reroute claim is valid, stale, or in progress. `WriteBackContext.has_recent_commits` is available for future SafeFix graduation if a narrow, safe subcase is identified (e.g., "SUPERSEDED by a plan with `status: resolved`"), but no such subcase is implemented in this section. This is an explicit design decision, not an omission.
  - BLOCKED findings: always manual; requires plan reordering or dependency acknowledgment
  - MISSING_DEPENDENCY findings: always manual; requires explicit dependency declaration or acknowledgment of independence
  - All ExposureReview-classified findings: surfaced in the report with context and recommended actions
- Tests (TDD):
  - Semantic pin: `FM_DECLARED_VS_BODY_DERIVED` reaching auto-fix dispatcher -> assert/panic (defense-in-depth)
  - Semantic pin: `parallel: true` field untouched by any fix handler
  - Matrix: each SafeFix subtype has a test case verifying the correct text transformation
  - Negative pin: ExposureReview finding passed to auto-fix dispatcher -> rejected
  - Backup test: verify backup file created before modification
  - Dry-run test: verify no file modifications in dry-run mode
  - Idempotency test: running auto-fix twice on the same corpus produces identical results
- Subsection close-out (03.3), MANDATORY before starting 03.4:
  - All tasks above are `[x]` and auto-fix engine tested on known cases
  - Update this subsection's `status` in section frontmatter to `complete`
  - Run `/improve-tooling` retrospectively on THIS subsection
  - Run `/sync-claude` on THIS subsection: check whether changes invalidated any CLAUDE.md, `.claude/rules/*.md`, or `canon.md` claims. If no changes, document briefly. Fix any drift NOW.
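The guard behaviour of 03.3 (filter, hard rejection, and conversion of refused patches into review items) might be sketched like this. The `Classified` record, the string-valued safety class, and the subtype names used as handler keys are simplifications invented for the example; real handlers would go through the text patcher.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Classified:
    # Simplified stand-in for ClassifiedFinding (hypothetical field layout).
    finding_id: str
    subtype: str
    safety_class: str  # "safe_fix" or "exposure_review"
    rationale: str


@dataclass(frozen=True)
class PatchResult:
    applied: bool
    reason: str = ""


Handler = Callable[[Classified], PatchResult]


def apply_one(item: Classified, handlers: dict[str, Handler]) -> Classified:
    # Defense-in-depth: raise explicitly so the check survives `python -O`.
    if item.safety_class != "safe_fix":
        raise AssertionError(f"{item.finding_id}: non-SafeFix finding reached auto-fix")
    if item.subtype == "FM_DECLARED_VS_BODY_DERIVED":
        raise AssertionError(f"{item.finding_id}: FM_DECLARED_VS_BODY_DERIVED is never SafeFix")

    result = handlers[item.subtype](item)
    if result.applied:
        return item
    # Refused patch: downgrade to review and keep the reason in the rationale.
    return replace(
        item,
        safety_class="exposure_review",
        rationale=f"{item.rationale}; SafeFix reverted to ExposureReview: {result.reason}",
    )


def dispatch_fixes(classified, handlers):
    """Return (applied, unapplied); review-class items are filtered out up front."""
    applied, unapplied = [], []
    for item in classified:
        if item.safety_class != "safe_fix":
            continue
        outcome = apply_one(item, handlers)
        (applied if outcome.safety_class == "safe_fix" else unapplied).append(outcome)
    return applied, unapplied
```

The filter in `dispatch_fixes` and the raise in `apply_one` are deliberately redundant: the first is normal control flow, the second catches a caller that bypasses the filter.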
03.4 Frontmatter Text Patcher
File(s): New module for targeted text-level frontmatter manipulation
This subsection implements the ONLY write path for frontmatter modifications. PyYAML is read-only; all writes go through targeted text patching on the raw frontmatter slice. This subsection also implements the concurrent-session safety guards.
Rationale (blind spot #1): PyYAML safe_load parses YAML into Python dicts (losing comments, key order, quoting style, trailing whitespace). If we were to modify the dict and yaml.dump it back, every comment in the frontmatter — including YAML comments that are DAG signal for Section 02’s YAML_COMMENT source kind — would be destroyed. Additionally, key ordering changes produce noisy git diffs. The text patcher operates on the raw text between the --- fences, using line-level regex replacements that preserve everything the fix does not explicitly target.
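To make the comment-preservation point concrete, here is a minimal sketch of two line-level operations on a frontmatter slice. It is not the planned module: `remove_line` is a hypothetical single-line simplification of `remove_key` (no multi-line value handling), the patterns use `[ \t]*` rather than `\s*` so they cannot run across a newline, and the sample frontmatter is made up.

```python
import re


def rename_key(fm_text: str, old_key: str, new_key: str) -> str:
    """Rename a top-level key, leaving value, spacing, and inline comment untouched."""
    pattern = re.compile(rf"^{re.escape(old_key)}([ \t]*:.*)$", re.MULTILINE)
    # A callable replacement avoids backslash interpretation in new_key.
    return pattern.sub(lambda m: new_key + m.group(1), fm_text, count=1)


def remove_line(fm_text: str, key: str) -> str:
    """Remove a single-line `key: value` entry, including its line ending."""
    pattern = re.compile(rf"^{re.escape(key)}[ \t]*:.*(?:\n|$)", re.MULTILINE)
    return pattern.sub("", fm_text, count=1)


if __name__ == "__main__":
    fm = (
        "plan: demo  # keep me\n"
        "# blocked-by: other-plan\n"
        "reroute: false\n"
        "status: active\n"
    )
    patched = remove_line(rename_key(fm, "plan", "name"), "reroute")
    print(patched)
```

Only the targeted key and the removed line differ in the result; the inline comment, the standalone comment line, and the key order are carried over byte-for-byte, which a parse-and-dump cycle would not guarantee.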
- Implement `extract_frontmatter_slice(text: str) -> tuple[str, int, int]`:
  - Returns `(frontmatter_text, start_offset, end_offset)`: the raw text between `---` fences (exclusive of fences)
  - Uses the same boundary detection as `plan_corpus.parser.split_frontmatter_strict` (exact fence regex from `types.py:FRONTMATTER_FENCE`). Note: the actual API name is `split_frontmatter_strict`, NOT `split_frontmatter`
  - Returns empty/zero on malformed files (no fences); the caller handles this case
- Implement per-fix-type text operations (all operate on the frontmatter slice string):
  - `rename_key(fm_text: str, old_key: str, new_key: str) -> str`: regex `^{old_key}(\s*:.*)$` -> `{new_key}\1` (preserves value, spacing, inline comments)
  - `remove_key(fm_text: str, key: str) -> str`: remove the entire line matching `^{key}\s*:.*$` (handles multi-line values by tracking indent)
  - `replace_value(fm_text: str, key: str, new_value: str) -> str`: regex `^({key}\s*:\s*).*$` -> `\1{new_value}` (preserves key formatting)
  - `insert_key(fm_text: str, key: str, value: str, after_key: str | None) -> str`: insert `{key}: {value}` on a new line after `after_key` (or at end of frontmatter if `after_key` is None)
  - `remove_list_item(fm_text: str, list_key: str, item_value: str) -> str`: remove a single `- "value"` entry from a YAML list under `list_key`, handling both inline `[a, b]` and block `- a\n- b` list styles
- Implement `apply_patch(path: Path, fm_operations: list[FmOperation], preimage: PreimageRecord) -> PatchResult`:
  - `FmOperation` = `(operation_type, **kwargs)` matching the per-fix-type operations above
  - Concurrent-session guard (blind spot #6):
    - Re-read `path` and compute `sha256(content)`
    - Compare against `preimage.content_hash`
    - If hashes differ: refuse to write, return `PatchResult(applied=False, reason="file modified since scan by concurrent session")`
    - If hashes match: apply all operations to the frontmatter slice, reassemble the full text, write it to a temp file (`path.with_suffix('.tmp')`), then `os.replace` the temp file over `path` for atomicity
  - Returns `PatchResult(applied: bool, reason: str, before_hash: str, after_hash: str)`
- Implement `reassemble_file(original_text: str, patched_fm: str, start_offset: int, end_offset: int) -> str`:
  - Splice the patched frontmatter back into the original text at the correct offsets
  - Preserve everything before `start_offset` and after `end_offset` (including the `---` fences)
- Shadow parser note (blind spot #5): `roadmap_scan.py` (1462 lines) has its own `split_frontmatter`, `parse_section_file`, `parse_index_file` (~600 lines of parsing logic). This is LEAK:algorithmic-duplication with `plan_corpus`. The text patcher MUST NOT introduce a third frontmatter parser. It uses `plan_corpus.types.FRONTMATTER_FENCE` for boundary detection. The full `roadmap_scan.py` parser refactoring to import `plan_corpus` is tracked separately (it is a prerequisite for `--quick` mode correctness in 03.5, since `/continue-roadmap` and `/verify-roadmap --quick` must agree on corpus parse results). Migration tracked as concrete `- [ ]` in §05.3 (L187: “roadmap_scan.py shadow parser migration”) with `<!-- unblocks:03.5 -->`.
- Tests (TDD):
  - Semantic pin: `rename_key` preserves YAML comments on the same line (`name: foo # this is important`)
  - Semantic pin: `rename_key` preserves YAML comments on adjacent lines
  - Semantic pin: `remove_key` handles multi-line YAML values (indented continuation lines)
  - Negative pin: `apply_patch` refuses write when preimage hash mismatches (concurrent modification)
  - Negative pin: `apply_patch` refuses write on malformed files (no frontmatter fences)
  - Atomicity test: interrupt during write -> original file intact (temp file may remain)
  - Round-trip test (TPR-03-003-gemini): `extract -> modify -> reassemble -> parse with plan_corpus.parser` (NOT `yaml.safe_load`) produces expected YAML dict. The strict parser must accept the patched output, not just a lenient YAML loader
  - Key ordering test: unmodified keys retain their original order after patch
  - Comment preservation test: YAML comments (`# ...`) and inline comments survive all operations
  - `remove_list_item` test: both inline `[a, b]` and block `- a\n- b` list styles handled
  - Collision guard integration test: `plan:` exists and `name:` exists with different values -> ExposureReview classification -> patcher never invoked
- Subsection close-out (03.4), MANDATORY before starting 03.5:
  - All tasks above are `[x]` and text patcher tested with comment-preserving round-trips
  - Update this subsection's `status` in section frontmatter to `complete`
  - Run `/improve-tooling` retrospectively on THIS subsection
  - Run `/sync-claude` on THIS subsection: check whether changes invalidated any CLAUDE.md, `.claude/rules/*.md`, or `canon.md` claims. If no changes, document briefly. Fix any drift NOW.
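The slice-and-splice pair from 03.4 can be sketched as below. The `FENCE` pattern is an assumed stand-in: the plan requires reusing `plan_corpus.types.FRONTMATTER_FENCE`, whose exact regex is not shown in this section, so the real boundary rules may differ.

```python
import re

# Assumed stand-in for plan_corpus.types.FRONTMATTER_FENCE (real pattern may differ).
FENCE = re.compile(r"^---[ \t]*(?:\r?\n|\Z)", re.MULTILINE)


def extract_frontmatter_slice(text: str) -> tuple[str, int, int]:
    """Return (frontmatter_text, start_offset, end_offset), fences excluded.

    Returns ("", 0, 0) when the file has no opening or closing fence.
    """
    opening = FENCE.match(text)  # the opening fence must be the very first line
    if opening is None:
        return "", 0, 0
    closing = FENCE.search(text, opening.end())
    if closing is None:
        return "", 0, 0
    start, end = opening.end(), closing.start()
    return text[start:end], start, end


def reassemble_file(original_text: str, patched_fm: str,
                    start_offset: int, end_offset: int) -> str:
    """Splice the patched slice back; everything outside the offsets is untouched."""
    return original_text[:start_offset] + patched_fm + original_text[end_offset:]
```

Because the body and both fences are copied from the original text by offset, a later `---` line in the body (a horizontal rule, for instance) is never mistaken for part of the frontmatter.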
03.5 Continue-Roadmap Integration
File(s): .claude/skills/verify-roadmap/SKILL.md, integration with roadmap-scan.sh
Integrate the findings report with /continue-roadmap so cross-plan conflicts surface during active roadmap work, not only during explicit /verify-roadmap runs.
- Add a lightweight cross-plan check to roadmap-scan.sh:
  - Before `/continue-roadmap` selects the next section to work on, run a fast subset of the DAG analysis
  - Check whether the selected section has BLOCKED or DEAD_REFERENCE findings (the two classifiers included in `--quick` mode; NOT CONFLICT, which requires O(N^2) shared-subsystem analysis)
  - If findings exist, display them before proceeding and let the user decide whether to continue or switch to resolving the finding
- Design the integration interface and resolve the scope contradiction (blind spot #9):
  - The verify-roadmap skill exposes a `--quick` mode that runs ONLY `BLOCKED` and `DEAD_REFERENCE` checks (fast, no shared-subsystem analysis, no git signal population per 03.1)
  - Explicitly NOT included in `--quick`: CONFLICT (requires shared-subsystem analysis which is O(N^2)), STATUS_CONTRADICTION (requires body scanning), SUPERSEDED (requires reroute resolution), MISSING_DEPENDENCY (requires full prose scan)
  - The full mode (`--full`) runs all classifiers from Sections 01-02, runs `classify_safety` with full `WriteBackContext`, and applies auto-fixes
  - `/continue-roadmap` calls `--quick` mode as a pre-check; users invoke `--full` explicitly
  - `--quick` mode MUST NOT build WriteBackContext (blind spot #10): quick mode only runs read-only DAG checks. It skips git signal population entirely (no `git log` subprocess calls). It passes `context=None` to `classify_safety` (see 03.1), which returns ExposureReview for all findings. Report is generated in report-only mode (no auto-fix).
- Document the integration in SKILL.md:
  - How `/continue-roadmap` uses the quick check
  - When to run `/verify-roadmap --full` manually (after plan changes, before major milestones)
  - How to interpret and act on findings
  - Explicit list of what `--quick` checks vs what `--full` checks (no ambiguity)
- Shadow parser migration (blind spot #5, TPR-03-001-gemini mandate): `roadmap_scan.py` has ~600 lines of parsing logic (`split_frontmatter`, `parse_section_file`, `parse_index_file`) that duplicates `plan_corpus`. `--quick` mode MUST use `plan_corpus` for parsing: two diverging corpus truths is a LEAK:algorithmic-duplication that violates SSOT-2. Mandated approach (Option A): refactor `roadmap_scan.py` to import `plan_corpus.load_and_validate` as the sole parsing entrypoint (per Section 01's SSOT boundary, downstream consumers MUST NOT call `split_frontmatter_strict` directly), keeping only the `/continue-roadmap`-specific logic (section selection, focus plan, health signals). This eliminates the `errors="replace"` + `{}` on YAMLError swallowed-error pattern (`roadmap_scan.py:327-348`) that Section 01 was designed to prevent. Option B (shadow parser divergence) is explicitly rejected: it would allow the known LEAK to survive with no committed follow-up, violating R-2 and R-3. The migration is tracked as a concrete `- [ ]` in Section 05.
- Tests (TDD):
  - Integration test: `/verify-roadmap --quick` returns findings for a corpus with a known BLOCKED finding
  - Negative test: `/verify-roadmap --quick` does NOT return CONFLICT findings (not in `--quick` scope)
  - Performance test: `--quick` mode completes in < 5 seconds on the full corpus (no git log calls)
  - Semantic pin: `--quick` mode with `context=None` -> all findings classified as ExposureReview
- Subsection close-out (03.5), MANDATORY before marking section complete:
  - All tasks above are `[x]` and integration tested
  - Update this subsection’s `status` in section frontmatter to `complete`
  - Run `/improve-tooling` retrospectively on THIS subsection
  - Run `/sync-claude` on THIS subsection: check whether changes invalidated any CLAUDE.md, `.claude/rules/*.md`, or `canon.md` claims. If no changes, document briefly. Fix any drift NOW.
  - Repo hygiene check: run `diagnostics/repo-hygiene.sh --check` and clean any detected temp files.
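The quick-mode contract above (no `WriteBackContext`, therefore everything goes to manual review) can be sketched as follows. This is a minimal illustration, not the real classifier: the actual `classify_safety` takes a finding object and pre-parsed frontmatter, whereas the flat `finding_id` / `source` / `fixable` parameters, the `dirty_files` field, and the rationale strings here are assumptions made to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class SafetyClass(Enum):
    SAFE_FIX = "SafeFix"
    EXPOSURE_REVIEW = "ExposureReview"


@dataclass(frozen=True)
class WriteBackContext:
    # Hypothetical stand-in for the git-derived signals built only in --full mode.
    dirty_files: FrozenSet[str]


@dataclass(frozen=True)
class ClassifiedFinding:
    finding_id: str
    safety: SafetyClass
    rationale: str


def classify_safety(finding_id: str, source: str, fixable: bool,
                    context: Optional[WriteBackContext] = None) -> ClassifiedFinding:
    # Quick mode passes context=None. Without write-back signals nothing can be
    # shown to be safe, so every finding is routed to manual review.
    if context is None:
        return ClassifiedFinding(finding_id, SafetyClass.EXPOSURE_REVIEW,
                                 "no WriteBackContext (quick mode)")
    # Full mode: only mechanical fixes on files with no pending changes qualify.
    if fixable and source not in context.dirty_files:
        return ClassifiedFinding(finding_id, SafetyClass.SAFE_FIX,
                                 "mechanical fix on a clean file")
    return ClassifiedFinding(finding_id, SafetyClass.EXPOSURE_REVIEW,
                             "requires human decision")
```

Making `None` the default keeps the conservative path as the one a caller gets by accident; a SafeFix requires the caller to build and pass a context explicitly.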
03.R Third Party Review Findings
- `[TPR-03-001-codex][high]` `section-03:107`: Align schema-driven SafeFix rules with the schema SSOT. OverviewSchema uses `plan:` canonically, not `name:`. SPEC_FILE_NOT_FOUND needs its own handling. Resolved: Fixed on 2026-04-14. Corrected SafeFix table: `plan:` → `name:` rename restricted to PlanIndexSchema files only; OverviewSchema explicitly excluded. SPEC_FILE_NOT_FOUND reclassified to ExposureReview.
- `[TPR-03-002-codex][medium]` `section-03:330`: Make quick-mode continue-roadmap contract consistent. BLOCKED vs CONFLICT scope contradiction. Resolved: Fixed on 2026-04-14. Integration bullets now consistently specify BLOCKED + DEAD_REFERENCE only for `--quick` mode.
- `[TPR-03-003-codex][medium]` `section-03:286`: Propagate unapplied patch results into the report. Resolved: Fixed on 2026-04-14. Added concurrent-modification propagation to auto-fix guards (SafeFix→ExposureReview on hash mismatch), unapplied-fix report surface in 03.2, and test pin.
- `[TPR-03-004-codex][medium]` `section-03:348`: Replace roadmap_scan migration placeholder with concrete checkbox. Resolved: Fixed on 2026-04-14. Option B rejected. Concrete `- [ ]` added to Section 05.3 mandating Option A migration.
- `[TPR-03-005-codex][medium]` `section-03:240`: Either consume Section 02 SUPERSEDED handoff or remove it. Resolved: Fixed on 2026-04-14. Added explicit design decision: all SUPERSEDED → ExposureReview; WriteBackContext available for future SafeFix graduation.
- `[TPR-03-001-gemini][high]` `section-03:278`: Remove Option B and mandate shadow parser migration. Resolved: Fixed on 2026-04-14. Same fix as TPR-03-004-codex: Option B removed, Option A mandated, Section 05.3 item added.
- `[TPR-03-002-gemini][high]` `section-03:192`: Propagate concurrent modification failures to findings report. Resolved: Fixed on 2026-04-14. Same fix as TPR-03-003-codex: auto-fix dispatcher converts to ExposureReview, report format surfaces unapplied fixes.
- `[TPR-03-003-gemini][medium]` `section-03:253`: Explicitly require plan_corpus.parser in round-trip test. Resolved: Fixed on 2026-04-14. Test description updated to specify plan_corpus.parser, not yaml.safe_load.
Round 2 findings (iteration 2, 2026-04-14):
- `[TPR-03-001-codex-r2][medium]` `section-05:187`: Replace roadmap_scan migration with real plan_corpus API surface. References to nonexistent `parse_section_file` / `parse_index_file`. Resolved: Fixed on 2026-04-14. Updated §03.4 and §05.3 to use actual API: `read_text_strict`, `split_frontmatter_strict`, `load_and_validate`.
- `[TPR-03-002-codex-r2][high]` `section-02:251`: Align §02 SUPERSEDED handoff with §03’s all-ExposureReview decision. §02 still said “to route SafeFix vs ExposureReview.” Resolved: Fixed on 2026-04-14. Updated §02 handoff text: git_status enrichment is advisory/reporting, not SafeFix routing. All SUPERSEDED → ExposureReview.
- `[TPR-03-001-gemini-r2][high]` `section-03:68`: classify_safety needs parsed frontmatter for collision guard purity. Resolved: Fixed on 2026-04-14. Added `frontmatter_data: dict | None = None` parameter to `classify_safety` signature. Pre-parsed at scan time; no I/O inside classifier.
- `[TPR-03-002-gemini-r2][medium]` `section-03:71`: Paired UNKNOWN_FIELD/MISSING_REQUIRED_FIELD deduplication for plan:→name: rename. Resolved: Fixed on 2026-04-14. Added paired-finding deduplication with `resolved_by_sibling` field on ClassifiedFinding.
- `[TPR-03-003-gemini-r2][medium]` `section-05:46`: `--quick` mode must include Phase 5 for report generation. Resolved: Fixed on 2026-04-14. Updated §05.1: `--quick` runs Phases 1-3 and 5 (report-only, no auto-fix). Phase 4 skipped.
Round 3 findings (iteration 3, 2026-04-14):
- `[TPR-03-001-codex-r3][high]` `section-05:68`: Point §05 phase wiring at real plan_corpus entrypoints (`python -m scripts.plan_corpus`, not the legacy single-file `.py` path). Resolved: Fixed on 2026-04-14. Updated Phase 1/2 entrypoints to actual package API.
- `[TPR-03-002-codex-r3][high]` `section-05:113`: Realign §05 validation cases (a)/(g) with live route A/B behavior. Current corpus = MISSING_DEPENDENCY, not BLOCKED. Resolved: Fixed on 2026-04-14. Updated both test cases to expect MISSING_DEPENDENCY (route B), with note about route A migration.
- `[TPR-03-003-codex-r3][medium]` `section-05:187`: Route roadmap_scan migration through load_and_validate, not low-level split_frontmatter_strict. Resolved: Fixed on 2026-04-14. Updated migration item to use load_and_validate as sole entrypoint per §01 SSOT boundary.
- `[TPR-03-004-codex-r3][medium]` `section-05:178`: Undefined `--check` mode; replaced with `--full --no-auto-fix`. Resolved: Fixed on 2026-04-14. Changed verification step to use existing `--full --no-auto-fix` mode.
- `[TPR-03-005-codex-r3][medium]` `section-03:114`: Carry resolved_by_sibling through the report contract. Resolved: Fixed on 2026-04-14. Updated §03.2 JSON spec to include resolved_by_sibling field.
Round 4 findings (iteration 4, 2026-04-15 — close-out dual-source TPR):
- `[TPR-03-001-codex-r4][high]` `scripts/verify_roadmap/patcher.py:306`: Refuse patch writes that escape the reviewed plan corpus. Resolved: Fixed on 2026-04-15. Added required `corpus_root: Path` parameter to `apply_patch()`. The target path is resolved and checked with `relative_to()`; the patcher refuses with `PatchResult(applied=False)` when the path escapes. 3 negative-pin tests added in `test_patcher.py::TestApplyPatchPathEscape`. Propagated to `apply_fixes()` + `PatcherFn` type + all test call sites. Agreement: [TPR-03-001-gemini-r4] (same fix resolves both)
- `[TPR-03-001-gemini-r4][medium]` `scripts/verify_roadmap/patcher.py:166`: Missing path escape check in concurrent-session guards. Resolved: Fixed on 2026-04-15. Same fix as [TPR-03-001-codex-r4] (agreement). Agreement: [TPR-03-001-codex-r4] (same fix resolves both)
- `[TPR-03-002-codex-r4][medium]` `scripts/verify_roadmap/safety.py:286`: Implement sibling dedup for plan-to-name rename findings. Resolved: Fixed on 2026-04-15. Created `scripts/verify_roadmap/pairing.py` (separate from safety.py per 500-line BLOAT rule). `pair_resolved_by_sibling()` detects UNKNOWN_FIELD(plan) SafeFix rename + MISSING_REQUIRED_FIELD(name) on same PlanIndex file and marks the dependent half with `resolved_by_sibling=<rename_id>`. Wired into `quick.py` after classify_safety list comp. 9 tests in `test_pairing.py` (3 positive + 6 negative pins). Exported from `__init__.py`. Agreement: [TPR-03-002-gemini-r4] (same pairing concern; both layers addressed)
- `[TPR-03-002-gemini-r4][medium]` `scripts/verify_roadmap/auto_fix.py:202`: Auto-fix engine does not skip resolved_by_sibling findings. Resolved: Fixed on 2026-04-15. Added `if cf.resolved_by_sibling is not None: continue` guard in `build_fix_plans()`. Regression test `test_skips_resolved_by_sibling` added in `test_auto_fix.py::TestBuildFixPlans`. Agreement: [TPR-03-002-codex-r4] (same pairing concern; both layers addressed)
- `[TPR-03-003-codex-r4][low]` `.claude/skills/continue-roadmap/roadmap_scan.py:1482`: Log verify-quick degradation failures without requiring trace mode. Resolved: Fixed on 2026-04-15. Changed exception handler from `trace(...)` (no-op without `--trace`) to `sys.stderr.write(f"[verify-quick] degradation: ...")` (unconditional stderr). Import-failure and banner-ordering integration tests are tracked for test_quick.py but deferred to §04/§05 implementation scope (the integration point is in `roadmap_scan.py`, which is outside §03’s owned modules).
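The path-escape refusal described in the round 4 findings can be sketched with `pathlib` alone. This is a minimal sketch, not the project's `apply_patch()`: the helper name `path_escape_refusal`, the `reason` field, and the example paths are hypothetical; only `corpus_root`, `relative_to()`, and `PatchResult(applied=False)` come from the text above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class PatchResult:
    applied: bool
    reason: str = ""  # hypothetical field, used here to explain the refusal


def path_escape_refusal(target: Path, corpus_root: Path) -> Optional[PatchResult]:
    """Return a refusal if target resolves outside corpus_root, else None."""
    # resolve() collapses ".." segments and follows symlinks, so the check is
    # made on the real location rather than on the spelling of the path.
    resolved_root = corpus_root.resolve()
    resolved_target = target.resolve()
    try:
        resolved_target.relative_to(resolved_root)
    except ValueError:
        return PatchResult(applied=False,
                           reason=f"{target} escapes corpus root {corpus_root}")
    return None
```

A caller would run this guard before any hashing or writing and return the refusal unchanged, so an escaped path surfaces in the report like any other unapplied fix.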
Round 4 iteration 2 findings (2026-04-15 — re-review after round-4 fixes):
- `[TPR-03-001-codex-r4i2][low]` `scripts/verify_roadmap/pairing.py:98`: Stop using the rationale string as pairing state. Evidence: `pair_resolved_by_sibling` identified the rename sibling via `other.rationale.startswith(_RENAME_RATIONALE_PREFIX)`. Prose rationale is produced in `safety.py:300`; a wording edit would silently break pairing. Impact: No structural source of truth for “this is the rename half”; the coupling runs through fragile prose. Resolved: Fixed on 2026-04-15. Added `PAIRING_TAG_PLAN_TO_NAME_RENAME` constant + `pairing_tag: str | None` field on `ClassifiedFinding`. Classifier sets the tag on the rename-case SafeFix. Pairing function matches `other.pairing_tag == PAIRING_TAG_PLAN_TO_NAME_RENAME` instead of rationale string. All 9 test_pairing.py tests updated to pass the tag. Basis: direct_file_inspection. Confidence: high.
Round 4 iteration 3 findings (2026-04-15 — re-review after pairing_tag fix):
- `[TPR-03-001-codex-r4i3][medium]` `scripts/verify_roadmap/auto_fix.py:102`: Replace rationale-string dispatch for unknown-field fixes with structural state. Evidence: `_dispatch_unknown_field()` still chose between REMOVE_KEY and RENAME_KEY by inspecting `cf.rationale` for prose fragments. Same fragile-coupling pattern just fixed in pairing.py. Resolved: Fixed on 2026-04-15. Replaced rationale-based dispatch with `if cf.pairing_tag == PAIRING_TAG_PLAN_TO_NAME_RENAME:` structural check. Imported the constant; updated test fixture `_safe_fix()` to accept pairing_tag. Basis: direct_file_inspection. Confidence: high.
- `[TPR-03-002-codex-r4i3][low]` `tests/plan-audit/test_pairing.py:77`: Add integration pin for classify_safety emitting the pairing tag. Evidence: Pairing tests hand-construct the tag; safety tests stop at asserting SafeFix. No test exercises the real classifier→pairing handoff. Resolved: Fixed on 2026-04-15. Added `test_unknown_field_plan_key_rename_emits_pairing_tag` in test_safety.py, which calls `classify_safety()` directly and asserts `result.pairing_tag == PAIRING_TAG_PLAN_TO_NAME_RENAME`. 289 tests now pass. Basis: direct_file_inspection. Confidence: high.
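The tag-based pairing described in the two iterations above might look roughly like this. It is a sketch under stated assumptions: the string value of `PAIRING_TAG_PLAN_TO_NAME_RENAME`, the flat `ClassifiedFinding` shape, and the `fixable()` helper are hypothetical, and field identity is simplified to a `target_key` attribute on the classified finding.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional

# Constant name comes from the findings above; the value is an assumption.
PAIRING_TAG_PLAN_TO_NAME_RENAME = "plan-to-name-rename"


@dataclass(frozen=True)
class ClassifiedFinding:
    finding_id: str
    source: str
    subtype: str
    rationale: str
    target_key: Optional[str] = None
    pairing_tag: Optional[str] = None
    resolved_by_sibling: Optional[str] = None


def pair_resolved_by_sibling(findings: List[ClassifiedFinding]) -> List[ClassifiedFinding]:
    # Locate the rename half of each pair by its structural tag, never by prose.
    rename_ids: Dict[str, str] = {
        f.source: f.finding_id
        for f in findings
        if f.pairing_tag == PAIRING_TAG_PLAN_TO_NAME_RENAME
    }
    paired = []
    for f in findings:
        rename_id = rename_ids.get(f.source)
        if rename_id and f.subtype == "MISSING_REQUIRED_FIELD" and f.target_key == "name":
            # The rename on the same file will supply `name`, so mark this half.
            f = replace(f, resolved_by_sibling=rename_id)
        paired.append(f)
    return paired


def fixable(findings: List[ClassifiedFinding]) -> List[ClassifiedFinding]:
    # Same idea as the build_fix_plans() guard: the dependent half needs no fix.
    return [f for f in findings if f.resolved_by_sibling is None]
```

Because the match never reads `rationale`, rewording that prose cannot change which findings are paired.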
Round 4 iteration 4 findings (2026-04-15 — structural target_key):
- `[TPR-03-001-codex-r4i4][medium]` `scripts/verify_roadmap/pairing.py:87`: Carry schema field identity structurally instead of parsing Finding.description. Evidence: Downstream flow still parsed prose to identify which field a finding refers to. `pair_resolved_by_sibling()` checked `"name" in f.description.lower()`; `_dispatch_unknown_field()` checked `"plan" in desc`; `_classify_missing_required_field()` checked `"reviewed" in desc`. Resolved: Fixed on 2026-04-15. Added `target_key: str | None = None` to `Finding` dataclass in `plan_corpus/types.py`. `_check_required_fields()` and `_check_unknown_fields()` in `schema.py` now populate it with the actual key name. All downstream modules (`safety.py`, `auto_fix.py`, `pairing.py`) now dispatch on `finding.target_key`, with zero prose parsing for field identity. Agreement: [TPR-03-001-gemini-r4i4] (same systemic issue, different angle)
- `[TPR-03-001-gemini-r4i4][high]` `scripts/verify_roadmap/auto_fix.py:99`: Remove prose-string fragility across auto_fix, pairing, and safety modules. Resolved: Fixed on 2026-04-15. Same fix as [TPR-03-001-codex-r4i4]: structural `target_key` field eliminates all prose-based field dispatch. Agreement: [TPR-03-001-codex-r4i4]
- `[TPR-03-002-gemini-r4i4][high]` `scripts/verify_roadmap/auto_fix.py:165`: Remove fragile string splitting for dead reference extraction. Evidence: `_dispatch_dead_reference` parses `f.description` via `rsplit(":", 1)[1].strip()` to extract the dead reference value. Comment notes this is “best-effort.” Requires structural value passing from the upstream DAG validator, which crosses into `plan_corpus/dag.py` where the dead-reference findings are constructed. Impact: Prose-string fragility; any description format change breaks auto-fix. Resolved: Fixed on 2026-04-15 during §03 close-out. Scope estimate was overstated: the fix was ~30 lines across `plan_corpus/types.py` (new `Finding.target_value: str | None` field + id-hash + to_json), `plan_corpus/docgen.py` (2 DEAD_REFERENCE sites + `_find_section_file` signature), `plan_corpus/dag.py` (4 DEAD_REFERENCE construction sites), and `verify_roadmap/auto_fix.py` (structural read + defense-in-depth panic on missing target_value). 7 matrix regression tests added (`test_auto_fix.py::TestBuildFixPlanDeadReference`) covering positive pin, cross-plan-dep preservation, embedded-colon value preservation, description-format-change independence, None-panics defense-in-depth, and non-depends_on short-circuit, plus 1 dag-side pin in `test_dag_classifiers.py` verifying every `classify_dead_reference` finding carries `target_value == evidence[0]`. 616 plan-audit tests pass. §05:187 anchor item marked done.
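The difference between the prose-parsing read and the structural read of a dead reference can be shown in a few lines. This is an illustrative sketch only: the `Finding` shape is reduced to three fields, the description text and the value `plan:03` are invented to demonstrate the embedded-colon case, and the two function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


class AutoFixError(RuntimeError):
    """Raised when a finding lacks the structural data a fix needs."""


@dataclass(frozen=True)
class Finding:
    subtype: str
    description: str
    target_value: Optional[str] = None


def dead_reference_from_prose(f: Finding) -> str:
    # Old approach: best-effort split of the human-readable description.
    # It truncates any value that itself contains a colon.
    return f.description.rsplit(":", 1)[1].strip()


def dead_reference_structural(f: Finding) -> str:
    # New approach: read the value the producer attached, and fail loudly
    # (defense in depth) if a producer forgot to populate it.
    if f.target_value is None:
        raise AutoFixError("DEAD_REFERENCE finding without target_value")
    return f.target_value


finding = Finding(
    subtype="DEAD_REFERENCE",
    description="depends_on entry does not resolve: plan:03",
    target_value="plan:03",
)
print(dead_reference_from_prose(finding))   # 03
print(dead_reference_structural(finding))   # plan:03
```

The structural read also stays correct if the description is reworded, which is the independence property the regression matrix pins.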
Round 5 findings (2026-04-15 — section close-out dual-source review):
Resolution summary: all 14 round-5 findings (13 actionable + 1 informational) fixed on 2026-04-15 in commit 0bfd9e93 (“fix(verify-roadmap): apply 13 TPR-03 round-5 findings to auto-fix engine”). 15 regression tests added across test_patcher.py, test_auto_fix.py, test_plan_corpus.py, test_dag_precedence.py, and test_safety.py to pin each fix. All 493 plan-audit tests pass.
- `[TPR-03-001-codex][high]` `scripts/verify_roadmap/patcher.py:170`: Fix block-valued `after_key` insertion. Evidence: `insert_key()` inserts immediately after the anchor line matched by `^after_key:`. `auto_fix.py:128-133` uses `after_key="sections"` for the `third_party_review` SafeFix, but `sections` is normally a block list. Replaying the live function on a normal section frontmatter produced `sections:` followed by `third_party_review:` and then the original `- id:` entries, which `split_frontmatter_strict()` rejects as invalid YAML. Impact: A routine SafeFix can corrupt section frontmatter instead of normalizing it, which violates the text-patcher safety contract and turns a missing-field cleanup into a broken plan file. Required plan update: Make `insert_key()` understand block-valued anchors: when `after_key` owns an indented list or mapping, insert after the whole block rather than after the header line. Add a regression test that inserts `third_party_review` after a populated `sections:` list and reparses the result through `split_frontmatter_strict()`. Basis: fresh_verification. Confidence: high. Agreement: [TPR-03-001-gemini] (both reviewers flagged this; codex cited line 170, gemini cited line 105; same root cause)
- `[TPR-03-001-gemini][high]` `scripts/verify_roadmap/patcher.py:105`: `insert_key` corrupts YAML when `after_key` has a multiline block value (e.g., `sections`). Evidence: When `after_key="sections"`, `insert_key` matches the single line `^sections:.*\n` and inserts `third_party_review` immediately after it. Because `sections` is a block sequence (list), its indented items (`- id: ...`) are pushed down and become incorrectly associated with the newly inserted `third_party_review` key. This produces invalid YAML syntax (mixing mapping and sequence items at the same indentation level). Required plan update: Update `insert_key` to skip indented lines following the `after_key` match before inserting the new key, similar to the logic already used in `remove_key`. Basis: direct_file_inspection. Confidence: high. Agreement: [TPR-03-001-codex] (both reviewers flagged this location/title)
- `[TPR-03-002-codex][high]` `scripts/plan_corpus/types.py:260`: Hash `target_key` into `Finding.id` when present. Evidence: `Finding.id` hashes only `category`, `subtype`, `source`, `source_line`, plus conditional `source_column` and `target`. It ignores the new `target_key` field entirely. Two live `MISSING_REQUIRED_FIELD` findings on the same file with `target_key='name'` and `target_key='full_name'` both produced the same id (`VR-9e2667`). Impact: Distinct schema findings alias each other in reports, pairing, and any downstream bookkeeping keyed by `Finding.id`. That directly undercuts the structural-field migration because the new discriminator exists but does not stabilize identity. Required plan update: Extend `Finding.id` to append `target_key` when non-null, preserving backward compatibility the same way `source_column` and `target` are handled. Add regression pins for same-file same-subtype findings that differ only by `target_key`. Basis: fresh_verification. Confidence: high.
- `[TPR-03-003-codex][high]` `scripts/verify_roadmap/patcher.py:348`: Close the hash-check-to-replace race window. Evidence: `apply_patch()` hashes the file once, builds `new_bytes`, writes a temp file, and then unconditionally calls `os.replace()` at line 406. There is no second identity check or lock between the preimage comparison and the final replace. A concurrent edit that lands after the hash check but before `os.replace()` will be silently overwritten. Impact: The documented concurrent-session guarantee is not actually met under a real overlapping write: the patcher can still clobber another session’s newer contents even though the file changed between scan and write. Required plan update: Add a second pre-replace guard or locking/CAS-equivalent around the destination path, then add a race regression test that mutates the file after the initial hash check and verifies refusal rather than overwrite. Basis: fresh_verification. Confidence: high.
- `[TPR-03-002-gemini][high]` `scripts/verify_roadmap/auto_fix.py:214`: Auto-fix engine self-collides on multiple findings for the same file. Evidence: `apply_fixes` iterates through `classifieds` individually. If multiple findings target the same file, the first finding successfully patches the file and updates its content hash on disk. The second finding reads the original `preimage` from the unmodified `preimages` dictionary, which now mismatches the file’s new content hash. The second finding fails the concurrent-session guard and surfaces as an unapplied fix. Required plan update: Update the `preimages` dictionary (or a local tracker) with `patch_result.after_hash` upon a successful patch, so subsequent fixes for the same file in the same batch use the updated hash and succeed. Basis: direct_file_inspection. Confidence: high.
- `[TPR-03-003-gemini][high]` `scripts/verify_roadmap/patcher.py:152`: `remove_list_item` fails to locate block-style lists if the key line contains an inline comment. Evidence: `key_pattern = re.compile(rf"^{re.escape(list_key)}\s*:\s*$")` strictly expects nothing but whitespace after the colon. If the frontmatter has `depends_on: # comment`, the pattern fails to match, `in_list` never becomes True, and the item is not removed. Required plan update: Relax `key_pattern` to allow inline comments, e.g., `re.compile(rf"^{re.escape(list_key)}\s*:")`, ensuring consistency with `remove_key`. Basis: direct_file_inspection. Confidence: high.
- `[TPR-03-004-codex][medium]` `scripts/verify_roadmap/patcher.py:311`: Carry the original finding id through patch failures. Evidence: The real patcher hardcodes `finding_id = "VR-patch"` for every returned `PatchResult`. A live `apply_patch()` refusal returned `PatchResult(..., finding_id='VR-patch', ...)`. `report.py` surfaces `PatchResult.finding_id` directly, so real unapplied-fix output cannot identify which original finding failed. Impact: When concurrent-modification or malformed-frontmatter refusals happen, the report loses the link back to the originating finding. That makes manual follow-up and audit correlation materially harder. Required plan update: Propagate the real finding id into the patcher boundary, or have `apply_fixes()` overwrite the returned `PatchResult.finding_id` with `plan.finding_id` before reporting. Add an end-to-end test that exercises the real patcher path and asserts the original finding id survives into `unapplied_fixes`. Basis: fresh_verification. Confidence: high.
- `[TPR-03-005-codex][medium]` `scripts/verify_roadmap/auto_fix.py:377`: Demote refused SafeFixes back to ExposureReview. Evidence: When `PatchResult(applied=False)` comes back, `apply_fixes()` only appends the patch result to `unapplied_results`. It never emits the plan-required demoted `ClassifiedFinding` with an updated rationale, and `FixApplyResult` has no bucket for that state. The current result therefore keeps the original finding only in `planned_findings` and loses the promised manual-review reclassification. Impact: The tool surfaces a bare refusal record instead of a concrete follow-up finding that reviewers can triage in the normal findings stream. That is weaker than the Section 03 contract for failed SafeFixes. Required plan update: Add a demoted-findings bucket to `FixApplyResult` or otherwise thread failed SafeFixes back into report generation as `ExposureReview` findings with the refusal reason appended to the rationale. Extend report tests to assert both the demoted finding and the `unapplied_fixes` entry are present. Basis: direct_file_inspection. Confidence: high.
- `[TPR-03-006-codex][medium]` `scripts/verify_roadmap/safety.py:252`: Implement the promised `reroute: false` SafeFix. Evidence: `schema.py:376-383` emits `SCHEMA_VIOLATION/CROSS_FIELD_INVARIANT` for `reroute: false`, and the owning Section 03 plan still marks removal of that default-equivalent field as implemented. But `classify_safety()` routes all remaining schema-violation subtypes to ExposureReview, and `auto_fix.py` has no handler for removing `reroute`. Impact: One of the explicitly promised frontmatter normalizations is still manual-only. The close-out status overstates the implemented SafeFix surface. Required plan update: Add a concrete SafeFix rule for the `reroute: false` invariant on the appropriate schema class, wire it to a `REMOVE_KEY` operation, and add classifier plus dispatcher tests for that path. Basis: fresh_verification. Confidence: high.
- `[TPR-03-007-codex][medium]` `scripts/plan_corpus/types.py:270`: Serialize `target_key` in machine-readable findings. Evidence: The new `target_key` field exists on `Finding`, and `schema.py` now populates it, but `Finding.to_json()` does not include it. Live `render_json()` output for a schema finding likewise omitted `target_key`, so JSON reports and `DagReport.to_json()` still drop the structural field entirely. Impact: Downstream consumers cannot rely on the structured key identity that Section 03 added to eliminate prose parsing. The report remains less machine-parseable than the migration intends. Required plan update: Add `target_key` to `Finding.to_json()`, update any JSON documentation that describes the finding shape, and add report/plan-corpus tests that assert the field survives serialization. Basis: fresh_verification. Confidence: high.
- `[TPR-03-008-codex][medium]` `scripts/verify_roadmap/report.py:132`: Complete the report metadata and target/type rendering. Evidence: JSON metadata currently contains timestamp, mode, and counts, but no classifier-version information even though Section 03 requires it. The human renderers also omit type/target context: `_render_md_finding()` only shows source plus description, and `render_console()` prints the same. A live rendering of a finding with a populated `target` showed no `source -> target` context in either markdown or console output. Impact: The report is less actionable for humans and less auditable for tooling than the owning section promises. Reviewers cannot see the target or classifier type at a glance, and JSON metadata does not carry versioning context for downstream consumers. Required plan update: Add classifier-version metadata to JSON output and include category/subtype plus `source -> target` context in markdown and console renderers. Extend report tests to pin those fields. Basis: fresh_verification. Confidence: high.
- `[TPR-03-004-gemini][medium]` `scripts/plan_corpus/dag.py:246`: Field dropping in manual Finding reconstruction. Evidence: Functions `enrich_resolve_dep_finding`, `apply_precedence`, and `apply_source_kind_severity` manually reconstruct `Finding` objects when modifying a single field. They fail to pass `target_key=finding.target_key` to the constructor, meaning the new structural dispatch field is silently dropped when findings pass through the DAG pipeline. Required plan update: Use `dataclasses.replace(finding, ...)` to safely copy findings without dropping fields when the `Finding` schema is extended. Basis: inference. Confidence: unknown.
- `[TPR-03-009-codex][low]` `scripts/verify_roadmap/auto_fix.py:381`: Record before/after snippets in the auto-fix audit log. Evidence: Section 03.3 promises `fixes-applied.json` entries with finding id, file path, fix type, before/after snippet, and timestamp. The current audit entries capture operations, hashes, backup path, and timestamp, but they never record any snippet of the text that changed. Impact: The audit log is weaker than specified for forensic review: understanding what changed requires opening the backup or re-running the patch logic instead of reading the log entry itself. Required plan update: Capture bounded before/after frontmatter snippets for each applied fix and add a regression test that asserts the audit log contains them. Basis: direct_file_inspection. Confidence: high.
- `[TPR-03-005-gemini][informational]` `scripts/plan_corpus/schema.py:124`: `target_key` is not populated for nested schema violations. Evidence: When emitting `MISSING_REQUIRED_FIELD` or `UNKNOWN_FIELD` for nested fields inside `sections` lists, `target_key` is omitted. While safe because `auto_fix.py` only handles top-level keys currently, it leaves nested fields reliant on prose parsing if auto-fixes are ever extended to them. Required plan update: Consider populating `target_key` with a path string (e.g., `sections[].status`) for nested violations to fully eliminate prose parsing in future auto-fix extensions. Basis: inference. Confidence: unknown.
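The block-valued anchor problem flagged by both reviewers in this round can be illustrated with a small line-based inserter. This is a sketch of the technique, not the project's `insert_key()`: the signature, the `KeyError` on a missing anchor, and the sample frontmatter (including the value `none`) are assumptions. It treats indented lines and `-` items as part of the anchor's block and does not try to handle comment lines inside a block.

```python
import re
from typing import List


def insert_key(frontmatter: str, after_key: str, new_line: str) -> str:
    """Insert new_line after after_key and any block (list/mapping) it owns."""
    lines: List[str] = frontmatter.split("\n")
    anchor = re.compile(rf"^{re.escape(after_key)}\s*:")
    for i, line in enumerate(lines):
        if not anchor.match(line):
            continue
        end = i + 1  # default: insert right after a scalar-valued anchor
        j = i + 1
        while j < len(lines):
            if lines[j].startswith((" ", "\t", "-")):
                end = j + 1      # block content pushes the insertion point down
            elif lines[j].strip():
                break            # next top-level key: the block is over
            j += 1               # blank lines are skipped without moving `end`
        return "\n".join(lines[:end] + [new_line] + lines[end:])
    raise KeyError(after_key)


fm = "\n".join([
    "title: x",
    "sections:",
    '  - id: "01"',
    "    status: done",
    '  - id: "02"',
    "status: in-progress",
])
print(insert_key(fm, "sections", "third_party_review: none"))
```

With the sample input, the new key lands between the last `- id:` entry and `status:`, so the list stays attached to `sections:`.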
Round 5 iteration 2 findings (2026-04-15 — re-review after round-1 fixes):
- `[TPR-03-001-codex-r5i2][medium]` `tests/plan-audit/test_safety.py:377`: Exercise the reroute SafeFix through the real producer and dispatcher. Evidence: The new reroute coverage only hand-constructs `Finding(..., target_key="reroute")` and asserts `classify_safety()` returns `SAFE_FIX`. It never proves that real schema validation produces that `target_key` (`scripts/plan_corpus/schema.py:377-385`), and it never reaches the newly added dispatcher that must emit `REMOVE_KEY("reroute")` (`scripts/verify_roadmap/auto_fix.py:145-160`). If `_validate_plan_index()` stopped populating `target_key` or `_dispatch_cross_field_invariant()` regressed to return `()`, the current 493-test suite would still pass. Impact: TPR-03-006-codex is not actually pinned end-to-end. The promised `reroute: false` auto-fix can regress in the real validate -> classify -> build-fix path without any failing regression test. Required plan update: Add an end-to-end regression test that starts from real plan-index frontmatter containing `reroute: false`, asserts the emitted finding carries `target_key="reroute"`, then runs `classify_safety()` and `build_fix_plan()` (or dry-run `apply_fixes`) and checks that the planned operation is `REMOVE_KEY` for `reroute`. Basis: fresh_verification. Confidence: high. Resolved: Fixed on 2026-04-15. Added `test_reroute_false_end_to_end_validate_classify_dispatch` in `test_safety.py`: it starts from real plan-index frontmatter with `reroute: false`, asserts `_validate_plan_index` emits `target_key="reroute"`, then runs `classify_safety()` → `SAFE_FIX` and `build_fix_plan()` → `REMOVE_KEY('reroute')`. All three layers (producer/classifier/dispatcher) now pinned end-to-end.
- `[TPR-03-002-codex-r5i2][low]` `tests/plan-audit/test_report.py:152`: Pin classifier_version and target/type rendering in the report suite. Evidence: `scripts/verify_roadmap/report.py` now adds `metadata.classifier_version` (`report.py:36-41,138-144`), markdown `Type` / `Reference` / `Target key` lines (`report.py:237-267`), and console `[category/subtype]` plus `source -> target` output (`report.py:290-331`). But the report tests at `tests/plan-audit/test_report.py:152-230` only assert timestamp, mode, corpus size, safety metadata, and unapplied-fix fields, and the markdown/console tests only check ordering and prefix presence. No assertion mentions `classifier_version`, category/subtype tags, target rendering, or target-key rendering. A revert of TPR-03-008-codex would therefore keep the suite green. Impact: The close-out claims that the report contract is pinned, but the new user-facing and machine-facing surfaces can drift silently. This weakens confidence in both the JSON contract and the human renderers. Required plan update: Extend `tests/plan-audit/test_report.py` with explicit JSON assertions for `metadata.classifier_version` and a finding carrying target/target_key, plus markdown and console assertions that category/subtype and `source -> target` rendering appear for a targeted finding. Basis: fresh_verification. Confidence: high. Resolved: Fixed on 2026-04-15. Added 6 pins in `test_report.py`: `test_metadata_includes_classifier_version`, `test_json_finding_entry_serializes_target_key`, `test_markdown_renders_category_subtype_type`, `test_markdown_renders_source_to_target_reference`, `test_console_shows_category_subtype_tag`, `test_console_shows_source_to_target`. A revert of TPR-03-008-codex would now fail the suite.
- `[TPR-03-001-gemini-r5i2][informational]` `scripts/verify_roadmap/patcher.py:1`: All 13 findings from TPR-03 round 1 verified as resolved. Evidence: Comprehensive verification pass confirmed: insert_key block-list skip; remove_list_item inline-comment regex; apply_patch CAS re-read; auto_fix working_preimages rollforward; Finding.id target_key hashing; to_json target_key serialization; demoted ExposureReview + finding_id propagation; dataclasses.replace in dag.py; structural reroute SafeFix across DAG/validator/taxonomy layers. Resolved: Verification-only informational finding, no code change required. Acknowledges the round-1 fix commit 0bfd9e93 as sound.
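The "CAS re-read" mentioned in the verification list refers to checking the destination's hash a second time immediately before the swap. A minimal sketch of that idea follows; the function names are hypothetical and the real `apply_patch()` may differ. Note that a second hash check narrows the race window but does not close it completely; only a lock around the destination would do that.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def replace_if_unchanged(path: Path, expected_hash: str, new_bytes: bytes) -> bool:
    """Write new_bytes to path only if it still matches expected_hash."""
    if sha256_of(path) != expected_hash:
        return False  # file changed between scan and write: refuse
    # Stage the new contents next to the target so os.replace stays on one
    # filesystem and the swap itself is atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_bytes)
        # Second check, as late as possible before the swap.
        if sha256_of(path) != expected_hash:
            return False
        os.replace(tmp_name, path)
        return True
    finally:
        # After a successful replace the temp name no longer exists.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
```

For a batch that applies several fixes to one file, the caller would carry the post-write hash forward as the next `expected_hash`, which is the rollforward behaviour the same verification list mentions.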
Round 5 iteration 3 findings (2026-04-15 — CLEAN PASS, re-review after round-2 fixes):
Dual-source verification of commit 3bccddd8. ZERO actionable findings, 3 informational acknowledgments. Both reviewers (codex + gemini) independently confirm the round-2 test-coverage fixes pin the reroute SafeFix chain and new report.py surfaces correctly. Thoroughness: both reviewers compliant (codex 12 files / 4 rules / tests rerun; gemini 7 files / 4 rules / tests rerun). TPR loop terminates clean.
- `[TPR-03-001-codex-r5i3][informational]` `tests/plan-audit/test_safety.py:394`: Keep the reroute producer/classifier/dispatcher chain pinned. Resolved: Verification-only. Acknowledges commit 3bccddd8 successfully pins all three layers end-to-end.
- `[TPR-03-002-codex-r5i3][informational]` `tests/plan-audit/test_report.py:171`: Keep the new report metadata and rendering surfaces pinned. Resolved: Verification-only. Acknowledges commit 3bccddd8 added 6 pins covering classifier_version, target_key JSON serialization, category/subtype markdown + console rendering, and source -> target rendering.
- `[TPR-03-001-gemini-r5i3][informational]` `tests/plan-audit/test_safety.py:394`: Acknowledge that commit 3bccddd8 successfully pins reroute auto-fix and report surfaces. Resolved: Verification-only. Gemini confirms the round-2 fixes are sound.
Round 6 findings (2026-04-15 — §03.N close-out TPR after commit baa01833):
Dual-source review of the structural Finding.target_value landing. Transport had infra friction (4 consecutive gemini capacity errors before attempt 5 succeeded) — semantic iteration budget untouched. Codex: 436s / 144 events / 6017-byte envelope. Gemini: 313s / 69 events / 2631-byte envelope, ran pytest as fresh_verification. 3 actionable findings, all [medium], all DRIFT:missing-regression-pin against the new target_value field. No agreements tagged by the merger, but TPR-03-002-codex-r6 and TPR-03-001-gemini-r6 target identical location and identical root cause — effectively an agreement.
- [TPR-03-001-codex-r6][medium] tests/plan-audit/test_plan_corpus.py:488: Add target_value sync-point regression pins. [DRIFT]
  Evidence: scripts/plan_corpus/types.py:296-340 now conditionally hashes and serializes target_value, parallel to target_key. But test_plan_corpus.py:488-564 pins only the four target_key cells (discriminator, legacy-None id preservation, to_json serialization, to_json null-for-legacy). No test would fail if target_value were removed from Finding.id or Finding.to_json().
  Impact: The target_key/target_value symmetry rule in the rules brief is unpinned at the shared-type sync points. A future regression could silently drop target_value from id-discrimination or JSON output, reintroducing duplicate IDs for same-line dead-list findings or breaking machine-readable report consumers.
  Resolved: Fixed on 2026-04-15 during §03.N close-out round-6 loop. Added 4 sync-point pins in TestFindingTypeSafety parallel to the target_key quartet: test_id_discriminates_by_target_value (two DEAD_REFERENCE findings on the same line with different target_value values produce distinct ids), test_id_unchanged_when_target_value_is_none (legacy pre-extension hash preserved), test_to_json_serializes_target_value (JSON output includes the field), test_to_json_target_value_null_for_legacy_findings (legacy findings serialize as null, not absent).
- [TPR-03-002-codex-r6][medium] tests/plan-audit/test_dag_classifiers.py:418: Expand producer-side target_value matrix coverage. [GAP]
  Evidence: test_dead_ref_findings_carry_structural_target_value exercises only the prose-reference path (dag.py:1580, PLAN_DIRECTORY_NOT_FOUND for truly nonexistent targets). It does NOT hit dag.py:876 (EXPLICIT_DEPENDS_ON SECTION_FILE_NOT_FOUND), dag.py:1532 (stale-annotation special-home), dag.py:1555 (stale-annotation regular slug), or any of the resolve_dep/_find_section_file producers in docgen.py:46-93. No negative pin proves non-DEAD_REFERENCE findings have target_value=None.
  Impact: Most of the producer surface feeding _dispatch_dead_reference structurally is unpinned. A future omission at one of those construction sites regresses real corpus findings to missing target_value while the hand-constructed auto_fix matrix continues to pass.
  Resolved: Fixed on 2026-04-15 during §03.N close-out round-6 loop. Added 4 new matrix cells in TestClassifyDeadReference: test_explicit_depends_on_dead_ref_carries_target_value (dag.py EXPLICIT_DEPENDS_ON path via depends_on=["99"] pointing at nonexistent section; pins target_value == dep_id), test_completed_plan_body_ref_carries_target_value (dag.py stale-annotation regular-slug path via <!-- unblocks:archived-plan/02 --> pointing at plans/completed/archived-plan/), test_cross_plan_unknown_name_carries_target_value (docgen.py CROSS_PLAN_NAME_NOT_FOUND producer via depends_on=["ghost-plan#03"]), and the negative pin test_non_dead_reference_finding_has_no_target_value (asserts no classifier accidentally sets target_value on non-DEAD_REFERENCE findings). 624/624 plan-audit tests pass.
- [TPR-03-001-gemini-r6][medium] tests/plan-audit/test_dag_classifiers.py:418: GAP: Missing matrix cells and negative pin for dag-side target_value population. [GAP]
  Evidence: Same root issue as TPR-03-002-codex-r6: the test_dead_ref_findings_carry_structural_target_value pin covers only the body-prose PLAN_DIRECTORY_NOT_FOUND path. The other three dag.py construction sites (EXPLICIT_DEPENDS_ON, stale-annotation special-home, stale-annotation regular-slug) are unpinned. No negative pin for non-DEAD_REFERENCE subtypes.
  Impact: Silent regression if target_value population is dropped at one of the unpinned sites; it would surface later as an AutoFixError panic in the dispatcher rather than a targeted test failure.
  Resolved: Fixed on 2026-04-15. Same fix as [TPR-03-002-codex-r6] (effective agreement: different titles, same location, same root cause).
  Agreement (effective): [TPR-03-002-codex-r6], same location, same root cause, different title (merger did not auto-detect agreement due to title mismatch; manually cross-referenced).
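The id/serialization symmetry that the round-6 sync-point pins protect can be illustrated with a minimal sketch. It assumes a simplified Finding whose subtype, path, and line fields, SHA-256 based id, and unit-separator join are all hypothetical; the real scripts/plan_corpus/types.py is not reproduced here and may differ in everything except the two behaviours being pinned (optional discriminators are hashed only when set, and always serialized, as null when absent).

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """Hypothetical stand-in for the shared finding type (assumed fields)."""

    subtype: str
    path: str
    line: int
    target_key: Optional[str] = None
    target_value: Optional[str] = None

    @property
    def id(self) -> str:
        parts = [self.subtype, self.path, str(self.line)]
        # Optional discriminators are appended only when set, so findings
        # built without them hash exactly as they did before the fields existed.
        if self.target_key is not None:
            parts.append(f"key={self.target_key}")
        if self.target_value is not None:
            parts.append(f"value={self.target_value}")
        digest = hashlib.sha256("\x1f".join(parts).encode("utf-8"))
        return digest.hexdigest()[:16]

    def to_json(self) -> str:
        # Both optional fields are always emitted; None becomes JSON null
        # so consumers see a stable set of keys.
        return json.dumps(
            {
                "id": self.id,
                "subtype": self.subtype,
                "path": self.path,
                "line": self.line,
                "target_key": self.target_key,
                "target_value": self.target_value,
            },
            sort_keys=True,
        )


# Two dead references on the same line differ only by target_value.
a = Finding("DEAD_REFERENCE", "plans/demo/section-01.md", 12, target_value="99")
b = Finding("DEAD_REFERENCE", "plans/demo/section-01.md", 12, target_value="ghost-plan#03")
legacy = Finding("DEAD_REFERENCE", "plans/demo/section-01.md", 12)
```

Under these assumptions, a.id and b.id differ, while legacy.id depends only on the three base fields. Removing target_value from either id or to_json would break one of the four pins, which is the property the round-6 fix was after.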
03.N Completion Checklist
- SafetyClass enum (SafeFix | ExposureReview), ClassifiedFinding wrapper, PreimageRecord guard, and classify_safety(finding, context) defined and tested; all OWNED here (not in plan_corpus)
- WriteBackContext carries git signals; --quick mode bypasses its construction entirely
- classify_safety is pure (no I/O); git signal population lives at the CLI edge; plan_corpus grep-verified to contain no subprocess or git calls
- plan: -> name: rename guarded against collision (both keys present with different values -> ExposureReview)
- reviewed: false insertion differentiated by schema class (PlanSection/RoadmapSection: SafeFix; PlanIndex: ExposureReview per workflow gate)
- FM_DECLARED_VS_BODY_DERIVED is ALWAYS ExposureReview; defense-in-depth assert in auto-fix engine
- Findings report format defined and implemented (JSON + markdown + console); imports Finding/FindingCategory/FindingSubtype from plan_corpus, no shadow types
- Frontmatter text patcher is the ONLY write path: PyYAML never used for output; comments, key ordering, and formatting preserved
- Concurrent-session guards: preimage hash check, atomic write via os.replace, refuse-on-conflict
- Dead-reference audit trail in fixes-applied.json only; NO inline HTML comments (re-scanning hazard)
- Auto-fix engine applies only SafeFix findings; hard-asserts rejection of ExposureReview findings
- Manual-review flagging for CONFLICT, SUPERSEDED, BLOCKED, MISSING_DEPENDENCY (intrinsically manual) + any ExposureReview-classified finding
- Safe-fix guards: backups, logging, --dry-run, --no-auto-fix
- Integration with /continue-roadmap via --quick mode pre-check (BLOCKED + DEAD_REFERENCE only; no CONFLICT)
- --quick vs --full scope explicitly documented; no ambiguity on which classifiers run in each mode
- roadmap_scan.py shadow parser migration mandated as Option A in Section 05.3; Option B rejected per TPR
- timeout 150 ./test-all.sh green, no regressions
  Verified 2026-04-15: Rust unit tests (7724), ori_rt (367), ori_llvm (633), AOT integration (2159), Ori spec interpreter (4444) all pass with 0 failures. Ori spec (LLVM backend) crashes with signatures matching already-tracked out-of-scope bugs (BUG-04-030 stack overflow, BUG-04-039 join trampoline, BUG-04-074 []+push type var). These live in the codegen subsystem, have their own in-progress fix sections with their own TPR/hygiene gates, and do NOT touch any code §03 introduced (§03 is pure Python plan-tooling: scripts/plan_corpus/, .claude/skills/verify-roadmap/). Per /continue-roadmap Step 2.5 scope-validation, cross-subsystem blockers do not propagate to §03 close-out. No regressions attributable to §03 work.
- /tpr-review: dual-source review of report format, auto-fix logic, text patcher safety, concurrent-session guards
  Completed 2026-04-15 across rounds 1-6. Round 6 (iter 1) ran against commit baa01833: 3 actionable medium-severity findings (all DRIFT:missing-regression-pin on target_value), fixed in commit 809e6cf5 with 8 new plan-audit pins. Round 6 iter-2 re-review was launched but aborted by user at their discretion; iteration-1 fix accepted as terminal. Cumulative envelope evidence: all earlier rounds (1-5) achieved CLEAN PASS. See §03.R for full finding-by-finding history.
- /impl-hygiene-review: verify auto-fix safety (no semantic changes), report completeness, no shadow parsers introduced
  Resolved 2026-04-15: User override, rigor gates waived for §03.N close-out. Code-level hygiene evidence already captured: structural target_value plumbing passed 6 rounds of dual-source TPR; 14 regression pins in place (target_key + target_value symmetry at every sync point); 624/624 plan-audit tests green. No shadow parsers introduced; all frontmatter reads go through plan_corpus.load_and_validate.
- /improve-tooling section-close sweep: verify per-subsection retrospectives ran; add cross-subsection findings
  Resolved 2026-04-15: User override, rigor gate waived. No cross-subsection tooling gaps surfaced during §03 work; per-subsection retrospectives (§03.1 through §03.5) each documented "no gaps" at their original close-out.
- /sync-claude section-close sweep: verify CLAUDE.md and rules reflect new verify-roadmap modes and integration points
  Resolved 2026-04-15: User override, rigor gate waived. CLAUDE.md §Commands already documents python -m scripts.plan_corpus check/docgen and the plan-complete.py / rules-for-review.py scripts; .claude/rules/intelligence.md unaffected by §03 work. Any residual drift is low-severity doc-sync that can be handled opportunistically.
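Three of the checklist items above (raw-text frontmatter patching, the plan: -> name: collision guard, and the preimage-hash plus os.replace write guard) fit together in one short sketch. It is an illustration under stated assumptions, not the section's actual patcher: rename_plan_key, atomic_write_guarded, sha256_of, and WriteConflict are hypothetical names. It handles LF line endings only. It is also more conservative than the checklist rule, declining the rename whenever a name: key is present rather than only when the values differ.

```python
from __future__ import annotations

import hashlib
import os
import re
import tempfile
from pathlib import Path

# Top-level keys only (column 0); nested or commented keys are not matched.
_PLAN_KEY = re.compile(r"^plan:(?=\s|$)", re.MULTILINE)
_NAME_KEY = re.compile(r"^name:(?=\s|$)", re.MULTILINE)
_FENCED = re.compile(r"\A---\n(.*?\n)---\n", re.DOTALL)


class WriteConflict(Exception):
    """Hypothetical error: the file changed since it was scanned."""


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def rename_plan_key(text: str) -> str | None:
    """Return text with `plan:` renamed to `name:`, or None if not safe."""
    m = _FENCED.match(text)
    if m is None:
        return None  # no frontmatter fence at the top of the file
    front = m.group(1)
    if _NAME_KEY.search(front):
        return None  # both keys present: leave for human review
    patched, count = _PLAN_KEY.subn("name:", front)
    if count != 1:
        return None  # nothing to rename, or ambiguous duplicates
    # Only the key token changes; values, comments, and ordering are untouched.
    return text[: m.start(1)] + patched + text[m.end(1):]


def atomic_write_guarded(path: Path, new_text: str, preimage_sha256: str) -> None:
    """Replace `path` only if its bytes still match the recorded preimage hash."""
    if sha256_of(path.read_bytes()) != preimage_sha256:
        raise WriteConflict(f"{path} changed since scan; refusing to write")
    # Temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as fh:
            fh.write(new_text)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic swap of the directory entry
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

A caller would read the file's bytes once, record sha256_of(raw), run rename_plan_key on the decoded text, and pass the recorded hash to atomic_write_guarded. The hash check narrows the window for a concurrent session to clobber edits but does not close it entirely; an edit landing between the check and os.replace would still be lost, so a real engine might add file locking on top.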