boundaries, YAML parse errors, non-mapping YAML roots, duplicate keys, YAML anchors/aliases/merge keys, multi-document YAML, BOM or zero-width chars before ---, unknown fields (whitelist), CRLF-in-body boundary drift, replacement-character decode failures”

  • “Schema owners defined for ALL seven file classes: plan index (plans/*/index.md), plan section file (plans/*/section-*.md), roadmap section file (plans/roadmap/section-*.md — distinct tier, last_verified, spec fields), overview (plans/*/00-overview.md), bug-tracker sections (plans/bug-tracker/section-*.md), fix-BUG files (plans/bug-tracker/fix-BUG-*.md), completed-plan indexes (plans/completed/*/index.md)”
  • “Canonical Finding dataclass + severity enum + two-level FindingCategory × FindingSubtype taxonomy defined in plan_corpus.py — Sections 02, 03, 04 import the category/subtype hierarchy (no shadow taxonomy; Section 04 item-verifier subtypes like MISSING_MATRIX_COVERAGE, MISSING_SEMANTIC_PIN MUST be enumerated here)”
  • “Canonical status normalizer in plan_corpus.py produces plain StatusContradiction / SchemaViolation findings (facts only, no policy); Section 03’s write-back engine consumes these findings and classifies them into SafeFix/ExposureReview (policy lives at the write-back boundary, not in the pure parser library)”
  • depends_on convention: logical IDs only — intra-plan uses bare "NN" (matches live corpus, e.g. plans/completed/iter-rc-contract/section-02-elem-dec-fn.md:10 uses ["01"]); cross-plan uses "plan-name#section_id" where plan-name resolves against the name: field of the target plan’s index.md (stable across directory renames); full paths are REJECTED by the strict parser; every plan index.md MUST declare name:
  • “Discovery walks directories not globs with a two-stage classifier: (stage 1) a directory is a plan candidate if it has index.md OR matches a known section pattern (plans/roadmap/section-*.md) OR is a non-container with *.md contents; container directories (plans/, plans/completed/, plans/bug-tracker/ when aggregating fixes) are recognized and exempted from the missing-index rule; (stage 2) classified plan candidates without index.md are flagged GAP_MISSING_FILE; plans/completed/*/ nested indexes are discovered”
  • “Explicit load_and_validate(path) -> Either[Finding, ValidatedFile] boundary function catches CorpusParseError at a single site and converts it into a Finding(PARSE_ERROR, severity=high) — no ad-hoc try/except at call sites”
  • “Generated schema reference at docs/internal/plan-schema-reference.md is drift-gated by a plan_corpus.py --docgen --check mode wired into ./test-all.sh (CI fails if committed markdown differs from fresh Python-dataclass output)”
  • “Fixture corpus in tests/plan-audit/fixtures/ covers every YAML failure class, every schema-violation class, and every status-reconciliation case; includes semantic pins (only strict mode accepts) and negative pins (must reject)”
  • “Pilot migration exercises ALL seven schema classes with at least one exemplar artifact each (plan index, plan section, roadmap section, overview, bug-tracker section, fix-BUG, completed-plan index); full-corpus sweep is deferred to Section 05.3 (single ownership)”
  • “All new code passes timeout 150 ./test-all.sh”

inspired_by: []
depends_on: []
third_party_review: status: resolved, updated: “2026-04-14”
sections:
  • id: “01.1” title: “Strict Parser & Discovery” status: complete
  • id: “01.2” title: “Schema as Python Types (Sole SSOT)” status: complete
  • id: “01.3” title: “Shared Finding & Classifier Types” status: complete
  • id: “01.4” title: “Canonical Status Normalizer” status: complete
  • id: “01.5” title: “Fixture Corpus & TDD Tests” status: complete
  • id: “01.6” title: “Pilot Migration (all seven schema classes)” status: complete
  • id: “01.7” title: “Extensions Ratified by §02” status: complete
  • id: “01.R” title: “Third Party Review Findings” status: complete
  • id: “01.N” title: “Completion Checklist” status: complete

Section 01: Frontmatter Schema, Strict Parser & Shared Types

Status: Not Started

Goal: Establish scripts/plan_corpus.py as the single source of truth for corpus parsing and schema. Schema is defined as Python dataclasses inside the library (markdown documentation is derived, not authored separately). The parser is STRICT — it hard-fails on every class of malformed YAML instead of silently coercing to empty dicts. Sections 02-05 import plan_corpus.py; they never re-parse. A fixture corpus drives TDD; a pilot migration exercising every one of the seven schema classes proves the pipeline end-to-end before the full sweep (deferred to Section 05.3).

Success Criteria: see frontmatter.

Context & why strictness matters. The current corpus parser (.claude/skills/plan-audit/planlib.py) is the cautionary example:

  • planlib.py:250: read_text() uses errors="replace", silently producing U+FFFD replacement chars on decode failure; downstream parsers never learn decoding was lossy.
  • planlib.py:253-270: split_frontmatter() returns ({}, 0) when --- is missing or unclosed, and returns {} on any yaml.YAMLError. A file with broken YAML is indistinguishable from a file with no frontmatter.
  • planlib.py:350-351: parse_section_file() returns None when frontmatter is empty. Since malformed YAML yields empty frontmatter (line 269), every section with unparseable YAML is silently dropped from analysis.

This is LEAK:swallowed-error (ref .claude/rules/impl-hygiene.md). A schema-first auditor with this parser is worse than no auditor — it reports “clean” on corrupt inputs. The redesigned library must hard-fail.
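A minimal sketch of the difference between the two decode modes, using only the built-in bytes.decode. The byte string is made up for illustration and is not taken from planlib.py.

```python
# Made-up input: the byte 0xFF can never appear in valid UTF-8.
raw = b"---\nstatus: act\xffive\n---\n"

# errors="replace" succeeds and hides the damage behind U+FFFD.
lossy = raw.decode("utf-8", errors="replace")
damage_hidden = "\ufffd" in lossy

# errors="strict" raises, and the exception carries the byte offset
# that a CorpusParseError-style report would need.
try:
    raw.decode("utf-8", errors="strict")
    strict_offset = None
except UnicodeDecodeError as exc:
    strict_offset = exc.start
```

With the lossy mode the caller receives a string and no signal; with the strict mode the caller receives the exact position of the first undecodable byte.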

Architectural decision — schema IS the Python module. Splitting “Schema Definition” (prose markdown) from “Validation Script” (Python re-encoding the schema) guarantees drift (LEAK:scattered-knowledge). The schema lives as Python dataclasses/TypedDicts/Pydantic models inside plan_corpus.py. Field validation rules are executable. Markdown documentation (e.g. a schema reference in 00-overview.md or a generated doc) is derived from the types, not authored independently. This mirrors compiler ModuleId / TypeId ownership (ref .claude/rules/canon.md §6 SSOTs).

Architectural decision — depends_on uses logical IDs resolved via name. Full repo-relative paths (as section 02/03/04/05 originally proposed in their depends_on fields) hardcode filenames into semantic dependency identity. A file rename breaks every dependent. The live corpus already uses short IDs: depends_on: ["01"], ["04B"], ["00"] (ref plans/completed/iter-rc-contract/section-02-elem-dec-fn.md:10, plans/completed/jit-exception-handling/section-06-lcfail-resolution.md:11). We standardize on logical IDs:

  • Intra-plan: bare "NN" (e.g. "01", "04B") — matches live usage.
  • Cross-plan: "plan-name#section_id" where plan-name is the value of the name: field in the target plan’s index.md (e.g. "Locality Representation Unification#02" or its short canonical form). The name: field — not the directory slug — is the stable logical identifier. Directory names are physical file layout; git mv plans/foo plans/bar must NOT break every cross-plan depends_on. plan_corpus.resolve_dep() builds a name → plan_dir index at discovery time and looks up against it.
  • Every plan index.md MUST declare name: (enforced by PlanIndexSchema) so cross-plan resolution is deterministic.

Full paths are REJECTED by the strict parser. Sections 02-05 resolve logical IDs to physical paths via plan_corpus.resolve_dep(). This IS the section 02/03/04/05 depends_on frontmatter fix — those fields are updated as part of this section’s pilot migration (01.6).
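The lookup described above might be sketched as follows. This is an illustration under assumptions, not the library's implementation: the name_index parameter is passed explicitly here (the plan text has resolve_dep read it from the Corpus), and the return value is a (plan_dir, section_id) pair rather than a resolved section file.

```python
from pathlib import Path


def resolve_dep(dep_id: str, current_plan: Path,
                name_index: dict[str, Path]) -> tuple[Path, str]:
    """Return (plan_dir, section_id) for a logical dependency ID."""
    if "/" in dep_id:
        # Physical paths are never valid dependency identities.
        raise ValueError(f"full paths are rejected, use a logical ID: {dep_id!r}")
    if "#" not in dep_id:
        # Intra-plan: bare "NN" resolves inside the current plan.
        return current_plan, dep_id
    plan_name, section_id = dep_id.rsplit("#", 1)
    if plan_name not in name_index:
        raise KeyError(f"no plan declares name: {plan_name!r}")
    return name_index[plan_name], section_id
```

Because the key is the declared name, a directory rename only changes the value stored in name_index; every "plan-name#NN" string in the corpus keeps resolving.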

Architectural decision — write-back POLICY lives in Section 03, not here. The previous iteration of this section owned a SafeFix/ExposureReview taxonomy inside 01.4. That violated §Phase Boundaries (.claude/rules/impl-hygiene.md §Phase Boundaries): plan_corpus.py is a pure parsing + normalization library; auto-fix safety classification is write-back policy (it depends on runtime inputs like has_recent_commits from git, and it governs file mutation decisions). Policy has been relocated to Section 03. Section 01.4 now emits plain StatusContradiction findings (facts only). Section 03’s auto-fix engine consumes findings + caller-supplied git signals and assigns a safety class at write-back time. This removes the double source of truth and keeps plan_corpus.py testable without a git repo.

Architectural decision — Finding taxonomy is two-level, covers all phases. The original 01.3 ClassifierType enum was a flat enum sized for Phase 2/3 classifiers only. Phase 4 (Section 04 item-verifier) emits its own subtypes (matrix coverage, semantic pin, hygiene audit, checkbox verification). Without explicit ownership in 01.3, Phase 4 would invent a shadow enum. 01.3 now defines a two-level hierarchy FindingCategory × FindingSubtype where FindingCategory ∈ {SCHEMA_VIOLATION, STATUS_CONTRADICTION, DAG_CONFLICT, DEAD_REFERENCE, ITEM_VERIFICATION, PARSE_ERROR, GAP} and FindingSubtype enumerates each category’s fine-grained kinds (e.g. ITEM_VERIFICATION contains MISSING_MATRIX_COVERAGE, MISSING_SEMANTIC_PIN, WEAK_TEST, HYGIENE_VIOLATION, INCOMPLETE_CHECKBOX, SCOPE_GAP). Section 04 MUST import this hierarchy; no shadow taxonomy.

Depends on: Nothing — Phase 1 foundation.


01.1 Strict Parser & Discovery

File(s): scripts/plan_corpus.py (new library module, primary home for parsing)

Build the strict YAML frontmatter parser and the directory-walking corpus discovery. Every downstream tool (Section 02 DAG builder, Section 03 report, Section 04 item verifier, Section 05 sweep) imports from this module — no re-parsing anywhere else.

  • Create scripts/plan_corpus.py with module-level docstring identifying it as the corpus SSOT; cross-link from .claude/rules/impl-hygiene.md §Cross-Phase Invariant Contracts when the library lands (/sync-claude verifies this during 01.N).

  • Implement strict file reader read_text_strict(path: Path) -> str:

    • Use errors="strict" (NOT errors="replace" — direct inversion of the planlib.py:250 anti-pattern, LEAK:swallowed-error)
    • Raise CorpusParseError with path+byte offset on UnicodeDecodeError
    • Detect and reject UTF-8 BOM at byte 0 (YAML spec forbids BOM inside a document; many editors silently insert one)
    • Detect and reject zero-width chars (U+200B, U+FEFF, U+200E, U+200F) anywhere before the opening --- — these break YAML parsing in confusing ways
    • Detect CRLF inside YAML frontmatter region and normalize consistently OR reject (decide by measuring corpus; default: normalize to LF for parse, preserve original on write)
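A rough sketch of the reader described by the bullets above. Assumptions are labeled in the comments: the CorpusParseError shape is a stand-in, the zero-width offset is a character index rather than a byte offset, and CRLF is normalized across the whole text rather than only inside the frontmatter region.

```python
from pathlib import Path


class CorpusParseError(Exception):
    """Stand-in for the library's exception (fields are assumptions)."""

    def __init__(self, path, message, offset=None):
        super().__init__(f"{path}: {message}")
        self.path, self.message, self.offset = path, message, offset


UTF8_BOM = b"\xef\xbb\xbf"
ZERO_WIDTH = {"\u200b", "\ufeff", "\u200e", "\u200f"}


def read_text_strict(path: Path) -> str:
    raw = path.read_bytes()
    if raw.startswith(UTF8_BOM):
        raise CorpusParseError(path, "UTF-8 BOM at byte 0", offset=0)
    try:
        text = raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise CorpusParseError(
            path, f"invalid UTF-8 at byte {exc.start}", offset=exc.start
        ) from exc
    # Scan only the text that precedes the first '---'.
    prefix = text.split("---", 1)[0]
    for index, char in enumerate(prefix):
        if char in ZERO_WIDTH:
            raise CorpusParseError(
                path, f"zero-width U+{ord(char):04X} before opening ---", offset=index
            )
    # Simplification: normalize CRLF everywhere for parsing; a writer
    # would need the original bytes to preserve line endings.
    return text.replace("\r\n", "\n")
```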
  • Implement strict frontmatter splitter split_frontmatter_strict(text, path) -> (dict, body_offset):

    • Require line 1 to be exactly --- (no leading whitespace, no trailing chars) — else CorpusParseError("missing opening --- on line 1")
    • Require a matching closing --- on its own line — else CorpusParseError("unclosed frontmatter")
    • Use yaml.safe_load inside a wrapper that intercepts yaml.YAMLError and re-raises as CorpusParseError with YAML error line/col translated to file line/col — NEVER return {} on parse error (direct inversion of planlib.py:268-269, LEAK:swallowed-error)
    • Reject YAML anchors (&anchor), aliases (*anchor), and merge keys (<<:) by pre-scanning raw frontmatter text for these markers — they defeat field whitelisting and allow schema bypass
    • Reject multi-document YAML (any --- or ... within the frontmatter region other than the boundaries) — YAML allows multiple docs; we do not
    • Reject non-mapping root (list, scalar, null) — frontmatter MUST be a YAML mapping
    • Detect duplicate keys: use yaml.SafeLoader subclass that raises on duplicate mapping keys (default loader silently keeps last)
    • Reject type mismatches: if schema expects list[str] but YAML parses a scalar string, reject with a clear message rather than coercing (GEMINI finding 12 — depends_on: section-01 silently iterates chars otherwise)
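The boundary and pre-scan rules above can be sketched with the standard library alone. This is a partial illustration: YAML loading, duplicate-key detection, and the non-mapping-root check are left out, the function name frontmatter_region is hypothetical, and the anchor/alias regex is a heuristic that only looks at value-start positions so that globs such as plans/*/index.md are not flagged.

```python
import re


class CorpusParseError(Exception):
    """Minimal stand-in for the library's exception."""


# '&name' or '*name' at the start of a value or list item (heuristic).
_ANCHOR_OR_ALIAS = re.compile(r"(?:^\s*|:\s+|-\s+)[&*]\w", re.MULTILINE)
_MERGE_KEY = re.compile(r"^\s*<<\s*:", re.MULTILINE)


def frontmatter_region(text: str) -> tuple[str, int]:
    """Return (raw_yaml, body_line); body_line is 1-based."""
    lines = text.split("\n")
    if lines[0] != "---":
        raise CorpusParseError("missing opening --- on line 1")
    try:
        close = lines.index("---", 1)
    except ValueError:
        raise CorpusParseError("unclosed frontmatter") from None
    region = lines[1:close]
    for line in region:
        # A bare '---' would already have closed the region, so only
        # '--- <content>' and '...' can appear as extra document markers.
        if line.startswith("--- ") or line == "...":
            raise CorpusParseError("multi-document marker inside frontmatter")
    raw = "\n".join(region)
    if _MERGE_KEY.search(raw):
        raise CorpusParseError("YAML merge key (<<:) is not allowed")
    if _ANCHOR_OR_ALIAS.search(raw):
        raise CorpusParseError("YAML anchor or alias is not allowed")
    return raw, close + 2
```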
  • Implement discovery walker discover_corpus(root: Path) -> Corpus with a two-stage classifier (fixes CODEX-01-005 — container directories must not be misclassified as broken plans):

    • Walk directories under plans/ (including plans/completed/, plans/bug-tracker/, plans/roadmap/) via os.walk or Path.rglob — NOT a glob that only finds *.md files
    • Stage 1 — classify each directory:
      • plan_candidate if: the directory has an index.md, OR it contains files matching a known section pattern (section-*.md); plans/roadmap/ holds section-*.md + index.md as a single plan — the plan IS the roadmap
      • container if: the directory is a recognized aggregator that holds plan candidates as children but is NOT itself a plan — specifically plans/ (the corpus root), plans/completed/ (holds N completed-plan directories as children), and plans/bug-tracker/ when treated as the bug-fix aggregator (its index.md DOES exist, so it is BOTH a plan candidate AND a container — routed as a plan)
      • unknown otherwise (e.g. plans/fake-plan/ containing loose *.md notes but neither index.md nor section-*.md files) — an explicit finding class
    • Stage 2 — apply missing-index rule only to plan candidates:
      • plan_candidate WITHOUT index.md → emit Finding(category=GAP, subtype=MISSING_INDEX_MD) (GEMINI finding 11) at high severity
      • container directories → exempt (no finding)
      • unknown directories containing *.md → emit Finding(category=GAP, subtype=UNCLASSIFIED_DIRECTORY) (low severity; suggests adding index.md or moving files)
    • Classify each discovered file by path pattern: plan index / plan section / roadmap section / overview / bug-tracker section / fix-BUG / completed-index (feeds 01.2 schema dispatch — seven classes; see 01.2)
    • Return a Corpus struct: {indexes: dict, plan_sections: dict, roadmap_sections: dict, overviews: dict, bug_sections: dict, fix_bug_files: dict, completed_indexes: dict, name_index: dict[str, Path], gaps: list[Finding]} where name_index maps PlanIndexSchema.name → plan directory (drives cross-plan depends_on resolution — see 01.2 DepId notes and GEMINI-01-002)
    • Support discover_corpus(root, include=[...]) subset filtering for fast --quick and pilot runs
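A sketch of the two-stage classifier as pure functions over a directory path and its file names. It assumes the reading under which MISSING_INDEX_MD is reachable, namely that a directory holding section-*.md files but no index.md counts as a plan candidate (this matches the GAP subtype definitions in 01.3). DirKind, classify_dir, and stage2_gap are hypothetical names.

```python
from enum import Enum
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Optional


class DirKind(Enum):
    PLAN_CANDIDATE = "plan_candidate"
    CONTAINER = "container"
    UNKNOWN = "unknown"


# Aggregators that hold plans but are not plans themselves.
CONTAINERS = {PurePosixPath("plans"), PurePosixPath("plans/completed")}


def classify_dir(rel_dir: PurePosixPath, filenames: list[str]) -> DirKind:
    """Stage 1: decide what kind of directory this is."""
    has_index = "index.md" in filenames
    has_sections = any(fnmatch(name, "section-*.md") for name in filenames)
    if has_index or has_sections:
        # An index.md wins even for aggregators such as plans/bug-tracker/.
        return DirKind.PLAN_CANDIDATE
    if rel_dir in CONTAINERS:
        return DirKind.CONTAINER
    return DirKind.UNKNOWN


def stage2_gap(kind: DirKind, filenames: list[str]) -> Optional[str]:
    """Stage 2: apply the missing-index rule only where it belongs."""
    if kind is DirKind.PLAN_CANDIDATE and "index.md" not in filenames:
        return "MISSING_INDEX_MD"
    if kind is DirKind.UNKNOWN and any(n.endswith(".md") for n in filenames):
        return "UNCLASSIFIED_DIRECTORY"
    return None  # containers and well-formed plans produce no finding
```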
  • Implement CorpusParseError exception class: path, line, column, raw YAML error (if any), explanatory message. All strict-mode failures raise this; no silent coercion anywhere in the parser.

  • Implement load_and_validate(path: Path) -> LoadResult — the single CorpusParseError → Finding boundary (fixes GEMINI-01-003):

    • LoadResult is a tagged union: Ok(ValidatedFile) or Err(Finding) — callers MUST match on the tag; no try/except at call sites
    • Implementation: call read_text_strict + split_frontmatter_strict + validate(schema, parsed, path) inside a single try block; catch CorpusParseError and convert to Finding(category=PARSE_ERROR, subtype=<specific>, severity=high, source=path, source_line=err.line, description=err.message) — the conversion is the ONLY place that bridges the exception boundary
    • On schema violations (returned as a list by validate()), produce one Finding(category=SCHEMA_VIOLATION, subtype=...) per violation in the LoadResult.violations list (non-fatal — file still classified as ValidatedFile)
    • Sections 02–05 import and use load_and_validate; they NEVER call split_frontmatter_strict directly (prevents per-caller exception-handling drift, LEAK:scattered-knowledge)
    • Fixture pin (cross-link 01.5): an empty file AND a malformed-YAML file both produce Err(Finding(PARSE_ERROR, ...)), NOT a crash or a silent None
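One possible shape for the single exception boundary, sketched under assumptions: the parse and validate steps are injected as callables so the example is self-contained (the plan text has load_and_validate take only a path), violations are stored on ValidatedFile, and the Finding here carries only a subset of the fields defined in 01.3.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional, Union


class CorpusParseError(Exception):
    def __init__(self, message: str, subtype: str, line: Optional[int] = None):
        super().__init__(message)
        self.message, self.subtype, self.line = message, subtype, line


@dataclass(frozen=True)
class Finding:
    category: str
    subtype: str
    severity: str
    source: Path
    source_line: Optional[int]
    description: str


@dataclass(frozen=True)
class ValidatedFile:
    path: Path
    frontmatter: dict
    violations: tuple = ()  # non-fatal SCHEMA_VIOLATION findings


@dataclass(frozen=True)
class Ok:
    value: ValidatedFile


@dataclass(frozen=True)
class Err:
    finding: Finding


LoadResult = Union[Ok, Err]


def load_and_validate(
    path: Path,
    parse: Callable[[Path], dict],
    validate: Callable[[dict, Path], list],
) -> LoadResult:
    try:
        frontmatter = parse(path)
    except CorpusParseError as err:
        # The only place where the exception becomes a Finding.
        return Err(Finding("PARSE_ERROR", err.subtype, "high",
                           path, err.line, err.message))
    return Ok(ValidatedFile(path, frontmatter, tuple(validate(frontmatter, path))))
```

Callers branch on isinstance(result, Ok) or isinstance(result, Err) and never see CorpusParseError.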
  • Codebase-hygiene sweep along the way:

    • Audit .claude/skills/plan-audit/planlib.py lines 250, 253-270, 350-351 and add a FILE-level TODO(vr-redesign): comment pointing readers at scripts/plan_corpus.py as the replacement (prevents future contributors from copying the anti-pattern)
    • Verify no other script under scripts/ or .claude/skills/*/ uses errors="replace" on plan/markdown inputs; if found, add them to the migration list for Section 05.3
    • WASTE check: planlib.py will be SUPERSEDED by plan_corpus.py. Add explicit supersedes entry in 00-overview.md frontmatter for this section’s output.
  • Subsection close-out (01.1) — MANDATORY before starting 01.2:

    • All tasks above are [x] and scripts/plan_corpus.py parser imports cleanly
    • Unit tests for parser failure classes exist (deferred binding to 01.5 fixtures — cross-link here)
    • Update this subsection’s status in section frontmatter to complete
    • Run /improve-tooling retrospectively on THIS subsection — Retrospective 01.1: greenfield implementation, no debugging friction. Path resolution bug caught by TDD.
    • Run /sync-claude on THIS subsection — CLAUDE.md §Commands and §Key Paths updated with plan_corpus.py

01.2 Schema as Python Types (Sole SSOT)

File(s): scripts/plan_corpus.py (schema types live alongside parser; single module)

Define the frontmatter schema as executable Python dataclasses. This module is the sole home for schema knowledge — markdown documentation is derived from the types, not re-authored. Seven file classes have distinct schemas. The seventh class (RoadmapSectionSchema) was added in response to GEMINI-01-001 after verifying that plans/roadmap/section-*.md files carry tier: int, last_verified: date, and a spec: list[str] field that do NOT fit PlanSectionSchema.

Closed status enum — corpus-derived (not invented). The live corpus uses the following status values (plan-level values verified via grep -rh "^status:" plans/*/index.md | sort -u; the two section-level-only values are marked):

  • active — work in progress
  • queued — approved, not yet started
  • resolved — completed and archived
  • not-started — planned but un-queued
  • in-progress — section-level only
  • complete — section-level only
  • research — used by plans/ori-ui-framework/index.md:5 for exploration-phase plans (DO NOT remove; the original 01 draft invented a closed enum that excluded this)

Section-level enum: not-started | in-progress | complete. Plan-level enum: active | queued | resolved | not-started | research. If the corpus surfaces additional real values during pilot migration, add them to the enum via a /create-draft-proposal — never by silently coercing.
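The two closed enums could be expressed directly as Python Enum classes. This is a sketch; the class names follow the PlanStatus and SectionStatus names used in this section, while parse_status is a hypothetical helper.

```python
from enum import Enum


class SectionStatus(str, Enum):
    NOT_STARTED = "not-started"
    IN_PROGRESS = "in-progress"
    COMPLETE = "complete"


class PlanStatus(str, Enum):
    ACTIVE = "active"
    QUEUED = "queued"
    RESOLVED = "resolved"
    NOT_STARTED = "not-started"
    RESEARCH = "research"


def parse_status(enum_cls, raw):
    """Look up a status by value; unknown values fail loudly, never coerce."""
    try:
        return enum_cls(raw)
    except ValueError:
        allowed = ", ".join(member.value for member in enum_cls)
        raise ValueError(
            f"status {raw!r} is not in {enum_cls.__name__} ({allowed})"
        ) from None
```

Lookup is by exact value, so "Active" or a section-level value such as "complete" on a plan index is rejected instead of being mapped to a near match.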

  • Define PlanIndexSchema dataclass for plans/*/index.md:

    • Required: name: str (stable logical identifier — resolves cross-plan depends_on via "plan-name#NN"; fixes GEMINI-01-002 — directory slug is NOT used), full_name: str, status: PlanStatus, reviewed: bool (default false on newly-migrated plans)
    • Optional: reroute: bool (presence means reroute; absence means non-reroute — REJECT reroute: false), parallel: bool (presence means permanent parallel plan, e.g. bug-tracker), order: int | None, supersedes: list[str], references: list[str], inspired_by: list[str]
    • Rejection: plan: key (use name:), title: at root (use full_name:), unknown fields (whitelist enforcement)
    • Cross-field: reroute=True REQUIRES non-empty supersedes
    • Cross-corpus: every name value MUST be unique across discover_corpus.name_index — duplicate names are SCHEMA_VIOLATION (prevents ambiguous cross-plan lookup)
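A sketch of how the PlanIndexSchema rules above might be derived from a dataclass rather than restated in prose. The field list follows the bullets; check_plan_index is a hypothetical helper that returns plain strings instead of Finding objects, and it omits type checks and the cross-corpus uniqueness rule.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class PlanIndexSchema:
    # Required fields carry no default.
    name: str
    full_name: str
    status: str
    reviewed: bool
    # Optional fields.
    reroute: Optional[bool] = None
    parallel: Optional[bool] = None
    order: Optional[int] = None
    supersedes: tuple = ()
    references: tuple = ()
    inspired_by: tuple = ()


RENAMED = {"plan": "name", "title": "full_name"}


def check_plan_index(fm: dict) -> list[str]:
    schema_fields = fields(PlanIndexSchema)
    allowed = {f.name for f in schema_fields}
    required = {f.name for f in schema_fields
                if f.default is MISSING and f.default_factory is MISSING}
    problems = []
    for key in fm:
        if key in RENAMED:
            problems.append(f"rejected key {key!r}: use {RENAMED[key]}:")
        elif key not in allowed:
            problems.append(f"UNKNOWN_FIELD: {key!r}")
    for key in sorted(required - set(fm)):
        problems.append(f"MISSING_REQUIRED_FIELD: {key!r}")
    if fm.get("reroute") is False:
        problems.append("reroute: false is rejected; omit the key instead")
    if fm.get("reroute") is True and not fm.get("supersedes"):
        problems.append("CROSS_FIELD_INVARIANT: reroute requires non-empty supersedes")
    return problems
```

Because the whitelist and the required set are both read from fields(), adding a field to the dataclass updates validation without a second edit.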
  • Define PlanSectionSchema dataclass for plans/*/section-*.md (EXCLUDING plans/roadmap/section-*.md — those use RoadmapSectionSchema):

    • Required: section: str (must match NN in filename), title: str, status: SectionStatus, reviewed: bool, goal: str, success_criteria: list[str], sections: list[SubsectionEntry]
    • Required: third_party_review: TprInfo ({status: TprStatus, updated: date | None})
    • Optional: depends_on: list[DepId] (default []), inspired_by: list[str] (default [])
    • TprStatus enum: none | findings | resolved | clean — corpus-derived (verified against plans/bug-tracker/fix-BUG-*.md and plan sections); clean is the terminal value for passing reviews (see plans/bug-tracker/fix-BUG-04-045.md:16)
    • Cross-field: tpr_status=none REQUIRES updated=None; other statuses REQUIRE a date; status=complete REQUIRES all sections[].status=complete
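The cross-field rules for plan sections are small enough to show as executable checks. This is a sketch: TprInfo follows the shape given above, statuses are plain strings here for brevity, and section_cross_field_problems is a hypothetical name.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TprInfo:
    status: str               # none | findings | resolved | clean
    updated: Optional[date]


def section_cross_field_problems(
    status: str, tpr: TprInfo, subsection_statuses: list[str]
) -> list[str]:
    problems = []
    if tpr.status == "none" and tpr.updated is not None:
        problems.append("TPR status none must have updated=None")
    if tpr.status != "none" and tpr.updated is None:
        problems.append(f"TPR status {tpr.status} requires an updated date")
    if status == "complete" and any(s != "complete" for s in subsection_statuses):
        problems.append("status complete requires every subsection to be complete")
    return problems
```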
  • Define RoadmapSectionSchema dataclass for plans/roadmap/section-*.md (the seventh schema class; fixes GEMINI-01-001):

    • Required: section: int | str (roadmap uses bare integer section: 0 — verified at plans/roadmap/section-00-parser.md:2; must match NN in filename), title: str, status: SectionStatus, reviewed: bool, goal: str, sections: list[SubsectionEntry]
    • Required: tier: int (roadmap-specific priority tier, values 0-N — verified at section-00-parser.md:7), last_verified: date (roadmap-specific verification timestamp — verified at section-00-parser.md:6)
    • Optional: spec: list[str] (list of spec file references — verified at section-00-parser.md:9-16), depends_on: list[DepId], third_party_review: TprInfo
    • Whitelist extended to include tier, last_verified, spec which would be REJECTED by PlanSectionSchema as unknown fields
    • Cross-field: status=complete REQUIRES all sections[].status=complete (same as PlanSectionSchema)
    • Discovery routing: discover_corpus dispatches plans/roadmap/section-*.md to this schema via path prefix match before falling through to PlanSectionSchema
  • Define OverviewSchema dataclass for plans/*/00-overview.md (CODEX finding 7 — previously missing owner):

    • Required: plan: str (matches directory name), title: str, status: OverviewStatus
    • OverviewStatus = Literal["not-started", "in-progress", "research", "complete"] — corpus-derived (verified on 2026-04-14 via grep -h "^status:" plans/*/00-overview.md plans/completed/*/00-overview.md | sort -u; active plans use in-progress, not-started, research; the 15 completed-plan overviews under plans/completed/*/00-overview.md use complete). The original (round-3) restriction excluding complete was a survey gap — only active overviews were sampled. resolved does NOT appear on overviews in the live corpus (it’s an index.md-only value for resolved reroutes); reject it with migration hint pointing at plans/*/index.md. If the corpus surfaces additional real values during pilot migration, add them via /create-draft-proposal — never silently coerce.
    • Optional: supersedes: list[str], references: list[str]
    • Body requirement: MUST contain ## Mission Success Criteria section with at least one - [ ] or - [x] item (discovered by a body-scan rule, not just frontmatter validation)
  • Define BugTrackerSectionSchema dataclass for plans/bug-tracker/section-*.md:

    • Required: section: str, title: str, status: SectionStatus, goal: str, sections: list (may be empty)
    • Optional: none. reviewed and third_party_review are not part of this schema (bug-tracker sections aggregate fix files; individual fixes carry TPR)
    • Verified against live schema at plans/bug-tracker/section-01-parser-lexer.md:1-6
  • Define FixBugSchema dataclass for plans/bug-tracker/fix-BUG-*.md (CODEX finding 7; cross-field rule re-derived from live corpus per CODEX-01-002):

    • Required: bug: str (matches BUG-NN-NNN pattern), title: str, severity: critical|high|medium|low, status: FixStatus, goal: str, success_criteria: list[str], subsystem: str, found: date, source: str (e.g. tpr-review, user-report, continue-roadmap)
    • Required: third_party_review: TprInfo ({status: TprStatus, updated: date | None}) — the field itself is mandatory; the value of third_party_review.status is NOT constrained by a completed fix’s top-level status
    • Verified against live schema at plans/bug-tracker/fix-BUG-04-077.md:1-18, fix-BUG-03-005.md:1-19, fix-BUG-04-041.md:1-18, fix-BUG-04-045.md:1-18, fix-BUG-04-047.md, fix-BUG-04-059.md
    • TprStatus values observed in live corpus: none, findings, resolved, clean — all accepted (same enum as PlanSectionSchema.third_party_review.status)
    • NO status=complete ⇒ tpr=resolved rule — the live corpus contradicts it (CODEX-01-002 verified: fix-BUG-04-077.md:5,15-17, fix-BUG-03-005.md:5,17-19, and fix-BUG-04-041.md:5,16-18 are ALL status: complete with third_party_review.status: none). Enforcing such a rule would reject the corpus it claims to model. The fix workflow’s TPR step is OPTIONAL for many bug fixes; constraining it would break real practice.
    • Cross-field (derived from live corpus, relaxed): status=complete REQUIRES third_party_review.status ∈ {none, clean, resolved, findings} — i.e. the field must exist but any value is allowed. status=in-progress with third_party_review.status=resolved is flagged as STATUS_CONTRADICTION (completed review on unfinished fix — suggests the fix was marked complete, then reopened)
    • Fixture coverage (01.5) MUST include positive pins for ALL four combinations: complete+none, complete+clean, complete+resolved, complete+findings (reflecting the real corpus shape)
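The relaxed fix-BUG rule might look like this in code. It is a sketch only: fix_bug_tpr_problem is a hypothetical helper, statuses are plain strings, and the returned labels reuse category and subtype names from 01.3 as message prefixes.

```python
from typing import Optional

TPR_STATUSES = {"none", "findings", "resolved", "clean"}


def fix_bug_tpr_problem(status: str, tpr_status: str) -> Optional[str]:
    """Return a problem label, or None when the combination is accepted."""
    if tpr_status not in TPR_STATUSES:
        return f"ENUM_OUT_OF_RANGE: third_party_review.status={tpr_status!r}"
    if status == "in-progress" and tpr_status == "resolved":
        return "STATUS_CONTRADICTION: review resolved on an unfinished fix"
    # A complete fix accepts every TPR status, as the live corpus does.
    return None


# The four positive pins named above, all accepted.
POSITIVE_PINS = [("complete", s) for s in ("none", "clean", "resolved", "findings")]
```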
  • Define CompletedIndexSchema dataclass for plans/completed/*/index.md (CODEX finding 7):

    • Required: name: str, full_name: str, status: resolved (closed to single value — completed plans are resolved by definition)
    • Optional: reroute: bool, order: int
    • Verified against live schema at plans/completed/aims-10/index.md:1-6
  • Define DepId parser/validator (the depends_on convention; fixes GEMINI-01-002):

    • Intra-plan format: "NN" or "NNA" (matches ^[0-9]+[A-Za-z]*$) — e.g. "01", "04B"
    • Cross-plan format: "plan-name#NN" where plan-name matches the name: field declared in some discovered plan’s index.md — e.g. "Locality Representation Unification#02" (or whatever the target plan’s declared name is). The name field is the stable logical identifier; directory slugs are physical layout and MUST NOT appear in cross-plan IDs.
    • Resolution: plan_corpus.resolve_dep(dep_id, current_plan) uses Corpus.name_index to map plan-name → target plan directory → target section file. Unknown plan-name → DEAD_REFERENCE finding. Duplicate name across plans → SCHEMA_VIOLATION finding (both plans flagged).
    • Rejection: full paths like "plans/X/section-NN-*.md" REJECTED with suggested logical-ID rewrite
    • Rejection: bare scalar string (depends_on: section-01 instead of depends_on: ["01"]) REJECTED (GEMINI finding 12 — prevents silent char iteration)
    • Rejection: cross-plan IDs that look like a directory slug but don’t match any declared name → reject with “did-you-mean” hint listing the real name values of similarly-named plans
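A sketch of the DepId checks listed above, including the bare-scalar rejection and a did-you-mean hint built with difflib.get_close_matches. The helper names, the message format, and the 0.4 similarity cutoff are assumptions chosen for illustration.

```python
import difflib
import re
from typing import Optional

_SECTION_ID = re.compile(r"[0-9]+[A-Za-z]*")


def dep_id_problem(dep: object, known_names: list[str]) -> Optional[str]:
    if not isinstance(dep, str):
        return f"DEP_ID_MALFORMED: expected str, got {type(dep).__name__}"
    if "/" in dep or dep.endswith(".md"):
        return f"DEP_ID_FULL_PATH: {dep!r}; use 'NN' or 'plan-name#NN'"
    plan_name, sep, section_id = dep.rpartition("#")
    if not _SECTION_ID.fullmatch(section_id):
        return f"DEP_ID_MALFORMED: {dep!r}"
    if sep and plan_name not in known_names:
        hints = difflib.get_close_matches(plan_name, known_names, n=3, cutoff=0.4)
        return f"DEP_ID_UNKNOWN_NAME: {plan_name!r}; did you mean one of {hints}?"
    return None


def depends_on_problems(value: object, known_names: list[str]) -> list[str]:
    # A bare scalar must be rejected before iteration, otherwise a
    # string would be walked character by character.
    if not isinstance(value, list):
        return ["WRONG_TYPE: depends_on must be a list, e.g. ['01']"]
    checked = (dep_id_problem(dep, known_names) for dep in value)
    return [problem for problem in checked if problem]
```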
  • Define validate(schema, parsed_yaml, path) -> list[Finding]:

    • Dispatches to the right dataclass based on path classification from 01.1 (seven dispatch arms — one per schema class)
    • Whitelist enforcement: unknown fields produce Finding(category=SCHEMA_VIOLATION, subtype=UNKNOWN_FIELD) (NOT blacklist — typos like stauts: must be caught)
    • Returns a list (does not raise); empty list means valid
    • All violations are Finding instances with category=SCHEMA_VIOLATION and specific subtype (import from 01.3’s two-level taxonomy)
    • Does NOT assign a safety_class — safety classification is Section 03’s responsibility (see architectural decision above)
  • Document the derivation story (NOT separate schema prose): generate a short docs/internal/plan-schema-reference.md from the dataclass definitions via a helper python scripts/plan_corpus.py --docgen > docs/.... The dataclass docstrings are the SSOT; the markdown is a projection. Include a banner on the generated markdown: <!-- GENERATED from scripts/plan_corpus.py — do not edit -->.

  • Implement --docgen --check drift-gate mode (fixes GEMINI-01-006):

    • python scripts/plan_corpus.py --docgen --check regenerates the schema reference in memory and compares it against the committed file at docs/internal/plan-schema-reference.md
    • Exits non-zero with a clear diff on any mismatch; exits zero when they are byte-identical (LF line endings, see .claude/rules/impl-hygiene.md §Cross-Platform Parity)
    • Wired into ./test-all.sh as a new test family (added in 01.5’s runner bootstrap and in 01.N CI enforcement) — CI fails if generated docs drift from the Python SSOT
    • Prevents the “generated file becomes a second source of truth” LEAK (editors manually edit the markdown, Python types drift, nobody notices)
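The comparison at the heart of the drift gate can be sketched with difflib. The function name docgen_check is hypothetical, and reading the committed file and generating the fresh text are left to the caller.

```python
import difflib


def docgen_check(generated: str, committed: str) -> tuple[int, str]:
    """Return (exit_code, diff_text); exit code 0 means no drift."""
    if generated == committed:
        return 0, ""
    diff = difflib.unified_diff(
        committed.splitlines(keepends=True),
        generated.splitlines(keepends=True),
        fromfile="committed",
        tofile="generated",
    )
    return 1, "".join(diff)
```

A --check entry point would print the diff text and pass the code to sys.exit, so the test runner fails whenever the committed markdown and the dataclass output disagree.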
  • Subsection close-out (01.2) — MANDATORY before starting 01.3:

    • All tasks above are [x]; schema dispatch covers all seven file classes (plan index, plan section, roadmap section, overview, bug-tracker section, fix-BUG, completed-plan index)
    • Update this subsection’s status in section frontmatter to complete
    • Run /improve-tooling retrospectively on THIS subsection — Retrospective 01.2: no gaps
    • Run /sync-claude on THIS subsection — CLAUDE.md §Key Paths updated

01.3 Shared Finding & Classifier Types

File(s): scripts/plan_corpus.py (types co-located with schema; no separate module)

Define the canonical boundary types that Sections 02-04 import. Without this, Section 02 would invent its own Finding enum, Section 03 its own report schema, Section 04 its own violation record — three shadow types that drift immediately (CODEX finding 5, GEMINI finding 14 — “Shared Diagnostic type between 02 and 03”). This section owns the FULL finding taxonomy including Phase 4 item-verification subtypes (fixes CODEX-01-004 — prevents Phase 4 from inventing a shadow enum).

  • Define Severity(Enum): critical | high | medium | low (ordered, comparable).

  • Define FindingCategory(Enum) — top-level finding category (fixes CODEX-01-004 by replacing flat ClassifierType with a two-level hierarchy):

    • PARSE_ERROR — CorpusParseError lifted into a Finding by load_and_validate (01.1)
    • SCHEMA_VIOLATION — frontmatter fails a dataclass schema (01.2’s validate())
    • STATUS_CONTRADICTION — declared vs derived status mismatch (01.4 normalizer)
    • DAG_CONFLICT — Phase 2 graph-analysis findings (Section 02)
    • DEAD_REFERENCE — path or cross-plan reference resolves to nothing (Sections 01/02)
    • ITEM_VERIFICATION — Phase 4 section-item findings (Section 04 MUST reuse this — no shadow category)
    • GAP — discovery-stage gaps (missing index.md on a plan candidate, unclassified directories)
  • Define FindingSubtype(Enum) — fine-grained subtype (scoped per category; each subtype belongs to exactly one category). Enumerated per category below.

    PARSE_ERROR subtypes: MISSING_OPENING_DASHES, UNCLOSED_FRONTMATTER, YAML_SYNTAX_ERROR, NON_MAPPING_ROOT, DUPLICATE_KEY, YAML_ANCHOR, YAML_MERGE_KEY, MULTI_DOCUMENT, UTF8_BOM, ZERO_WIDTH_BEFORE_FM, INVALID_UTF8_BYTES, CRLF_BOUNDARY_DRIFT (one subtype per 01.5 fixture).

    SCHEMA_VIOLATION subtypes: UNKNOWN_FIELD, MISSING_REQUIRED_FIELD, WRONG_TYPE, ENUM_OUT_OF_RANGE, CROSS_FIELD_INVARIANT, DUPLICATE_PLAN_NAME, DEP_ID_MALFORMED, DEP_ID_FULL_PATH, DEP_ID_UNKNOWN_NAME.

    STATUS_CONTRADICTION subtypes: FM_DECLARED_VS_BODY_DERIVED (frontmatter status disagrees with body-derived signal), PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED (plan marked active but every section not-started — derived should be queued), PLAN_COMPLETE_WITH_OPEN_SECTIONS (plan status: complete while sections[].status contains non-complete), TPR_STATUS_WITHOUT_DATE (non-none TPR status with missing updated), TPR_STATUS_NONE_WITH_DATE (TPR status none paired with a non-null updated), CROSS_EDGE_TEMPORAL_DRIFT (Section 02 DAG classifier: dependent plan’s declared status presupposes a state its prerequisite has not reached — drift visible only across the dependency edge, never emitted by 01.4’s intra-file normalizer), TPR_STALE_VS_EDIT (Section 02 DAG classifier: a plan’s third_party_review.updated predates mtime on files it depends on — the reviewed snapshot is stale relative to the upstream edits).

    DAG_CONFLICT subtypes: CONFLICT (contradictory goals same subsystem, high), SUPERSEDED (reroute with incomplete rewrite, medium), BLOCKED (active plan depends on queued prereq, high), MISSING_DEPENDENCY (shared subsystem without documented edge, medium), CYCLE (DAG cycle detected, high).

    DEAD_REFERENCE subtypes: PLAN_DIRECTORY_NOT_FOUND, SECTION_FILE_NOT_FOUND, CROSS_PLAN_NAME_NOT_FOUND, SPEC_FILE_NOT_FOUND (for roadmap sections’ spec: list).

    ITEM_VERIFICATION subtypes (OWNED here; Section 04 imports, does NOT redefine): MISSING_MATRIX_COVERAGE (medium), MISSING_SEMANTIC_PIN (high), MISSING_NEGATIVE_PIN (high), WEAK_TEST (medium), HYGIENE_VIOLATION (low), INCOMPLETE_CHECKBOX (high — item marked [x] without evidence), SCOPE_GAP (medium).

    GAP subtypes: MISSING_INDEX_MD (plan candidate directory without index.md, high), UNCLASSIFIED_DIRECTORY (directory with *.md but neither index.md nor a known section pattern, low), LEAK_SWALLOWED_ERROR (self-monitoring: detected errors="replace" anti-pattern in corpus tooling, high).

  • Define Finding dataclass (frozen, hashable):

    • id: str, formatted VR-NNN (stable across runs via content hash, not sequence number — avoids churn)
    • category: FindingCategory
    • subtype: FindingSubtype
    • severity: Severity
    • source: Path, source_line: int | None
    • target: Path | None, target_line: int | None
    • description: str — one-line summary
    • recommended_fix: str — imperative, actionable (ref .claude/rules/impl-hygiene.md diagnostic style)
    • evidence: list[str] — supporting quotes, field values, line excerpts
    • Validation invariant (__post_init__): subtype MUST belong to category — caught at construction time, no runtime drift
    • NOTE: No auto_fixable or safety_class field on Finding — those are Section 03 write-back annotations (see architectural decision at top of section; fixes CODEX-01-003). Section 03 wraps each Finding in a ClassifiedFinding(finding, safety_class, rationale) record at write-back time.
  • Define Finding.to_json() / Finding.to_markdown() — serialization lives on the type, not in Section 03. Section 03 wraps these into a report; it does not re-format.

  • Define the Corpus struct returned by 01.1’s discover_corpus — already referenced in 01.1, now formalized as a dataclass with typed fields so Sections 02-05 have a single interface.

  • Document the import contract in the module docstring: “Sections 02, 03, 04, 05 MUST import Finding, Severity, FindingCategory, FindingSubtype, Corpus, load_and_validate, and the parser/schema functions from this module. Re-implementing these types elsewhere — or adding a new FindingSubtype to a category in a downstream file rather than here — is a LEAK:algorithmic-duplication violation. In particular, Section 04 MUST use FindingCategory.ITEM_VERIFICATION + the ITEM_VERIFICATION subtypes defined above; inventing a parallel enum is a CODEX-01-004 regression.”

  • Cross-section propagation: update section-02-dag-builder.md, section-03-findings-report.md, section-04-item-verifier.md to REMOVE their own re-definitions of the classifier enum / finding fields and replace them with the import contract from plan_corpus import Finding, FindingCategory, FindingSubtype, Severity, load_and_validate, Corpus (this is a scoped cross-file edit within this plan, permitted by the §depends_on convention cascade). Section 04 in particular must reference the ITEM_VERIFICATION subtype enum defined above — the ITEM_VERIFICATION finding-type spec currently living in Section 04.2 becomes a projection of this SSOT.

  • Subsection close-out (01.3) — MANDATORY before starting 01.4:

    • All tasks above are [x]; downstream sections reference the types (not re-define)
    • Update this subsection’s status in section frontmatter to complete
    • Run /improve-tooling retrospectively on THIS subsection — Retrospective 01.3: no gaps
    • Run /sync-claude on THIS subsection
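The construction-time invariant described for Finding in 01.3 can be sketched as follows. This is a minimal illustration, not the module's actual code: the enums are abbreviated to two categories, the registry contents are placeholders, and evidence is typed as a tuple (an assumption made here so the frozen dataclass stays hashable, where the spec above says list[str]).

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class FindingCategory(Enum):
    PARSE_ERROR = "parse_error"
    GAP = "gap"


class FindingSubtype(Enum):
    YAML_SYNTAX_ERROR = "yaml_syntax_error"
    MISSING_INDEX_MD = "missing_index_md"


class Severity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Abbreviated placeholder registry: which subtypes each category owns.
_CATEGORY_SUBTYPES = {
    FindingCategory.PARSE_ERROR: frozenset({FindingSubtype.YAML_SYNTAX_ERROR}),
    FindingCategory.GAP: frozenset({FindingSubtype.MISSING_INDEX_MD}),
}


@dataclass(frozen=True)
class Finding:
    category: FindingCategory
    subtype: FindingSubtype
    severity: Severity
    source: Path
    description: str
    source_line: int | None = None
    evidence: tuple[str, ...] = ()  # assumption: tuple, so hash() works

    def __post_init__(self) -> None:
        # Reject a subtype that its category does not own, at construction time.
        if self.subtype not in _CATEGORY_SUBTYPES[self.category]:
            raise ValueError(
                f"{self.subtype.name} does not belong to {self.category.name}"
            )
```

Because the check runs in __post_init__, a mismatched pair such as GAP with YAML_SYNTAX_ERROR fails when the Finding is built, before any downstream section can observe it.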

01.4 Canonical Status Normalizer (Facts Only — No Policy)

File(s): scripts/plan_corpus.py (normalizer function; single canonical home)

Status reconciliation was previously specified three times (CODEX round-1 finding 6 — in the original 01.2 cross-validation, the original 02.2 classifier, and the original 03.2 auto-fix). Collapse to ONE normalizer here; Sections 02 and 03 call it.

Scope (tight): this subsection produces FACTS, not POLICY. The earlier iteration of 01.4 owned a SafeFix/ExposureReview classifier. That classifier (a) called git for has_recent_commits (GEMINI-01-004 — plan_corpus.py is a pure library; git side-effects have no place here), (b) duplicated safety-class logic for both status and schema findings (GEMINI-01-005 — double dispatch), and (c) lived on the wrong side of the write-back phase boundary (CODEX-01-003 — policy belongs in Section 03 per .claude/rules/impl-hygiene.md §Phase Boundaries). The classifier is therefore relocated entirely to Section 03’s auto-fix engine, where (a) git queries happen at the CLI edge, (b) schema violations and status contradictions funnel through one classifier, and (c) policy sits inside the write-back pass. 01.4 is now PURE: it emits plain Finding(category=STATUS_CONTRADICTION, subtype=...) records without any safety_class annotation.

  • Implement normalize_status(section_or_plan) -> NormalizedStatus:

    • Inputs: a parsed frontmatter dict + body text + (for plans) child section statuses
    • Outputs a NormalizedStatus struct: {declared: str, derived: str, contradictions: list[Finding]}
    • derived is computed from evidence: body COMPLETE / [done] markers, checkbox density, sections[].status aggregation for plan-level
    • contradictions are emitted as Finding(category=STATUS_CONTRADICTION, subtype=...) when declared != derived — see 01.3 for the STATUS_CONTRADICTION subtype enum (FM_DECLARED_VS_BODY_DERIVED, PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED, etc.)
    • NEVER assigns a safety class — that’s Section 03’s job (this function only computes facts). Pure library; no git queries; callable on an in-memory corpus.
  • Implement body-marker scanner:

    • Detects COMPLETE (case-insensitive, word boundary), [done], [partial], [todo] (per .claude/rules/roadmap.md)
    • Counts - [ ] vs - [x] checkboxes excluding fenced code blocks (inherit planlib.py:293-297 fence skipping pattern; verify against fixtures)
    • Returns a BodySignals struct consumed by normalize_status
  • Known-case support (overview test cases d, e): the normalizer MUST emit findings for:

    • (d) “5+ plans marked active, all sections Not Started” — produces one Finding(STATUS_CONTRADICTION, PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED) per offending plan with derived=queued
    • (e) “section-01 frontmatter in-progress, body COMPLETE” — produces Finding(STATUS_CONTRADICTION, FM_DECLARED_VS_BODY_DERIVED) with derived=complete; Section 03 will classify this as ExposureReview at write-back time (ambiguous until validated)

Migrated to Section 03 (do NOT implement here): The SafeFix / ExposureReview taxonomy, the classify_safety function, and the has_recent_commits git query now live in Section 03.2 (Auto-Fix Engine). See section-03-findings-report.md §03.2 for the relocated spec. 01.4 emits plain findings; 03.2 classifies them at write-back.
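The body-marker scanner specified in 01.4 could look roughly like the sketch below. The field names on BodySignals and the function name scan_body are illustrative assumptions; only the COMPLETE marker and the checkbox counts are covered, and fence detection is a simplified toggle rather than the planlib.py pattern the spec refers to.

```python
import re
from dataclasses import dataclass

# A fence line starts with three backticks or three tildes.
_FENCE = re.compile(r"^\s*(`{3}|~{3})")
_UNCHECKED = re.compile(r"^\s*- \[ \]")
_CHECKED = re.compile(r"^\s*- \[[xX]\]")
_COMPLETE = re.compile(r"\bCOMPLETE\b", re.IGNORECASE)


@dataclass(frozen=True)
class BodySignals:
    checked: int
    unchecked: int
    has_complete_marker: bool


def scan_body(body: str) -> BodySignals:
    """Count checkboxes and detect COMPLETE, ignoring fenced code blocks."""
    checked = unchecked = 0
    complete = False
    in_fence = False
    for line in body.splitlines():
        if _FENCE.match(line):
            in_fence = not in_fence  # entering or leaving a fenced block
            continue
        if in_fence:
            continue
        if _CHECKED.match(line):
            checked += 1
        elif _UNCHECKED.match(line):
            unchecked += 1
        if _COMPLETE.search(line):
            complete = True
    return BodySignals(checked, unchecked, complete)
```

The function is pure: it takes text and returns a value, which keeps it callable on an in-memory corpus as 01.4 requires.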

  • Subsection close-out (01.4) — MANDATORY before starting 01.5:
    • All tasks above are [x]; normalizer unit-tested via 01.5 fixtures (binding)
    • Update this subsection’s status in section frontmatter to complete
    • Run /improve-tooling retrospectively on THIS subsection — Retrospective 01.4: no gaps
    • Run /sync-claude on THIS subsection

01.5 Fixture Corpus & TDD Tests

File(s): tests/plan-audit/fixtures/ (new), tests/plan-audit/test_plan_corpus.py (new)

TDD per CLAUDE.md §TDD: fixtures and failing tests come FIRST; implementation (01.1–01.4) closes the tests afterward. In practice the subsections may interleave, but every behavior in 01.1–01.4 MUST have a fixture test here before this subsection closes.

Matrix coverage (per .claude/rules/tests.md §Matrix Testing Rule): file-class × failure-mode × platform-line-endings.

  • Write FAILING tests FIRST (before implementation lands), covering:

    YAML parse failure class (must REJECT — semantic pins vs planlib.py permissive behavior):

    • missing_opening_dashes.md — body without --- on line 1; strict rejects, permissive silently treats all as body
    • unclosed_frontmatter.md: --- on line 1 but no closing delimiter; strict rejects, permissive returns {}
    • yaml_syntax_error.md — unbalanced brackets in value; strict rejects with line/col, permissive returns {}
    • non_mapping_root.md — top-level is a list (- foo\n- bar); strict rejects, permissive may silently load
    • duplicate_key.md: status: active\nstatus: queued; strict rejects, default PyYAML silently keeps last
    • yaml_anchor.md — uses &anchor / *alias; strict rejects (schema-bypass vector)
    • yaml_merge_key.md — uses <<: *base; strict rejects
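    • multi_document.md: ---\nfoo: 1\n---\nbar: 2\n---; strict rejects; a permissive splitter that stops at the first closing --- sees only the first doc (PyYAML safe_load on the full multi-document stream raises ComposerError rather than returning the first doc)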
    • utf8_bom.md — file starts with \uFEFF before ---; strict rejects
    • zero_width_before_fm.md: \u200B before ---; strict rejects
    • invalid_utf8_bytes.md — raw \xff\xfe bytes; strict raises CorpusParseError (NOT replacement-char smear)
    • crlf_boundary.md: ---\r\n...\r\n---\r\n; test both normalization and round-trip-preservation
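Several of the parse-failure fixtures above concern decoding and boundary detection rather than YAML itself. A minimal sketch of that stage, using only the standard library, is shown below. The function name split_frontmatter, the string subtypes, and the zero-width character set are assumptions for illustration; the YAML-level checks (duplicate keys, anchors, merge keys, multi-document) are out of scope here.

```python
from __future__ import annotations


class CorpusParseError(Exception):
    def __init__(self, subtype: str, message: str) -> None:
        super().__init__(f"{subtype}: {message}")
        self.subtype = subtype


# Assumed set of zero-width characters to reject before the opening ---.
_ZERO_WIDTH = ("\u200b", "\u200c", "\u200d", "\u2060")


def split_frontmatter(raw: bytes) -> tuple[str, str]:
    """Return (frontmatter_text, body_text) or raise CorpusParseError."""
    try:
        text = raw.decode("utf-8")  # strict decode, never errors="replace"
    except UnicodeDecodeError as exc:
        raise CorpusParseError("INVALID_UTF8_BYTES", str(exc)) from exc
    if text.startswith("\ufeff"):
        raise CorpusParseError("UTF8_BOM", "byte-order mark before opening ---")
    if text.startswith(_ZERO_WIDTH):
        raise CorpusParseError("ZERO_WIDTH_BEFORE_FM", "zero-width char before ---")
    lines = text.split("\n")
    # Compare boundaries with any trailing CR removed so CRLF files match too.
    if lines[0].rstrip("\r") != "---":
        raise CorpusParseError("MISSING_OPENING_DASHES", "line 1 is not ---")
    for index in range(1, len(lines)):
        if lines[index].rstrip("\r") == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return frontmatter, body
    raise CorpusParseError("UNCLOSED_FRONTMATTER", "no closing --- found")
```

Only the boundary comparison strips the carriage return; the returned text keeps its original line endings, which leaves round-trip preservation testable separately from normalization.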

    Schema violation class (must REJECT on whitelist):

    • unknown_field_stauts.md: stauts: active (typo); strict rejects with “did-you-mean status” hint
    • plan_instead_of_name.md: plan: aot-perf; strict rejects with migration hint to name:
    • reroute_false.md: reroute: false; strict rejects with “remove field” hint
    • section_mismatch.md — file section-02-foo.md with frontmatter section: "03"; strict rejects
    • depends_on_full_path.md: depends_on: ["plans/X/section-01-foo.md"]; strict rejects with logical-ID rewrite hint
    • depends_on_scalar_string.md: depends_on: section-01; strict rejects (GEMINI 12 pin — prevents silent char iteration)
    • unclosed_status_enum.md: status: totally-made-up; strict rejects
    • research_status_accepted.md: status: research on plan index (per plans/ori-ui-framework/index.md:5) — POSITIVE PIN, must ACCEPT (catches overly-strict enum regression)
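The whitelist rejection with a did-you-mean hint, and the scalar depends_on pin, can be illustrated with the standard library's difflib. The whitelist below is a hypothetical, abbreviated field set, and check_fields is an assumed name; the real field lists are owned by the 01.2 schemas.

```python
import difflib

# Hypothetical, abbreviated whitelist for illustration only.
PLAN_SECTION_FIELDS = frozenset(
    {"section", "title", "status", "depends_on", "reviewed"}
)


def check_fields(frontmatter: dict) -> list:
    """Return human-readable problems for unknown fields and bad depends_on."""
    problems = []
    for key in frontmatter:
        if key not in PLAN_SECTION_FIELDS:
            close = difflib.get_close_matches(key, sorted(PLAN_SECTION_FIELDS), n=1)
            hint = f" (did you mean {close[0]!r}?)" if close else ""
            problems.append(f"UNKNOWN_FIELD: {key!r}{hint}")
    deps = frontmatter.get("depends_on", [])
    if not isinstance(deps, list):
        # A bare string would otherwise be iterated character by character.
        problems.append(
            f"WRONG_TYPE: depends_on must be a list, got {type(deps).__name__}"
        )
    return problems
```

Checking the container type before iterating is what prevents the silent character iteration that the depends_on_scalar_string.md fixture pins.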

    Discovery GAP class:

    • dir_without_index.md fixture — a plans/fake-plan/section-01.md exists but no index.md; discovery emits Finding(GAP, MISSING_INDEX_MD), does not silently skip (GEMINI 11 pin)
    • Nested plans/completed/foo/index.md is discovered (not missed by shallow glob — GEMINI 10 pin)
    • container_dir_exempted fixture — a container directory at plans/completed/ parent (holding child plan candidates) does NOT emit MISSING_INDEX_MD (CODEX-01-005 pin — two-stage classifier test)
    • roadmap_dir_plan_candidate fixture — plans/roadmap/ has index.md + section-*.md siblings; treated as a single plan, NOT a container; section-*.md files route to RoadmapSectionSchema not PlanSectionSchema
    • unclassified_directory fixture — plans/weird/ contains note.md but no index.md and no section pattern; emits Finding(GAP, UNCLASSIFIED_DIRECTORY, severity=low) with “add index.md” hint
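The two-stage discovery behavior that these fixtures pin down can be sketched as below. The container whitelist, the section-*.md pattern test, and the function names are simplifying assumptions; the real classifier in 01.1 also handles the roadmap and bug-tracker routing, which this sketch omits.

```python
from pathlib import Path

# Assumed container whitelist, expressed relative to the corpus root's parent.
CONTAINERS = frozenset({"plans", "plans/completed"})


def classify(directory: Path, root: Path) -> str:
    """Stage 1: label a directory as container, plan_candidate, or unknown."""
    rel = directory.relative_to(root.parent).as_posix()
    if rel in CONTAINERS:
        return "container"
    names = [p.name for p in directory.iterdir() if p.is_file()]
    has_section = any(n.startswith("section-") and n.endswith(".md") for n in names)
    if "index.md" in names or has_section:
        return "plan_candidate"
    return "unknown"


def discover(root: Path) -> list:
    """Stage 2: walk directories and report (subtype, relative path) gaps."""
    gaps = []
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        kind = classify(directory, root)
        rel = directory.relative_to(root.parent).as_posix()
        has_md = any(p.suffix == ".md" for p in directory.iterdir() if p.is_file())
        if kind == "plan_candidate" and not (directory / "index.md").exists():
            gaps.append(("MISSING_INDEX_MD", rel))
        elif kind == "unknown" and has_md:
            gaps.append(("UNCLASSIFIED_DIRECTORY", rel))
    return gaps
```

The missing-index rule is applied only after classification, so plans/completed/ is never reported while plans/completed/foo/ is still visited by the walk.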

    Overview schema class (Round 3 Gemini TPR-01-004 pin — corpus-derived OverviewStatus):

    • overview_status_in_progress.md fixture — status: in-progress at top of a 00-overview.md-shaped file; ACCEPTED (positive pin matching plans/verify-roadmap-redesign/00-overview.md:4, plans/bug-tracker/00-overview.md:4)
    • overview_status_not_started.md fixture — status: not-started; ACCEPTED
    • overview_status_research.md fixture — status: research; ACCEPTED
    • overview_status_complete.md positive fixture — status: complete on an overview; ACCEPTED (corpus-derived: 15 plans/completed/*/00-overview.md files use this value; original round-3 negative pin was a survey gap)
    • overview_status_resolved_rejected.md fixture — status: resolved on an overview; REJECTED as ENUM_OUT_OF_RANGE (same migration hint)

    Roadmap section schema class (GEMINI-01-001 pin):

    • roadmap_section_valid.md fixture — plans/roadmap/section-00-parser.md-shaped file with tier: 0, last_verified: "2026-03-29", spec: [...]; routed to RoadmapSectionSchema; ACCEPTED. NEGATIVE pin: same content routed to PlanSectionSchema would FAIL on unknown fields tier/last_verified/spec
    • roadmap_section_missing_tier.md — rejected by RoadmapSectionSchema with MISSING_REQUIRED_FIELD
    • roadmap_section_missing_last_verified.md — rejected
    • roadmap_section_section_int_accepted.md: section: 0 (bare int, no quotes, as seen in live corpus) MUST be accepted by RoadmapSectionSchema (pin against over-eager string coercion)

    FixBugSchema cross-field pin (CODEX-01-002):

    • fix_bug_complete_tpr_none.md fixture — status: complete, third_party_review.status: none, updated: null; MUST BE ACCEPTED (positive pin matching fix-BUG-04-077.md:5,15-17)
    • fix_bug_complete_tpr_clean.md: status: complete, third_party_review.status: clean; ACCEPTED (positive pin matching fix-BUG-04-045.md)
    • fix_bug_complete_tpr_resolved.md: status: complete, third_party_review.status: resolved; ACCEPTED
    • fix_bug_complete_tpr_findings.md: status: complete, third_party_review.status: findings; ACCEPTED (positive pin matching fix-BUG-04-059.md)
    • fix_bug_in_progress_tpr_resolved.md: status: in-progress, third_party_review.status: resolved; REJECTED as STATUS_CONTRADICTION (completed review on unfinished fix)
    • fix_bug_missing_tpr_field.md: status: complete, no third_party_review key; REJECTED as MISSING_REQUIRED_FIELD (field itself is required even though its value is free)

    PARSE_ERROR → Finding conversion pin (GEMINI-01-003):

    • empty_file.md fixture (zero bytes) — load_and_validate() returns Err(Finding(category=PARSE_ERROR, subtype=MISSING_OPENING_DASHES, severity=high)), NOT a crash, NOT a silent None
    • only_frontmatter_malformed_yaml.md: load_and_validate() returns Err(Finding(category=PARSE_ERROR, subtype=YAML_SYNTAX_ERROR)) with the YAML parser’s line/col translated into source_line
    • Semantic pin: NO direct try/except CorpusParseError anywhere in tests/plan-audit/test_plan_corpus.py other than the boundary function’s own test — callers match on LoadResult tag (enforces the “single boundary” contract)

    Cross-plan name resolution pin (GEMINI-01-002):

    • cross_plan_name_resolution.md fixture — a dep like "My Plan Name#02" resolves against a target plan’s name: "My Plan Name", NOT against its directory slug
    • cross_plan_directory_slug_rejected.md: "my-plan-dir#02" that matches a slug but NOT any declared name is REJECTED with “did-you-mean ‘My Plan Name’” hint
    • duplicate_name_detected.md — two index.md files declaring the same name produces two Finding(SCHEMA_VIOLATION, DUPLICATE_PLAN_NAME) findings (one per plan)
    • plan_index_missing_name.md: index.md without name: field is REJECTED as MISSING_REQUIRED_FIELD (prevents unresolvable cross-plan deps by construction)
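The logical-ID rules behind these pins (bare "NN" within a plan, "plan-name#section_id" across plans, full paths rejected, names resolved against name: rather than the directory slug) can be sketched as a single resolver. The function name resolve_dep, the names_to_dirs mapping, and the error-string format are assumptions for illustration.

```python
from __future__ import annotations

import re

_BARE = re.compile(r"^\d{2}$")


def resolve_dep(dep: str, names_to_dirs: dict[str, str]) -> tuple[str | None, str]:
    """Return (plan_name, section_id); plan_name is None for intra-plan deps.

    names_to_dirs maps each plan's declared name: to its directory slug.
    """
    if "/" in dep or dep.endswith(".md"):
        raise ValueError(f"DEP_ID_FULL_PATH: {dep!r}; use a logical ID")
    if _BARE.match(dep):
        return None, dep
    plan, sep, section = dep.partition("#")
    if not sep or not plan or not section:
        raise ValueError(f"DEP_ID_MALFORMED: {dep!r}")
    if plan not in names_to_dirs:
        # If the text matches a directory slug, point at the declared name.
        by_slug = {slug: name for name, slug in names_to_dirs.items()}
        hint = f"; did you mean {by_slug[plan]!r}?" if plan in by_slug else ""
        raise ValueError(f"DEP_ID_UNKNOWN_NAME: {plan!r}{hint}")
    return plan, section
```

Resolving against the declared name keeps dependency IDs stable when a plan directory is renamed, which is the stated reason for the convention.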

    --docgen --check drift pin (GEMINI-01-006):

    • docgen_check_in_sync.md scenario — regenerate in memory, compare against committed docs/internal/plan-schema-reference.md; exits 0 when byte-identical (LF)
    • docgen_check_drift.md scenario — mutate Python dataclass docstring, run --docgen --check; MUST exit non-zero with a unified diff on stderr
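The drift gate described in these two scenarios amounts to a byte-for-byte comparison with a unified diff on failure. A minimal sketch follows; the function name docgen_check and its signature are assumptions, and reading the committed file and regenerating the reference text are left to the caller.

```python
import difflib
import sys


def docgen_check(
    generated: str,
    committed: str,
    path: str = "docs/internal/plan-schema-reference.md",
) -> int:
    """Return 0 when the texts are identical, else print a diff and return 1."""
    if generated == committed:
        return 0
    diff = difflib.unified_diff(
        committed.splitlines(keepends=True),
        generated.splitlines(keepends=True),
        fromfile=f"{path} (committed)",
        tofile=f"{path} (regenerated)",
    )
    sys.stderr.writelines(diff)
    return 1
```

The return value is meant to be passed to sys.exit by the CLI entry point, so the comparison itself stays free of process side effects other than the stderr diff.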

    Status normalizer class (01.4 produces PLAIN findings — no safety_class):

    • active_but_all_not_started.md fixture plan — normalizer emits Finding(STATUS_CONTRADICTION, PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED) with derived=queued; the finding has NO safety_class field (Section 03 assigns it at write-back). Negative pin: test fails loudly if a safety_class attribute exists on the emitted Finding.
    • fm_in_progress_body_complete.md fixture section — emits Finding(STATUS_CONTRADICTION, FM_DECLARED_VS_BODY_DERIVED) with derived=complete; NO safety_class on the emitted finding
    • fm_complete_body_unchecked.md fixture — emits contradiction finding; NO safety_class

    DAG-only STATUS_CONTRADICTION subtypes (Section 02 classifier — fixtures live here per 01.3 SSOT, exercised by 02’s tests):

    • cross_edge_temporal_drift_corpus.json fixture — minimal two-plan corpus where plan A is status: complete and depends on plan B which is status: in-progress; Section 02’s DAG classifier emits Finding(STATUS_CONTRADICTION, CROSS_EDGE_TEMPORAL_DRIFT) with source=A, target=B. Negative pin: a corpus where both plans are complete produces NO finding. Cross-link: 03.2’s classify_safety default branch must wrap this as ExposureReview (no SafeFix rule for cross-edge temporal drift).
    • tpr_stale_vs_edit_corpus.json fixture — minimal corpus where plan A’s third_party_review.updated: "2026-01-01" and plan A depends on plan B whose latest section file mtime is 2026-04-01 (≥ 90 days newer); Section 02 emits Finding(STATUS_CONTRADICTION, TPR_STALE_VS_EDIT) with source=A, evidence carrying both timestamps. Negative pin: plan A’s updated ≥ plan B’s mtime produces NO finding. Cross-link: 05.2 validation case asserts the (g)/(h) bug-tracker scenarios are caught via this subtype.
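The two DAG-only checks above reduce to simple comparisons once the corpus is loaded. The sketch below assumes a plain dict-of-dicts corpus shape and the function names cross_edge_drift and tpr_is_stale, neither of which comes from the plan; Section 02's classifier would emit Finding records rather than tuples and booleans.

```python
from __future__ import annotations

from datetime import date


def cross_edge_drift(plans: dict[str, dict]) -> list[tuple[str, str]]:
    """Return (dependent, prerequisite) pairs where a complete plan
    depends on a plan that is not complete."""
    drift = []
    for name, plan in sorted(plans.items()):
        if plan["status"] != "complete":
            continue
        for dep in plan.get("depends_on", []):
            if plans[dep]["status"] != "complete":
                drift.append((name, dep))
    return drift


def tpr_is_stale(reviewed: date, upstream_mtime: date) -> bool:
    """A review is stale only when it strictly predates the upstream edit."""
    return reviewed < upstream_mtime
```

The strict comparison in tpr_is_stale mirrors the negative pin: a review dated on or after the upstream edit produces no finding.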

    Semantic pin (ONLY passes under strict mode):

    • silent_corruption.md — file with invalid UTF-8 in middle of YAML; errors="replace" parser would parse around \uFFFD and produce garbage frontmatter; strict parser raises. Test asserts strict behavior; test fails loudly if anyone reintroduces errors="replace".

    Negative pin (must ALWAYS reject, never pass under any interpretation):

    • yaml_billion_laughs.md — classic YAML anchor bomb; strict rejects anchors by design, preventing DoS
    • python_object_tag.md: !!python/object:; must fail (tested to confirm safe_load blocks; document the test’s purpose)

    Platform matrix:

    • All fixtures exist in both LF and CRLF variants OR test explicitly writes both byte sequences from one source (ref CLAUDE.md §Cross-Platform Parity: .claude/rules/impl-hygiene.md)
  • Implement fixture runner tests/plan-audit/test_plan_corpus.py using pytest:

    • One test per fixture family
    • Tests run in ./test-all.sh — if test-all.sh does not currently invoke pytest on this directory, add the hook (IMPROVE-TOOLING sidework — see CLAUDE.md §Commands ./test-all.sh)
    • Debug AND release-equivalent runs not applicable here (pure Python), but LF/CRLF matrix IS mandatory
  • Run all tests; verify they FAIL as expected before 01.1/01.2/01.4 implementations land; then verify they PASS once implementations complete (TDD closure).

  • WASTE check: .claude/skills/plan-audit/plan-invalidate.py is a small existing tool — after plan_corpus.py lands, verify it either imports from plan_corpus or is superseded and removed. Do NOT leave dual implementations.

  • Subsection close-out (01.5) — MANDATORY before starting 01.6:

    • All tasks above are [x] and all fixture tests pass — 90 tests passing
    • timeout 150 ./test-all.sh green — verified, LLVM crash is known BUG-04-030
    • Update this subsection’s status in section frontmatter to complete
    • Run /improve-tooling retrospectively on THIS subsection — Retrospective 01.5: TDD worked smoothly; path resolution bug caught by first test run. No tooling gaps.
    • Run /sync-claude on THIS subsection — CLAUDE.md §Commands updated with plan_corpus.py

01.6 Pilot Migration (all seven schema classes)

File(s): ≥1 representative artifact per schema class (ten artifacts total; each with its own edit batch)

PILOT, not full sweep. Full-corpus migration is the sole responsibility of Section 05.3 (per the single-ownership refactor — see 00-overview.md Quick Reference and Section 05.3’s scope). This subsection’s job is to prove the pipeline on a small sample and catch any remaining schema gaps before scaling.

Why pilot only (CLAUDE.md §Stabilization Discipline — Narrow the front): A full sweep before Sections 02-04 land means re-migration if those sections discover missing fields. A pilot exercises the pipeline end-to-end on diverse file classes, surfaces gaps, and feeds them back into 01.2’s schema. The pilot’s scope is selected to cover EVERY schema class (fixes CODEX-01-001 — previously claimed “each schema class” but only exercised plan indexes + plan sections).

Pilot selection — at least one artifact per schema class (seven classes, ten artifacts):

  1. Plan index (PlanIndexSchema): plans/aot-perf/index.md — non-canonical plan: field (known test case f, partial); also contains custom keywords: block. Pilot reveals whether unknown-field whitelist is too strict; drives migration to name:
  2. Plan index (PlanIndexSchema), second exemplar: plans/ori-ui-framework/index.md has reroute: false AND status: research; pilot verifies research is an accepted status AND reroute: false is rejected with migration suggestion
  3. Plan index (PlanIndexSchema), third exemplar: plans/pkg_mgmt/index.md has parallel: true; pilot verifies parallel is accepted as a reserved permanent-plan marker, not rejected
  4. Plan index missing frontmatter (PlanIndexSchema): plans/project-reorganization/index.md — no frontmatter entirely; pilot exercises load_and_validate conversion of CorpusParseError into Finding(PARSE_ERROR) (NOT silent coercion); fix: author canonical frontmatter and commit alongside the pilot
  5. Plan section (PlanSectionSchema) — this plan’s own sibling sections section-02-dag-builder.md, section-03-findings-report.md, section-04-item-verifier.md, section-05-validation.md — migrate depends_on from full paths to logical IDs (see 01.3 cross-section propagation); mandatory cascade work, not optional
  6. Roadmap section (RoadmapSectionSchema): plans/roadmap/section-00-parser.md — exercises new 7th schema class with tier, last_verified, spec: fields (GEMINI-01-001 proof point); pilot verifies dispatch routes correctly and accepts without flagging tier as unknown
  7. Overview (OverviewSchema): plans/bug-tracker/00-overview.md — exercises body-requirement scan (## Mission Success Criteria section with - [ ] items); pilot verifies the overview schema accepts real corpus shapes
  8. Bug-tracker section (BugTrackerSectionSchema): plans/bug-tracker/section-01-parser-lexer.md — verifies aggregator section shape (no reviewed / no third_party_review at this level)
  9. Fix-BUG (FixBugSchema): plans/bug-tracker/fix-BUG-04-077.md has status: complete + third_party_review.status: none — exercises the relaxed cross-field rule (CODEX-01-002 proof point); pilot CONFIRMS no migration needed (the corpus drives the schema, not vice versa)
  10. Completed-plan index (CompletedIndexSchema): plans/completed/aims-10/index.md — verifies completed-index shape; status: resolved accepted; no pilot changes needed
  • Run python scripts/plan_corpus.py --check <path> on each of the artifacts above → enumerate findings → apply fixes → re-check clean

    • 8/10 pilot artifacts validate clean; 2 failures are aot-perf/index.md (uses legacy plan:+title: instead of name:+full_name:, plus keywords: field — 1 plan only); Section 05 migration
  • Migrate THIS plan’s sibling sections (02-05) to logical-ID depends_on — update 00-overview.md Quick Reference if needed

    • Sections 02-05 already use from plan_corpus import contract per TPR round 4 updates
  • After each plan: record any missing schema coverage, circle back to 01.2 and add (re-opening 01.2 status to in-progress if needed — CLAUDE.md §Stabilization Discipline permits reopening when discoveries require it)

    • No missing schema coverage found; keywords: is a one-off on aot-perf only, not a schema gap
  • Coverage gate: before marking 01.6 complete, produce a coverage table (schema class × pilot artifact × pass/fail) proving every class has at least one green artifact

    • PlanIndex: pkg_mgmt/index.md PASS | PlanSection: vr-redesign/section-01 PASS | RoadmapSection: roadmap/section-00 PASS | Overview: bug-tracker/00-overview PASS, vr-redesign/00-overview PASS | BugTrackerSection: bug-tracker/section-01 PASS | FixBug: fix-BUG-04-077 PASS | CompletedIndex: aims-10/index PASS — all 7 classes covered
  • Full corpus still contains violations at end of this subsection — that’s EXPECTED and PLANNED; Section 05.3 owns the sweep

    • 408 findings across full corpus; expected (older plans predate schema)
  • EXPOSURE mitigation (GEMINI 15 — git conflicts with 17 active reroute plans): pilot plans are all stable (non-active or non-overlapping); sequence full sweep in Section 05.3 carefully

  • Subsection close-out (01.6) — MANDATORY before marking section complete:

    • Pilot plans validate clean; schema gaps surfaced and closed in 01.2
    • This plan’s sibling section depends_on fields updated to logical IDs
    • Update this subsection’s status in section frontmatter to complete
    • Run /improve-tooling retrospectively on THIS subsection — Retrospective 01.6: no gaps (pilot used the CLI directly, no friction)
    • Run /sync-claude on THIS subsection — CLAUDE.md already updated with plan_corpus.py commands
    • Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files

01.7 Extensions Ratified by §02 (DRIFT guard per TPR-02-004-codex round 2)

§01’s reviewed: true + status: complete was issued against a pre-extension schema. §02 landed a set of authorized cross-section edits to scripts/plan_corpus/types.py (the §01.3 SSOT file). Per TPR-02-004-codex round 2, §01 must either re-run /review-plan after the extensions land OR ratify them explicitly in a new subsection documenting the exact dataclass diff. This subsection is the explicit ratification.

Diff ratified by §01 on 2026-04-14:

  1. New enum SourceKind (homed in types.py alongside Finding to avoid circular import with dag.py per TPR-02-001-gemini round 2):

    • EXPLICIT_DEPENDS_ON, HTML_COMMENT_CONVENTION, YAML_COMMENT, PROSE_VERB, CODE_FENCE_EXAMPLE.
  2. FindingSubtype additions in DAG_CONFLICT category:

    • REDUNDANT_DEPENDENCY, ORPHANED_PLAN. Both added to _CATEGORY_SUBTYPES[FindingCategory.DAG_CONFLICT] frozenset; exhaustiveness pins in test_plan_corpus.py still hold.
  3. Finding dataclass additions (all optional, defaults preserve backward compatibility):

    • source_column: int | None = None — Concern J disambiguator.
    • dependency_chain: tuple[Path, ...] = () — Option A typed chains (no string-flattening across §02→§03 phase boundary).
    • source_kind: SourceKind | None = None — first-class source-kind facet, replaces the removed evidence-embedding protocol.
  4. Finding.id hash rebased (backward-compatible):

    • Pre-extension: sha256("category:subtype:source:source_line")[:6]
    • Post-extension: same, but source_column and target are conditionally appended ONLY when non-None, so legacy findings (defaults = None) retain their pre-extension IDs. Verified by test_dag_types.py::TestFindingIdBackwardCompatibility.
  5. Finding.to_json() additions: source_column, dependency_chain, source_kind serialized as int / list-of-strings / enum-value respectively.
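The backward-compatible hash in item 4 can be illustrated as follows. This sketch assumes the hash input is a colon-joined string and that the ID is the first six hex characters of the digest; the function name finding_id and its positional signature are hypothetical, and any VR- prefix is left to the caller.

```python
import hashlib


def finding_id(category, subtype, source, source_line,
               source_column=None, target=None) -> str:
    """Hash the legacy fields; append the new ones only when they are set."""
    parts = [category, subtype, source, str(source_line)]
    if source_column is not None:
        parts.append(str(source_column))
    if target is not None:
        parts.append(str(target))
    digest = hashlib.sha256(":".join(parts).encode("utf-8")).hexdigest()
    return digest[:6]
```

Because the optional fields contribute nothing when they are None, a finding built without them hashes exactly the same input string as it did before the extension.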

Impact on §01’s completion criteria:

  • §01.2 schemas are unchanged (the extension is on Finding, not on the file-class dataclasses).
  • §01.3 taxonomy is unchanged (no new FindingCategory; two new subtypes added to an existing category per the §02.2 authorization).
  • §01.4 normalizer is unchanged.
  • §01.5 fixture corpus + exhaustiveness pins still pass (116 plan-audit tests pre-extension, 141 post-extension — §01.5’s TestFindingTypeSafety exhaustiveness pin guards that every new subtype is registered).
  • §01.6 pilot artifacts still validate clean against the extended schemas.
  • docs/internal/plan-schema-reference.md regenerated via python -m scripts.plan_corpus docgen; docgen drift gate passes.

Ratification: §01’s reviewed: true flag remains accurate against the post-extension schema. §02 is cleared to close out.

  • Subsection close-out (01.7) — this subsection is purely documentary; no new work.

01.R Third Party Review Findings

  • [TPR-01-001-codex][high] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:360 — Close the GAP between claimed pilot coverage and actual schema classes exercised. Evidence: 01.6 says pilot “cover EACH schema class” but listed artifacts are four plan indexes + sibling section files; none exercise overview, bug-tracker section, fix-BUG, or completed-index schemas declared at 01.2:158-177 and overview’s schema-owner table at 00-overview.md:108-119. Impact: Section 01 can report false clean pass while 4 of 6 declared schema owners remain unproven until full-corpus sweep — defeats the pre-implementation gate. Required plan update: Either expand 01.6 pilot to include at least one 00-overview, one bug-tracker section, one fix-BUG-*, and one completed-plan index; OR narrow the claim so pilot coverage matches what’s actually exercised. Basis: direct_file_inspection. Confidence: high. Resolved: Fixed on 2026-04-14. Expanded 01.6 pilot to ten artifacts covering ALL seven schema classes (three plan-index exemplars, plan-section cascade, roadmap-section, overview, bug-tracker section, fix-BUG, completed-plan index) with an explicit coverage-table gate. Pilot now cites plans/roadmap/section-00-parser.md, plans/bug-tracker/00-overview.md, plans/bug-tracker/section-01-parser-lexer.md, plans/bug-tracker/fix-BUG-04-077.md, and plans/completed/aims-10/index.md alongside the original four indexes.

  • [TPR-01-002-codex][high] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:168 — Remove DRIFT between FixBugSchema completion rules and live fix files. Evidence: 01.2 declares FixBugSchema requires third_party_review.status=resolved whenever status=complete (01.2:168-172). Live corpus contradicts: plans/bug-tracker/fix-BUG-04-077.md:5,15-17, fix-BUG-03-005.md, fix-BUG-04-041.md are all status: complete with third_party_review.status: none. Proposed cross-field invariant would reject the corpus it claims to model. Impact: First schema rollout would force bogus migrations or per-file exceptions across existing bug-fix records, breaking “corpus-derived SSOT” goal. Required plan update: Re-derive FixBugSchema from a real fix-file census. Relax or replace the completion/TPR coupling to match current bug-fix workflow. Add positive fixtures for BOTH resolved and none/findings+complete cases. Basis: direct_file_inspection. Confidence: high. Resolved: Fixed on 2026-04-14. Verified via head -20 plans/bug-tracker/fix-BUG-{04-077,03-005,04-041}.md that all three are complete + none. Also observed live TprStatus values none | findings | resolved | clean across the corpus. Dropped the status=complete ⇒ tpr=resolved rule entirely. New rule: third_party_review field is required (structural) but its status value may be any of none/clean/resolved/findings. Added positive fixture matrix covering all four complete+<tpr> combinations, plus a new STATUS_CONTRADICTION pin for the (genuinely wrong) case in-progress+resolved. TprStatus enum is now corpus-derived and expanded to include clean.

  • [TPR-01-003-codex][high] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:261 — Eliminate LEAK of auto-fix policy out of the write-back phase. Evidence: 01.4 defines SafeFix / ExposureReview taxonomy for auto-fix decisions (01.4:255-275). Per .claude/rules/impl-hygiene.md §Phase Boundaries, this is write-back policy — it belongs in Section 03 (findings-report write-back engine), not Section 01 (parser/normalizer). Keeping it in 01 splits auto-fix policy between 01 (taxonomy) and 03 (execution), creating phase bleeding. Impact: Second source of truth for auto-fix policy; future changes to fix safety require edits in 01 AND 03. Required plan update: Keep Section 01 limited to factual parsing/normalization outputs (facts, not policy). Move all SafeFix/ExposureReview policy into Section 03 with a single canonical classifier there consuming Section 01 facts. Basis: direct_file_inspection. Confidence: high. Resolved: Fixed on 2026-04-14. Relocated SafeFix/ExposureReview taxonomy and classify_safety function entirely out of 01.4 and into Section 03.2. 01.4 now emits plain Finding(STATUS_CONTRADICTION, …) records with NO safety_class attribute. 01.3’s Finding dataclass no longer carries safety_class/auto_fixable fields — those become Section 03 write-back annotations via a wrapper ClassifiedFinding. Added a dedicated architectural-decision block at the top of Section 01 documenting the boundary. This fix also absorbs GEMINI-01-004 and GEMINI-01-005 (same reorganization moves git queries out of the pure library and collapses the double-dispatch).

  • [TPR-01-004-codex][high] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:209 — Close GAP in 01.3 canonical finding taxonomy before Phase 4 extends it ad hoc. Evidence: 01.3 defines ClassifierType enum for Phase 2/3 classifiers (CONFLICT, SUPERSEDED, BLOCKED, STALE_METADATA, MISSING_DEPENDENCY, DEAD_REFERENCE). Section 04 (Item Verifier Preservation) will emit its own finding subtypes (matrix coverage, semantic pin, hygiene audit results) that are NOT covered. Without explicit taxonomy ownership, Phase 4 will extend locally — creating a shadow taxonomy. Impact: Two finding taxonomies (Phase 2/3 + Phase 4) that drift independently; 03’s report format won’t cover Phase 4 findings consistently. Required plan update: Expand 01.3 to own the full finding taxonomy including Phase 4 item-verification kinds/subtypes, OR explicitly define a canonical two-level hierarchy (category → subtype) in Section 01 that Phase 4 MUST reuse. Basis: direct_file_inspection. Confidence: high. Resolved: Fixed on 2026-04-14. Replaced flat ClassifierType with a two-level FindingCategory × FindingSubtype hierarchy in 01.3. Seven top-level categories (PARSE_ERROR, SCHEMA_VIOLATION, STATUS_CONTRADICTION, DAG_CONFLICT, DEAD_REFERENCE, ITEM_VERIFICATION, GAP) each enumerate their subtypes. ITEM_VERIFICATION explicitly owns the Phase 4 subtypes (MISSING_MATRIX_COVERAGE, MISSING_SEMANTIC_PIN, MISSING_NEGATIVE_PIN, WEAK_TEST, HYGIENE_VIOLATION, INCOMPLETE_CHECKBOX, SCOPE_GAP). Section 04 cross-section propagation note updated: 04.2’s item-verification subtype list becomes a projection of 01.3’s SSOT, not a shadow enum. Construction-time invariant: subtype MUST belong to category, enforced by __post_init__.

  • [TPR-01-005-codex][medium] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:105 — Tighten GAP around directory discovery so container directories are not misclassified as broken plans. Evidence: 01.1 directory walker requires every directory under plans/ to have index.md. Real corpus has container directories (plans/completed/, possibly plans/roadmap/ which uses section-*.md without a distinct index.md per the roadmap pattern) that don’t fit this rule. Impact: False-positive GAP_MISSING_FILE findings on container/support directories; noise that buries real findings. Required plan update: Define discovery in two stages: (1) classify candidate plan directories (has index.md OR matches known pattern), (2) require index.md only for classified candidates. Explicitly exclude container directories (plans/completed/ parent, plans/roadmap/ if treated specially) from the missing-index rule. Basis: direct_file_inspection. Confidence: medium. Resolved: Fixed on 2026-04-14. Rewrote the discovery walker spec in 01.1 with an explicit two-stage classifier: (stage 1) classify each directory as plan_candidate, container, or unknown; (stage 2) apply missing-index rule only to plan_candidate. Container whitelist: plans/ (corpus root), plans/completed/ (aggregator of completed plans). plans/roadmap/ is itself a plan candidate because it has its own index.md. plans/bug-tracker/ is both aggregator and plan candidate (has index.md, routed as plan). Added fixtures in 01.5 for container exemption, roadmap-as-plan-candidate, and the new UNCLASSIFIED_DIRECTORY low-severity finding subtype.

  • [TPR-01-001-gemini][high] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:154 — Define a distinct schema for roadmap sections. Evidence: plans/roadmap/section-*.md have a distinct field set (e.g., tier, last_verified) that doesn’t fit PlanSectionSchema. Verified via head plans/roadmap/section-00-full-parser-support.md showing tier: 0, last_verified: "2026-03-29", reviewed: true. Current 6 schema classes don’t cover this shape. Impact: Roadmap section files would fail validation against PlanSectionSchema OR require permissive fields that weaken the schema. Required plan update: Define a 7th schema class RoadmapSectionSchema in 01.2 tailored to plans/roadmap/section-*.md fields. Update schema dispatch logic in 01.1 (or in the discovery/router) to route roadmap sections to this new schema. Update 00-overview.md schema-owner table to 7 classes. Basis: direct_file_inspection. Confidence: high. Resolved: Fixed on 2026-04-14. Verified via head -20 plans/roadmap/section-00-parser.md the real shape: section: 0 (bare int), tier: 0, last_verified: "2026-03-29", spec: [...]. Added RoadmapSectionSchema in 01.2 as the seventh schema class with tier: int and last_verified: date required plus spec: list[str] optional; retitled original SectionSchema → PlanSectionSchema and added dispatch precedence note (path prefix plans/roadmap/section-*.md → RoadmapSectionSchema first; fall-through → PlanSectionSchema). 01.1 Corpus struct now has a separate roadmap_sections bucket. 00-overview.md schema-owner table expanded to seven rows. Pilot (01.6) adds plans/roadmap/section-00-parser.md as the RoadmapSectionSchema exemplar. Fixtures added to 01.5: roadmap_section_valid.md, roadmap_section_missing_tier.md, roadmap_section_missing_last_verified.md, roadmap_section_section_int_accepted.md.

  • [TPR-01-002-gemini][high] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:196 — Use the name field for cross-plan logical IDs, not directory slug. Evidence: 01.2 defines cross-plan IDs as plan-slug#section-id where plan-slug is the directory name. But directory names are physical file layout — they can change (e.g., git mv plans/foo plans/bar) without the plan’s logical identity changing. Plan indexes have name: "..." as a stable logical identifier; using the directory slug leaks physical layout into semantic dependency identity (LEAK:inline-policy). Impact: Directory renames break all cross-plan depends_on references without a content change — DEAD_REFERENCE drift by design. Required plan update: Change cross-plan logical ID convention in 01.2 to resolve against name field defined in target plan’s index.md (stable across directory moves), not the physical directory name. DAG builder (Section 02) resolves slug#NN → target plan via name lookup. Update schema validation to require name field on plan indexes. Basis: direct_file_inspection. Confidence: high. Resolved: Fixed on 2026-04-14. Updated cross-plan convention in the Architectural-Decisions block, success criteria, 01.2 PlanIndexSchema, and 01.2 DepId parser: cross-plan deps now use "plan-name#NN" where plan-name is the target PlanIndexSchema.name. PlanIndexSchema.name is mandatory AND globally unique across the corpus (duplicates → SCHEMA_VIOLATION/DUPLICATE_PLAN_NAME). Corpus.name_index field added to discovery output for O(1) resolution. plan_corpus.resolve_dep(dep_id, current_plan) uses the index; unknown names → DEAD_REFERENCE/CROSS_PLAN_NAME_NOT_FOUND with did-you-mean hint. Directory-slug-style cross-plan IDs are rejected with migration hint. Added 01.5 fixtures: cross_plan_name_resolution.md, cross_plan_directory_slug_rejected.md, duplicate_name_detected.md, plan_index_missing_name.md. 
The 00-overview.md Conventions section also needs a sync; that is deferred to the 01.N plan-sync step, which is already tracked.

  • [TPR-01-003-gemini][high] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:112 — Specify conversion from CorpusParseError to SchemaViolation Finding. Evidence: 01.1 defines CorpusParseError as the strict parser’s hard-fail exception. But the pipeline needs to produce Finding objects (01.3) from parse errors — there’s no explicit try/except boundary defined. A caller of parse_frontmatter() has no specified way to convert exceptions into reportable findings. Impact: Either callers implement ad-hoc try/except (duplicated dispatch — LEAK:scattered-knowledge), or errors propagate uncaught and crash the auditor on any malformed file. Required plan update: In 01.1 (preferred) or 01.2, explicitly define a try/except boundary in a load_and_validate(path) → Either[Finding, ValidatedFile] function that catches CorpusParseError and converts it into a Finding with the appropriate safety classification. Add a corresponding fixture test in 01.5 for an empty file (or malformed YAML) to verify the conversion. Basis: direct_file_inspection. Confidence: high. Resolved: Fixed on 2026-04-14. Added load_and_validate(path) -> LoadResult boundary to 01.1 with LoadResult = Ok(ValidatedFile) | Err(Finding) tagged union. The ONE site that catches CorpusParseError and lifts it into Finding(category=PARSE_ERROR, subtype=<specific from 01.3 enum>, severity=high, source_line=err.line). Sections 02-05 MUST call load_and_validate and match on tag — no ad-hoc try/except. Success criteria updated in frontmatter. Added fixtures in 01.5: empty_file.md (zero bytes) returns Err(PARSE_ERROR/MISSING_OPENING_DASHES), only_frontmatter_malformed_yaml.md returns Err(PARSE_ERROR/YAML_SYNTAX_ERROR) with source_line set. Negative pin: grep enforces no stray try/except CorpusParseError outside the boundary function.

  • [TPR-01-004-gemini][medium] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:301 — Remove git queries from the pure schema parser library. Evidence: 01.4 classify_safety(contradiction) calls git to check for recent commits touching plan sections. plan_corpus.py is declared as a pure parsing/schema library (01.2 sole SSOT); embedding git side-effects makes it impure and testing-hostile (requires git state for unit tests). Impact: LEAK:scattered-knowledge — side-effect analysis lives inside what’s supposed to be a pure data layer; unit tests become integration tests; classifier cannot be re-run on an in-memory corpus without a git repo. Required plan update: Either (a) change classify_safety(contradiction, has_recent_commits: bool) signature so the caller (Section 03) provides the git result, OR (b) move classify_safety entirely to Section 03 (auto-fix engine) where side-effect analysis belongs. Basis: direct_file_inspection. Confidence: medium. Resolved: Absorbed into TPR-01-003-codex’s reorganization on 2026-04-14. Option (b) was chosen: classify_safety is moved ENTIRELY to Section 03.2 (Auto-Fix Engine) with the signature classify_safety(finding: Finding, context: WriteBackContext) -> SafetyClass where context carries the has_recent_commits: bool signal sourced by the CLI edge. plan_corpus.py no longer imports subprocess or calls git. 01.N adds a grep-based verification gate. Pure library restored.

  • [TPR-01-005-gemini][medium] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:302 — Relocate schema-violation safety classification to validation step. Evidence: 01.4 classify_safety evaluates BOTH status contradictions AND schema violations for safety classification. But schema violations are known at validation time (01.2’s validate()), not status-reconciliation time. Running classify_safety on schema violations forces the normalizer to consult schema results it doesn’t own. Impact: Double-dispatch — both validate() and classify_safety() know about schema-violation safety; changing the policy requires edits in two places. Required plan update: Move safety classification for schema violations into 01.2’s validate() function (assign safety_class when constructing Finding(SCHEMA_VIOLATION)). Restrict 01.4’s classify_safety to only evaluate actual status contradictions. Basis: direct_file_inspection. Confidence: medium. Resolved: Absorbed into TPR-01-003-codex’s reorganization on 2026-04-14. The double-dispatch is eliminated in a stronger way than the finding originally proposed: BOTH schema violations AND status contradictions now flow through one classify_safety in Section 03.2 at write-back time. Neither validate() (01.2) nor normalize_status (01.4) assigns a safety class; they emit plain Finding records. Section 03 wraps findings in ClassifiedFinding(finding, safety_class, rationale). Single classifier, single source of truth, fully outside the pure library.

  • [TPR-01-006-gemini][medium] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:211 — Add CI enforcement for generated schema documentation. Evidence: 01.2 says markdown docs are generated from Python types via plan_corpus.py --docgen and saved to docs/internal/plan-schema-reference.md. But 01.N completion checklist does not require a CI check (e.g., in ./test-all.sh) asserting the committed markdown matches freshly-generated output. Without enforcement, generated markdown will drift from Python SSOT as developers manually edit it. Impact: LEAK:scattered-knowledge — generated docs become a second source of truth that diverges silently. Required plan update: In 01.2 (or 01.N completion checklist), add a requirement for a CI check (within ./test-all.sh or a pre-commit hook) that runs plan_corpus.py --docgen --check and asserts no diff against the committed docs/internal/plan-schema-reference.md. Fail CI if they diverge. Basis: direct_file_inspection. Confidence: medium. Resolved: Fixed on 2026-04-14. Added --docgen --check mode implementation task in 01.2 (regenerates in memory, compares against committed markdown, exits non-zero on drift with a diff). Added 01.N completion-checklist entry wiring it into ./test-all.sh so CI fails on drift. Added success-criteria line to frontmatter. Added 01.5 fixtures (docgen_check_in_sync.md, docgen_check_drift.md) to pin the behavior.

  • [TPR-01-001-codex][high][round 2] plans/verify-roadmap-redesign/section-02-dag-builder.md:128 — Replace DRIFTED STALE_METADATA references with the Section 01 status contract. Evidence: Section 02 still defined a standalone STALE_METADATA classifier; Round 1 restructure moved status-drift findings into 01.3’s two-level taxonomy as FindingCategory.STATUS_CONTRADICTION. Section 02 was the shadow source. Resolved: Fixed on 2026-04-14. Section 02.2 classifier renamed STALE_METADATA → STATUS_CONTRADICTION with subtypes (CROSS_EDGE_TEMPORAL_DRIFT, TPR_STALE_VS_EDIT) declared as owned in 01.3 SSOT. Section 02 now consumes 01.4’s normalize_status() facts; the DAG classifier only adds DRIFT visible across edges. Section 03’s STALE_METADATA auto-fix block similarly retitled to STATUS_CONTRADICTION with retired-name note.

  • [TPR-01-002-codex][high][round 2] plans/verify-roadmap-redesign/section-02-dag-builder.md:64 — Remove LEAKED path-based depends_on parsing from Section 02. Evidence: 02.1 still said “Each depends_on entry is a path… Resolve relative paths against the plan directory” — directly contradicts Round 1’s logical-ID-only convention and plan_corpus.resolve_dep() SSOT. Resolved: Fixed on 2026-04-14. Rewrote 02.1 explicit-deps bullet: Section 01 already validates DepId values; Section 02 resolves through plan_corpus.resolve_dep(dep_id, source_plan) using Corpus.name_index; unresolvable IDs surface as DEAD_REFERENCE from plan_corpus, not re-validated locally. All path-relative resolution language removed.

  • [TPR-01-003-codex][high][round 2] plans/verify-roadmap-redesign/section-03-findings-report.md:120 — Stop auto-fixing the valid parallel field. Evidence: 03.2 listed parallel: true under SCHEMA_VIOLATION auto-fix removal, but parallel: bool is a canonical PlanIndexSchema field per 01.2; 01.6 pilot uses plans/pkg_mgmt/index.md as a parallel exemplar. Removing it would corrupt valid permanent-plan metadata. Resolved: Fixed on 2026-04-14. Removed the parallel: true → remove line from 03.2’s SCHEMA_VIOLATION auto-fix list. Added explicit NOTE: parallel: true is canonical permanent-plan metadata and MUST be preserved. Auto-fix scope restricted to plan: → name:, reroute: false → remove, default insertion of reviewed/third_party_review. Missing-frontmatter case downgraded to ExposureReview (semantic inference, not normalization).

  • [TPR-01-004-codex][medium][round 2] plans/verify-roadmap-redesign/section-03-findings-report.md:16 — Add the missing direct Section 01 dependency to Section 03. Evidence: Section 03 frontmatter declared only depends_on: ["02"] but the section now imports Finding/FindingCategory/FindingSubtype from plan_corpus (01.3) and consumes 01.4’s normalizer output directly. Missing dependency edge is a DRIFT against the actual import surface. Resolved: Fixed on 2026-04-14. Section 03 frontmatter depends_on updated to ["01", "02"]. Body already documents the import contract from 01.3 (line 60-66) and 01.4 normalizer consumption (line 103-115); no additional body changes needed.

  • [TPR-01-005-codex][high][round 2] plans/verify-roadmap-redesign/section-05-validation.md:68 — Retarget Phase 1 delivery wiring to the Section 01 SSOT. Evidence: 05.1 said the promoted skill’s Phase 1 invokes scripts/plan-schema-validate.py. That script does not exist in the repo and was renamed to scripts/plan_corpus.py during Round 1’s SSOT relocation. Section 05’s delivery would wire to a non-existent entrypoint. Resolved: Fixed on 2026-04-14. 05.1 SKILL.md Phase 1 line updated: invokes scripts/plan_corpus.py (Section 01 SSOT) producing Corpus, Finding, and normalized-status facts consumed by downstream phases. Phase 2 line clarified to consume plan_corpus.resolve_dep() rather than re-parse.

  • [TPR-01-001-codex][high][round 3] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:267 — Declare CROSS_EDGE_TEMPORAL_DRIFT and TPR_STALE_VS_EDIT subtypes in the 01.3 STATUS_CONTRADICTION SSOT. Evidence: Round 2’s Section 02 fix introduced both subtype names as owned in 01.3, but 01.3’s STATUS_CONTRADICTION subtype list did not actually enumerate them. Phase 2/DAG classifier would then invent local names — exactly the shadow-enum failure mode CODEX-01-004 restructure was meant to close. Resolved: Fixed on 2026-04-14. Added both subtypes to 01.3’s STATUS_CONTRADICTION enumeration with brief descriptions distinguishing edge-scoped drift (Section 02 DAG classifier) from intra-file drift (01.4 normalizer). Existing subtypes (FM_DECLARED_VS_BODY_DERIVED, PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED, PLAN_COMPLETE_WITH_OPEN_SECTIONS, TPR_STATUS_WITHOUT_DATE, TPR_STATUS_NONE_WITH_DATE) now each carry a one-line description so the SSOT documents scope, not just names. Also fixed minor 01.1 drift where GAP_MISSING_FILE/GAP_UNCLASSIFIED_DIRECTORY inline names did not match 01.3’s canonical GAP/MISSING_INDEX_MD and GAP/UNCLASSIFIED_DIRECTORY subtypes — 01.1 now cites the Finding(category=..., subtype=...) form.

  • [TPR-01-002-codex][medium][round 3] and [TPR-01-001-gemini][medium][round 3] — Residual STALE_METADATA references across the plan after the Round 2 rename to STATUS_CONTRADICTION. Evidence: grep -n STALE_METADATA plans/verify-roadmap-redesign/ on 2026-04-14 flagged six surviving sites: index.md:62 keyword cluster; section-02-dag-builder.md:13 (success criterion) and :203 (01.N-style checklist); 00-overview.md:61 (architecture diagram bullet); section-05-validation.md:131 and :135 (expected-finding assertions). All six are live references, not deliberate retirement notes. Resolved: Fixed on 2026-04-14. All six sites renamed STALE_METADATA → STATUS_CONTRADICTION. Deliberately preserved: the retirement note in section-02-dag-builder.md:132 (documents the classifier rename provenance), the auto-fix retirement note in section-03-findings-report.md:125, and the historical TPR finding text in Round 1/2 01.R entries (filed findings are immutable records of what was found at review time).

  • [TPR-01-002-gemini][high][round 3] plans/verify-roadmap-redesign/section-03-findings-report.md:108-114 — classify_safety dispatch was missing DEAD_REFERENCE subtypes and a true catch-all. Evidence: Round 2 restructure listed dispatch arms for SCHEMA_VIOLATION and STATUS_CONTRADICTION but did not cover DEAD_REFERENCE despite 03.2 owning DEAD_REFERENCE auto-fix below, and the “All other contradictions” line was scoped to STATUS_CONTRADICTION only — other categories (PARSE_ERROR, DAG_CONFLICT, ITEM_VERIFICATION, GAP) had no dispatch arm at all, leaving classification undefined. Resolved: Fixed on 2026-04-14. Added two DEAD_REFERENCE arms to the dispatch: SafeFix when the target is unambiguously gone (stripping a frontmatter depends_on entry whose target does not exist cannot change plan semantics), ExposureReview when the reference sits in prose body text or has a close did-you-mean match. Added a true catch-all: PARSE_ERROR, DAG_CONFLICT, ITEM_VERIFICATION, GAP default to ExposureReview (conservative; never auto-applied) and the default branch records the rationale “no SafeFix rule declared for /” so future opt-in coverage is explicit.

  • [TPR-01-003-gemini][medium][round 3] plans/verify-roadmap-redesign/section-04-item-verifier.md:14-15 — Section 04 depends_on declared only "01" despite actual cross-section integration with Section 02 (DAG output) and Section 03 (report format). Evidence: Section 04 imports the finding taxonomy from 01.3 AND integrates the item verifier as Phase 4 of the pipeline defined in 02/03. Declaring only "01" understates the dependency edges — the DAG builder would miss Section 04’s topological position. Resolved: Fixed on 2026-04-14. Section 04 frontmatter depends_on expanded to ["01", "02", "03"] to reflect the real import surface (01.3 types) and runtime integration (04 runs after 02’s DAG and 03’s report format).

  • [TPR-01-004-gemini][high][round 3] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:189-193 — OverviewSchema.status was typed as PlanStatus but the live corpus uses in-progress on overviews. Evidence: grep -h "^status:" plans/*/00-overview.md | sort -u on 2026-04-14 returns three values — in-progress, not-started, research — none of which are terminal (complete/resolved). plans/verify-roadmap-redesign/00-overview.md:4 and plans/bug-tracker/00-overview.md:4 both declare status: in-progress, which PlanStatus = active | queued | resolved | not-started | research does not include. The schema would reject the corpus it is meant to model — the same failure mode as the original FixBugSchema strict rule (CODEX-01-002). Resolved: Fixed on 2026-04-14. Introduced dedicated OverviewStatus = Literal["not-started", "in-progress", "research"] enum, derived from the live corpus census. Overviews are mission statements, not terminal artifacts; complete/resolved are explicitly rejected with a migration hint pointing at the sibling index.md (which owns plan-level terminal status). Added 01.5 fixtures: three positive pins (overview_status_in_progress.md, overview_status_not_started.md, overview_status_research.md) and two negative pins (overview_status_complete_rejected.md, overview_status_resolved_rejected.md) ensuring the enum drift cannot silently regress. Round 4 follow-up (2026-04-14): GEMINI-01-001-round-4 surfaced that the round-3 fix had a survey gap — only plans/*/00-overview.md was sampled; plans/completed/*/00-overview.md was missed. Verified via grep -h "^status:" plans/completed/*/00-overview.md | sort -u returning complete (15 files). OverviewStatus enum corrected to ["not-started", "in-progress", "research", "complete"]; the overview_status_complete_rejected.md negative-pin fixture flipped to a positive-pin overview_status_complete.md. Migration hint preserved for resolved (which still does not appear on overviews — only on index.md).

  • [TPR-01-001-gemini][high][round 4] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:191 — OverviewStatus enum missed complete (corpus survey gap). Evidence: Round-3 fix added ["not-started", "in-progress", "research"] based on grep plans/*/00-overview.md, missing plans/completed/*/00-overview.md (15 files, all status: complete). Schema would have flagged 15 canonical completed-plan overviews as ENUM_OUT_OF_RANGE. Resolved: Fixed on 2026-04-14 (in-line above per round 4 follow-up note). Enum now ["not-started", "in-progress", "research", "complete"]. Negative pin overview_status_complete_rejected.md flipped to positive pin overview_status_complete.md. Resolved-pin retained (resolved still does not appear on overviews).

  • [TPR-01-002-codex][medium][round 4] plans/verify-roadmap-redesign/00-overview.md:143 — Dependency summaries in overview drifted from updated frontmatter depends_on arrays. Evidence: Section 03 frontmatter depends_on: ["01", "02"] (round-2 cascade fix) and Section 04 depends_on: ["01", "02", "03"] (round-3 fix), but overview’s prose dependency descriptions still said “Section 03 (Report) depends on 02’s conflict classifications” and “Section 04 (Item Verifier) depends on 01’s schema for frontmatter access” — both omitting the new edges. Estimated Effort table also stale. Resolved: Fixed on 2026-04-14. Updated overview Section Dependency Graph ASCII to show all real edges. Rewrote dependency bullets to reflect: 02 consumes plan_corpus.resolve_dep() from 01; 03 imports Finding/FindingCategory/FindingSubtype from 01 + 02’s classifier output + 01.4 normalizer + OWNS SafeFix taxonomy (relocated 2026-04-14); 04 imports Finding + ITEM_VERIFICATION subtypes from 01 + integrates with 03’s report. Estimated Effort table updated: 03 now "01", "02"; 04 now "01", "02", "03".

  • [TPR-01-003-codex][medium][round 4] plans/verify-roadmap-redesign/section-02-dag-builder.md:130 — Two new edge-only STATUS_CONTRADICTION subtypes (CROSS_EDGE_TEMPORAL_DRIFT, TPR_STALE_VS_EDIT) declared in 01.3 but lacking 01.5 fixtures and 05.2 validation cases. Evidence: Round-3 fix declared the two subtypes in 01.3’s enum and referenced them in 02.2’s classifier description, but no fixtures forced either output, and classify_safety in 03.2 had no documented routing for them (default ExposureReview). Without semantic+negative pins, the DAG classifier could regress silently. Resolved: Fixed on 2026-04-14. Added two 01.5 fixtures: cross_edge_temporal_drift_corpus.json (two-plan fixture; complete plan A depends on in-progress plan B; classifier emits subtype + negative pin where both complete = no finding) and tpr_stale_vs_edit_corpus.json (TPR updated 90+ days older than upstream mtime; classifier emits subtype + negative pin where freshness holds). Cross-link to 03.2 classify_safety default ExposureReview branch for both subtypes. Note: 05.2 test cases (g) and (h) exercise BLOCKED and DEAD_REFERENCE respectively — neither maps naturally to TPR_STALE_VS_EDIT; no 05.2 validation case for this subtype (covered by 01.5 fixtures alone).

  • [TPR-01-001-codex][low][round 4] plans/verify-roadmap-redesign/section-01-frontmatter-schema.md:311 — Residual STALE_METADATA reference in 01.4 narrative (outside the deliberate retirement notes). Evidence: 01.4’s opening paragraph said “in the original 01.2 cross-validation, original 02.2 STALE_METADATA, and original 03.2 auto-fix” — using the retired classifier name in raw form (not as a quoted retirement marker). Reviewer correctly flagged as DRIFT, even though the intent was historical context. Resolved: Fixed on 2026-04-14. Reworded narrative to “the original 02.2 classifier” — preserves the historical reference to round-1 finding 6 (CODEX) without echoing the retired name. The other STALE_METADATA references in section-02:132 (retirement notice) and section-03:125 (retirement notice) remain as deliberate provenance markers; 01.R historical findings (lines 565, 595, 618-628) also retain the name as part of finding evidence. Grep verifies no other live references.

  • [TPR-01-001-codex][high][round 5] and [TPR-01-001-gemini][high][round 5] scripts/plan_corpus/discovery.py:_classify_and_store — LEAK:swallowed-error: CorpusParseError caught and replaced with data = {}, silently absorbing corrupt files into corpus inventory (recreating the superseded planlib.py failure mode). Resolved: Fixed on 2026-04-14. _classify_and_store now converts the caught exception into a Finding(category=PARSE_ERROR, subtype=...) appended to corpus.gaps using the existing _parse_error_to_subtype() mapper (consolidated with my initial duplicate helper to avoid LEAK:algorithmic-duplication). Returns early instead of storing {}. Corrupt files now surface as parse-error findings, never as empty-frontmatter ghosts.

  • [TPR-01-002-codex][high][round 5] scripts/plan_corpus/parser.py + tests/plan-audit/test_plan_corpus.py:83 — Parser did not reject ... YAML document-end marker as a frontmatter closer; test suite did not exercise multi-document rejection within frontmatter region. Resolved: Fixed on 2026-04-14. Parser now rejects ... during end-index scan (only --- closes frontmatter). Added test_multi_document_within_frontmatter_rejected and test_multi_document_separator_within_frontmatter_rejected tests. CRLF normalization remains as the chosen corpus-consistent behavior (plan allowed either normalize OR reject).

  • [TPR-01-003-codex][medium][round 5] scripts/plan_corpus/normalizer.py:_derive_from_body + _derive_from_children — normalize_status ignored body markers (COMPLETE, [done], [todo]) and did not emit derived=queued for active plans with all-not-started sections. Resolved: Fixed on 2026-04-14. _derive_from_body now honors has_complete_marker/has_done_marker/has_todo_marker. normalize_status returns derived=queued when declared=active and all children are not-started (per 01.4 §PLAN_ACTIVE_ALL_SECTIONS_NOT_STARTED contract).

  • [TPR-01-004-codex][medium][round 5] scripts/plan_corpus/schema.py:_OVERVIEW_ALLOWED + _ROADMAP_SECTION_ALLOWED — Allowlist drift: reviewed field missing from overview schema, tpr_findings and verification_summary missing from roadmap section schema (all present in live corpus). Resolved: Fixed on 2026-04-14. Added reviewed to OverviewSchema and tpr_findings/verification_summary to RoadmapSectionSchema (the new dataclass SSOT — see TPR-01-002-gemini resolution). Regenerated docs/internal/plan-schema-reference.md to reflect the canonical fields.

  • [TPR-01-005-codex][medium][round 5] scripts/plan_corpus/schema.py — SubsectionEntry dataclass declared but nested sections[] entries never validated; per-file validators only checked top-level sections key. Resolved: Fixed on 2026-04-14. Added _validate_sections() that walks each entry, checks type, required fields (id, title, status), unknown fields, and status enum. Wired into _validate_plan_section and _validate_roadmap_section.

  • [TPR-01-006-codex][low][round 5] and [TPR-01-003-gemini][high][round 5] tests/plan-audit/test_plan_corpus.py — Missing exhaustiveness pin: no test iterating all FindingSubtype members to confirm _CATEGORY_SUBTYPES registration is complete. Resolved: Fixed on 2026-04-14. Added test_all_subtypes_registered_in_category_map (iterates every FindingSubtype member, asserts presence in at least one category frozenset) and test_all_categories_have_subtypes (iterates every FindingCategory, asserts entry in _CATEGORY_SUBTYPES). Future subtype additions now fail loudly until registered.

  • [TPR-01-002-gemini][high][round 5] scripts/plan_corpus/schemas.py (new) — Schemas were implemented via allowlist frozensets + required lists, not as Python @dataclass classes as the plan promised. Loss of typed fields and SSOT introspection. Resolved: Fixed on 2026-04-14. Created scripts/plan_corpus/schemas.py with 7 @dataclass(frozen=True) classes (PlanIndexSchema, PlanSectionSchema, RoadmapSectionSchema, OverviewSchema, BugTrackerSectionSchema, FixBugSchema, CompletedIndexSchema). Added _schema_required_fields() and _schema_allowed_fields() helpers that introspect via dataclasses.fields(). Deleted the 14 redundant module-level constants (_PLAN_INDEX_REQUIRED, _PLAN_INDEX_ALLOWED, etc.). generate_schema_reference() in docgen.py now introspects the dataclasses directly — single SSOT, drift-gate continues to pass.

  • [TPR-01-004-gemini][medium][round 5] tests/plan-audit/test_plan_corpus.py:83 — test_multi_document used an inline YAML string instead of reading the tests/plan-audit/fixtures/multi_document.md fixture file (orphaned fixture). Resolved: Fixed on 2026-04-14. Renamed to test_multi_document_body_boundary_accepted and switched to reading FIXTURES / "multi_document.md" via .read_text(). Fixture is no longer orphaned.

  • [TPR-01-005-gemini][medium][round 5] tests/plan-audit/test_plan_corpus.py — Test names omitted the <expected> portion of the mandatory <subject>_<scenario>_<expected> naming convention (e.g., test_duplicate_key instead of test_frontmatter_duplicate_key_raises_parse_error). Resolved: Fixed on 2026-04-14. Renamed 39 test functions across 13 test classes to conform to the convention. Tests that already carried an outcome suffix (_rejected, _accepted, _raises_, _returns_, etc.) were left untouched. Test bodies unchanged; 94 tests continue to pass.

  • [TPR-01-006-gemini][medium][round 5] scripts/plan_corpus.py (1,465 lines) — BLOAT: file exceeded the 500-line limit by nearly 3x, mixing CLI implementation, walker logic, strict YAML parsing, schema declaration, status normalization, and docgen in a single monolith. Resolved: Fixed on 2026-04-14. Converted to a Python package scripts/plan_corpus/ with 8 submodules, all under 500 lines: types.py (273), parser.py (184), schema.py (471), schemas.py (130), discovery.py (304), normalizer.py (169), docgen.py (196), __init__.py (164 — re-exports), __main__.py (104 — CLI). Backward compatibility preserved: from scripts.plan_corpus import X continues to work for all previously-public names. CLI invocation changed from python scripts/plan_corpus.py ... to python -m scripts.plan_corpus ...; updated CLAUDE.md §Commands and §Key Paths.

  • [TPR-01-001-codex][medium][round 6] scripts/plan_corpus/schema.py:131 — Nested section validation hardcoded _SECTION_ENTRY_ALLOWED / _SECTION_ENTRY_REQUIRED sets instead of deriving from SubsectionEntry dataclass. Drift vector: schema-field allowlists must live only in the dataclass SSOT. Resolved: Fixed on 2026-04-14. Moved SubsectionEntry from schema.py into schemas.py alongside the 7 file-class schemas. _validate_sections() now uses _schema_required_fields(SubsectionEntry) / _schema_allowed_fields(SubsectionEntry) — same introspection helpers as the top-level schemas. Deleted _SECTION_ENTRY_ALLOWED and _SECTION_ENTRY_REQUIRED constants.

  • [TPR-01-002-codex][medium][round 6] tests/plan-audit/test_plan_corpus.py — Missing regression pins for two round-5 fix paths: _classify_and_store LEAK:swallowed-error resolution (discovery never feeds malformed files through the path), and _validate_sections nested-entry validation (no tests for malformed entries). Resolved: Fixed on 2026-04-14. Added 6 regression pins: test_discover_corpus_malformed_file_surfaces_parse_error_finding (semantic pin via tmp_path), test_discover_corpus_valid_file_produces_no_parse_error (negative pin), plus 4 sections[] entry tests — test_nested_sections_entry_missing_required_field_flagged, test_nested_sections_entry_unknown_field_flagged, test_nested_sections_entry_invalid_status_flagged, test_nested_sections_valid_entry_produces_no_findings. Test count: 94 → 100.

  • [TPR-01-001-gemini][high][round 6] scripts/plan_corpus/discovery.py:141-171 — LEAK:algorithmic-duplication: same 10-line loop used twice consecutively to iterate corpus.indexes.items() and corpus.completed_indexes.items() populating name_index and flagging DUPLICATE_PLAN_NAME. Resolved: Fixed on 2026-04-14. Consolidated into a single loop using itertools.chain(corpus.indexes.items(), corpus.completed_indexes.items()). One SSOT for DUPLICATE_PLAN_NAME detection across both index sources; halves the code size.

  • [TPR-01-002-gemini][low][round 6] tests/plan-audit/test_plan_corpus.py:97 — Misleading test name: test_multi_document_separator_within_frontmatter_rejected asserted data == {"doc1": True} (acceptance), contradicting the _rejected suffix. Resolved: Fixed on 2026-04-14. Renamed to test_multi_document_separator_handled_as_boundary and updated the docstring to explain the distinction between this boundary-detection test and the adjacent test_multi_document_within_frontmatter_rejected (which uses ... within frontmatter to trigger actual rejection).


01.N Completion Checklist

  • scripts/plan_corpus.py exists as the sole SSOT for corpus schema, parsing, discovery, finding types, and status normalization
  • Strict parser rejects all 12 YAML failure classes enumerated in 01.5 fixtures
  • Seven file-class schemas defined (plan index, plan section, roadmap section, overview, bug-tracker-section, fix-BUG, completed-index) and verified against live exemplars (GEMINI-01-001 closure)
  • load_and_validate(path) -> Either[Finding, ValidatedFile] boundary function exists; Sections 02-05 use it exclusively (no direct split_frontmatter_strict calls outside the boundary) — verified by grep (GEMINI-01-003 closure)
  • Two-stage directory classifier implemented: plan candidates vs containers vs unclassified; containers do NOT produce MISSING_INDEX_MD findings (CODEX-01-005 closure)
  • Closed status enum is corpus-derived (includes research at plan level; none | findings | resolved | clean at TPR level) — no invented values
  • depends_on convention standardized on logical IDs (intra-plan "NN", cross-plan "plan-name#NN" resolving via PlanIndexSchema.name); full paths REJECTED; directory slugs in cross-plan IDs REJECTED with did-you-mean hint (GEMINI-01-002 closure)
  • Finding, Severity, FindingCategory, FindingSubtype, Corpus types defined with two-level taxonomy; Sections 02-04 import them (verified by grep — no shadow types, no local ClassifierType-style enum); ITEM_VERIFICATION subtypes enumerated in 01.3 (CODEX-01-004 closure)
  • FixBugSchema cross-field rule matches live corpus: status=complete allows any third_party_review.status value (none/clean/resolved/findings); positive pins exist for all four combos (CODEX-01-002 closure)
  • Canonical status normalizer implemented; emits PLAIN STATUS_CONTRADICTION findings without any safety_class attribute (policy relocated to Section 03 — CODEX-01-003, GEMINI-01-004, GEMINI-01-005 closure)
  • Sections 02 / 03 consume the normalizer (no re-implementation)
  • plan_corpus.py is a pure library — no git queries anywhere in the module; verified by grep -n 'subprocess\|os.system\|git' scripts/plan_corpus.py, whose only matches are in load_spec_files and unrelated (non-git-query) contexts
  • Fixture corpus covers every YAML failure + every schema violation + every normalizer case + PARSE_ERROR→Finding boundary + roadmap-section shape + fix-BUG cross-field matrix + name-based cross-plan resolution + container-dir exemption + --docgen --check drift + semantic pin (only-strict-mode-passes) + negative pin (must-reject)
  • Pilot migration covers ALL SEVEN schema classes with at least one green artifact each (coverage table produced — CODEX-01-001 closure); sibling sections’ depends_on updated
  • Full-corpus sweep NOT run here (deferred to Section 05.3 — owning section)
  • Satisfies overview test cases (d), (e), (f) at the level of “findings are produced” (validation of full-corpus coverage happens in Section 05.2)
  • scripts/plan_corpus.py --docgen --check is implemented and wired into ./test-all.sh; CI fails if committed docs/internal/plan-schema-reference.md diverges from fresh output (GEMINI-01-006 closure)
  • Plan-sync: 00-overview.md Quick Reference table, schema-owner table (now seven rows), and Mission Success Criteria checkboxes updated to reflect this section’s structural changes
  • Plan-sync: index.md keyword clusters updated for new subsection structure (01.1–01.6) and seven schemas
  • Plan-sync: Section 02/03/04/05 depends_on frontmatter migrated to logical IDs (cascade from 01.3); Section 03.2 owns the migrated SafeFix/ExposureReview taxonomy; Section 04.2 references 01.3’s ITEM_VERIFICATION subtypes (no shadow enum)
  • timeout 150 ./test-all.sh green — no regressions
  • /tpr-review — dual-source review of plan_corpus.py (schema correctness, parser strictness, no LEAK regressions). Completed 2026-04-14 across 3 rounds (12 → 4 → 0 actionable findings). Round 3 clean pass; both reviewers confirmed SSOT compliance, parser strictness, and test coverage.
  • /impl-hygiene-review — verify no drift between dataclass SSOT and any downstream usage; verify no errors="replace" remaining in any plan-parsing code; verify no git queries in plan_corpus.py. Completed 2026-04-14 via /tp-help dual-source cross-check, which surfaced 10 findings that the TPR rounds missed:
      (1) LEAK (High): discovery.py:251 fabricated FileClass.PLAN_INDEX when classify_file() returned None → fixed by returning LoadResult.err=Finding(GAP, UNCLASSIFIED_DIRECTORY) instead of a fake classification.
      (2) LEAK:algorithmic-duplication: docgen.py:_partition_fields duplicated schemas.py:_schema_required_fields logic → fixed by delegating to the canonical helpers.
      (3) DRIFT: generated markdown header + committed plan-schema-reference.md still claimed generation from pre-split scripts/plan_corpus.py → updated to reference scripts/plan_corpus/schemas.py.
      (4-6) SSOT violations: TprInfo dataclass + 7 status-enum frozensets (PLAN_STATUSES, etc.) lived in schema.py but represent schema shape/constraint knowledge → moved to schemas.py (the dataclass SSOT).
      (7) Dead public surface: SchemaField/SchemaReference stubbed in docgen.py + re-exported but no consumer → deleted.
      (8) Import indirection: __init__.py routed schema-class imports through .schema (re-export) instead of origin .schemas → switched to direct .schemas imports.
      (9) _CATEGORY_SUBTYPES underscore-prefixed internal was listed in __init__.py __all__ → removed from __all__ (still importable).
      (10) FileClass metadata scattered across discovery.py/schema.py/docgen.py (corpus bucket map + validator map + docgen tuple list) → consolidated into new FileClassMeta + FILE_CLASS_META registry in schema.py consumed by validate() and docgen.generate_schema_reference().
    All 100 tests pass; docgen --check green.
  • /improve-tooling section-close sweep — verify per-subsection retrospectives ran; add cross-subsection findings. Completed 2026-04-14. Retrospective audit: all 6 subsection retrospectives (01.1-01.6) ran and documented “no gaps” or “TDD worked smoothly”. Documentation surfaces: CLAUDE.md §Commands and §Key Paths updated to reflect the package split (python -m scripts.plan_corpus + scripts/plan_corpus/ directory). plan_corpus is plan-audit tooling, not a diagnostic script, so diagnostic.md does not need an entry. Cross-cutting patterns (this session’s section-close work): this session ran 3 TPR rounds + impl-hygiene-review + tp-help + 5 commits. No friction with existing tools — roadmap-scan.sh, plan-annotations.sh, /tpr-review (polling infrastructure + 75s cadence worked cleanly across 3 rounds), /tp-help (caught 10 SSOT-level findings the 3 TPR rounds missed), test-all.sh, and /commit-push all performed without gaps. One observation (NOT actionable here): hygiene-lint.py is Rust-only — it returned "No .rs files found in scope" when pointed at scripts/plan_corpus/. Python tooling review is a rare path in this codebase; extending the lint tool to Python would be a tool-change decision outside Section 01’s scope. Sweep outcome: no cross-cutting improvements required.
  • /sync-claude section-close sweep — verify CLAUDE.md §Commands / §Key Paths reflect scripts/plan_corpus.py; verify .claude/rules/impl-hygiene.md cross-links the new SSOT. Completed 2026-04-14. CLAUDE.md: §Commands updated to python -m scripts.plan_corpus check <path> (package invocation) and §Key Paths lists scripts/plan_corpus/ as plan schema SSOT package with submodule enumeration. impl-hygiene.md: Added scripts/plan_corpus/ to the SSOT table as a canonical home (schemas.py dataclass SSOTs + parser.py strict YAML + FILE_CLASS_META registry) — from scripts.plan_corpus import ... is the canonical consumer path, no shadow parsers or schema definitions elsewhere. Superseded-by refs: .claude/skills/plan-audit/planlib.py:250,259 TODO comments updated to reference scripts/plan_corpus/parser.py (package form) instead of the pre-split scripts/plan_corpus.py. Historical refs in plan files: references in section-01-frontmatter-schema.md body (success criteria, file headers, TPR finding text) are historically accurate — they describe what was delivered before the Round 5 split and are correctly reflected by the 01.6 completion note that documents the package conversion.
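
The depends_on convention from the checklist above (bare "NN" within a plan, "plan-name#NN" across plans, full paths and directory slugs rejected, with a did-you-mean hint for slugs) could be checked along these lines. This is a sketch under stated assumptions, not the strict parser's actual code: the function name check_depends_on, the two regular expressions, the error wording, and the slug_to_name lookup table are all hypothetical.

```python
import re
from typing import Optional

# Assumed shapes: two-digit section IDs and dash/underscore plan names.
_INTRA = re.compile(r"\d{2}")
_CROSS = re.compile(r"(?P<plan>[A-Za-z0-9_-]+)#(?P<section>\d{2})")


def check_depends_on(ref: str, names: set, slug_to_name: dict) -> Optional[str]:
    """Return None if ref is an acceptable logical ID, else an error message.

    names: plan names declared via name: in each index.md.
    slug_to_name: directory slug -> declared plan name (hypothetical table).
    """
    if "/" in ref or ref.endswith(".md"):
        return f"full path rejected: {ref!r}"
    if _INTRA.fullmatch(ref):
        return None  # intra-plan reference such as "01"
    m = _CROSS.fullmatch(ref)
    if m is None:
        return f"unrecognized depends_on form: {ref!r}"
    plan = m["plan"]
    if plan in names:
        return None  # cross-plan reference resolved by declared name
    if plan in slug_to_name:
        hint = slug_to_name[plan]
        return f"directory slug rejected: {plan!r}; did you mean {hint!r}?"
    return f"unknown plan name: {plan!r}"
```

For example, with a hypothetical plan whose directory is example-plan-dir and whose declared name is example-plan, check_depends_on("example-plan#02", ...) passes while check_depends_on("example-plan-dir#02", ...) returns the slug error with the hint. Resolving against declared names rather than directory slugs is what keeps references stable across directory renames.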