0%

Section 14: llvm-verification-tooling Teardown + AIMS-Foundation Absorption + test-all Fold

Goal: Consolidate verification on the burden-specific suite (§13), strip the generic copy-paste verification of plans/llvm-verification-tooling, and fold the fast burden conformance into test-all.sh. The generic tooling (FileCheck IR-pattern tests, ASan/UBSan, Alive2 translation validation) has ZERO burden-tracking awareness — it verifies the predicate-stack-era LLVM output, not the burden path. Its standalone CI (nightly-verification.yml) is the pointless surface. The genuinely-AIMS-specific foundations (snapshots, lattice props, contract oracle, protocol matrix) already ship + run in test-all and are absorbed by reference, not rewritten.

Context: plans/llvm-verification-tooling (13 sections) splits into two classes (per the 2026-06-03 reality inventory):

  • GENERIC (strip) — §07 FileCheck (LLVM IR patterns, no burden awareness), §08 sanitizers (ASan/UBSan on generated binaries), §09 Alive2 (LLVM SMT translation validation). These created nightly-verification.yml + 5 scripts + 2 test corpora.
  • AIMS-SPECIFIC (absorb by reference) — §01 ARC-verify-blocking (run_verify/run_aims_verify Result-propagation, the foundation the burden VF-1 builds on), §03 AIMS pass-level snapshots, §04 lattice property tests, §05 contract oracle (the VF-3 oracle §13.5 extends burden-aware), §06 protocol-builtin matrix. All ship + run in test-all transitively.
  • NOT STARTED (cut) — §10 differential fuzzing, §11 CI, §12 dashboard, §13 AI-gen. The residual-RC-op metric + eval↔LLVM differential-oracle concept noted in HISTORY as future-candidate; not built.

The real burden verification (burden_balance.rs VF-1, burden_delta.rs, the burden-aware oracle) was built by THIS plan (§07A/§13), not by llvm-verif — confirming the consolidation direction.

Why a section: the teardown + foundation-absorption + test-all-fold is plan-content (it defines the verification surface this plan’s terminal green-gate runs against) and is independently-workable (its only dependency is §13’s conformance scripts existing). The supersession of plans/llvm-verification-tooling is a cross-plan state change recorded here per state-discipline.md §5.

Depends on: §13 (the check-burden-conformance.sh cross-walk must exist before 14.4 wires it into test-all). FEEDS §10 (the terminal ./test-all.sh green-gate runs the wired burden verification).

Reference inventory (2026-06-03): nightly-verification.yml (the deletion target); ci.yml:15-16 (the ORI_VERIFY_ARC/ORI_VERIFY_EACH env vars to KEEP); aims-proofs.yml (KEEP — load-bearing proof CI); test-all.sh (the fold target — burden VF-1 already active via ORI_VERIFY_ARC=1 export; check-debug-flags.sh preamble is the slot pattern); aims_snapshots.rs / prop_tests.rs / oracle.rs / builtins/tests.rs (the absorb-by-reference foundations).


Intelligence Reconnaissance

Queries:

  • scripts/intel-query.sh --human file-symbols "compiler_repo/scripts" --repo ori — sanitizer/alive2 script surface to inspect before deletion
  • scripts/intel-query.sh --human plan-status llvm-verification-tooling — current section statuses for the supersession record
  • scripts/intel-query.sh --human callers verify_burden_balance --repo ori — confirm the VF-1 gate’s test-all invocation path

Queried: 2026-06-03.

Results summary (≤500 chars) [ori]: the AIMS-specific foundations (snapshots/lattice/oracle/protocol) are in-workspace and run via cargo test --workspace + cargo test -p ori_llvm --test aot — already in test-all, no rewrite. The generic surface (sanitizer/alive2) is CI-only (nightly-verification.yml), not in test-all. The burden VF-1 (verify_burden_balance at postprocess.rs) fires under ORI_VERIFY_ARC=1 which test-all exports — already active; this section names it as a gate + folds the §13 grep cross-walk.


14.1 Delete nightly-verification.yml

  • git rm compiler_repo/.github/workflows/nightly-verification.yml (the entire workflow: sanitizer-smoke + sanitizer-full 4-shard + alive2-build + alive2-curated + alive2-full-sweep). Per CLAUDE.md §Git Safety: git rm only — never git checkout/restore/reset.
  • KEEP ci.yml:15-16 (ORI_VERIFY_ARC: "1" + ORI_VERIFY_EACH: "1" — the only ci.yml changes llvm-verif §01 made; load-bearing for the burden VF-1 gate). Confirm they remain.
  • KEEP aims-proofs.yml (full proof corpus + Lean dual-discharge — load-bearing; cannot fold into test-all’s 150s budget per the CI-disposition decision).
  • Confirm no other workflow references nightly-verification.yml (grep -rl nightly-verification compiler_repo/.github/).

Subsection close-out (14.1) per protocol.


14.2 Verify-before-delete the generic sanitizer/Alive2 scripts + corpora

  • Inspect each target before removal (per the output-style look-before-deleting rule): scripts/sanitizer-smoke.sh, scripts/sanitizer-full.sh, scripts/build-rt-asan.sh, scripts/build-alive2.sh, diagnostics/alive2-verify.sh, tests/sanitizer/, tests/alive2/. Confirm each is GENERIC (no AIMS-burden-specific assertion — sanitizers check generated-binary memory errors; Alive2 checks LLVM-pass refinement; neither asserts burden balance or a proven AIMS antecedent).
  • For any target that the inspection finds carries burden-specific value: RETAIN it (surface the finding in this subsection’s body; do NOT delete). The DEFAULT (user-approved) is deletion; this checkpoint catches a target that contradicts the generic classification.
  • git rm the confirmed-generic targets.
  • Honest-acknowledgment (record in body): deleting Alive2 drops generic LLVM-miscompile detection; deleting sanitizers drops generated-binary ASan/UBSan coverage. Both required Clang/nightly-Rust and did NOT run in test-all.sh (CI-only). The AIMS-relevant memory-safety coverage that REMAINS in test-all: burden VF-1 balance (catches RC imbalance at the source), ORI_CHECK_LEAKS=1 zero-leak gate (catches leaks at runtime), dual-execution parity (catches eval↔LLVM divergence). This is a deliberate scope decision, not an oversight.

Subsection close-out (14.2) per protocol.


14.3 Record AIMS-foundation absorption (by reference)

  • Confirm the AIMS-specific foundations ship + run in test-all transitively (no rewrite): §03 snapshots (compiler/oric/tests/aims_snapshots.rs + aims-snapshots/ — via cargo test --workspace), §04 lattice props (ori_arc/src/aims/lattice/prop_tests.rs — via cargo test -p ori_arc), §05 contract oracle (ori_arc/src/aims/verify/oracle.rs — via cargo test -p ori_arc; the VF-3 oracle §13.5 extends burden-aware), §06 protocol matrix (ori_arc/src/aims/builtins/tests.rs + ori_ir/builtin_constants/protocol/tests.rs).
  • Record the absorption table in body (foundation | file | test-all invocation | §13/§10/§11 consumer). §03 snapshots → §11 rebake; §05 oracle → §13.5 burden-aware extension; §04 lattice props → §13 proven_by binding (lattice-axiom proofs); §01 ARC-verify-blocking → §10.1 VF-1 foundation.
  • Banned: rewriting any foundation (they ship + pass); the absorption is a reference + ownership-transfer, not a re-implementation.

Subsection close-out (14.3) per protocol.


14.4 Wire run_aims_burden_verification into test-all.sh + mark llvm-verif superseded

  • Add a run_aims_burden_verification step to compiler_repo/test-all.sh preamble (alongside check-debug-flags.sh): runs the §13 aims-proof/scripts/check-burden-conformance.sh 3-verdict cross-walk (fast grep, sub-second, no cargo build); divergent → fail test-all, divergent-pending → pass.
  • Confirm the burden-balance VF-1 corpus gate surfaces as a named gate: it already fires via the ORI_VERIFY_ARC=1 export (per arc.md §Debugging + §10.1); surface its count (diagnostics/burden-balance.sh distinct-imbalance count) as a threshold gate — count-based until §09/§10 drive it to zero, then zero-based per §10.1.
  • Banned: adding the full proof corpus (check-proofs.sh, ~30min) or dual-discharge.sh (Lean lake build) to test-all.sh — they exceed the 150s test budget and stay in aims-proofs.yml + lefthook affected-set (per the CI-disposition decision + script-first test-timeout discipline).
  • Mark plans/llvm-verification-tooling superseded (state-discipline.md §5): edit its index.md + 00-overview.md — status → superseded, HISTORY pointer to plans/aims-burden-tracking/section-14; record §01/§03-§06 absorbed-by-reference, §07-§09 stripped, §10-13 cut. Note the residual-RC-op metric + eval↔LLVM differential-oracle CONCEPT as future-candidate burden work (not built).
  • TPR checkpoint/tpr-review covering 14.1-14.4.

Subsection close-out (14.4) per protocol.


14.R Third Party Review Findings

  • None.

14.N Completion Checklist

  • nightly-verification.yml deleted; ci.yml env vars + aims-proofs.yml kept; no dangling reference (14.1).
  • Generic sanitizer/Alive2 scripts + corpora removed after per-target verify-before-delete; honest-acknowledgment of dropped coverage recorded; any burden-valued target retained (14.2).
  • AIMS-foundation absorption table recorded; foundations confirmed shipped + in test-all; zero rewrites (14.3).
  • run_aims_burden_verification wired into test-all preamble; VF-1 corpus gate named; full proof corpus stays in CI; plans/llvm-verification-tooling marked superseded with HISTORY pointer (14.4).
  • Plan annotation cleanup.
  • Plan sync per protocol.
  • /tpr-review passed (final); /impl-hygiene-review passed.

HISTORY

  • 2026-06-03 — §14 created via /create-plan --inline (routing.md §3 case b) — llvm-verification-tooling teardown + AIMS-foundation absorption + test-all fold. User directive (2026-06-03): strip the generic copy-paste verification of plans/llvm-verification-tooling and consolidate on the burden-specific suite §13 builds, dropping the pointless standalone CI and folding fast checks into test-all. Topology (user-approved): §14 depends_on §13, FEEDS §10. Three moves: (1) delete nightly-verification.yml (sanitizers+Alive2 CI) + its scripts/corpora after verify-before-delete; KEEP aims-proofs.yml (the slow proof corpus cannot fit test-all’s 150s budget) + the ci.yml ORI_VERIFY_ARC/ORI_VERIFY_EACH env vars; (2) absorb the AIMS-specific foundations (§03 snapshots / §04 lattice props / §05 contract oracle / §06 protocol matrix) BY REFERENCE — they ship + run in test-all transitively, no rewrite, the §13 suite + §11 rebake consume them; (3) wire run_aims_burden_verification into test-all preamble + mark plans/llvm-verification-tooling superseded, cutting its not-started §10-13 (differential fuzzing/CI/dashboard/AI-gen) while preserving the residual-RC-op-metric + differential-oracle CONCEPT as future-candidate burden work in HISTORY only. Honest-acknowledgment: generic Alive2 miscompile + ASan/UBSan binary-memory-error coverage is dropped (both were CI-only, required Clang/nightly); the AIMS-relevant coverage retained in test-all is burden VF-1 + ORI_CHECK_LEAKS=1 + dual-execution parity.