Section 14: llvm-verification-tooling Teardown + AIMS-Foundation Absorption + test-all Fold
Goal: Consolidate verification on the burden-specific suite (§13), strip the generic copy-paste verification of plans/llvm-verification-tooling, and fold the fast burden conformance into test-all.sh. The generic tooling (FileCheck IR-pattern tests, ASan/UBSan, Alive2 translation validation) has ZERO burden-tracking awareness — it verifies the predicate-stack-era LLVM output, not the burden path. Its standalone CI (nightly-verification.yml) is the pointless surface. The genuinely-AIMS-specific foundations (snapshots, lattice props, contract oracle, protocol matrix) already ship + run in test-all and are absorbed by reference, not rewritten.
Context: plans/llvm-verification-tooling (13 sections) splits into two classes (per the 2026-06-03 reality inventory):
- GENERIC (strip) — §07 FileCheck (LLVM IR patterns, no burden awareness), §08 sanitizers (ASan/UBSan on generated binaries), §09 Alive2 (LLVM SMT translation validation). These created
nightly-verification.yml+ 5 scripts + 2 test corpora. - AIMS-SPECIFIC (absorb by reference) — §01 ARC-verify-blocking (
run_verify/run_aims_verifyResult-propagation, the foundation the burden VF-1 builds on), §03 AIMS pass-level snapshots, §04 lattice property tests, §05 contract oracle (the VF-3 oracle §13.5 extends burden-aware), §06 protocol-builtin matrix. All ship + run in test-all transitively. - NOT STARTED (cut) — §10 differential fuzzing, §11 CI, §12 dashboard, §13 AI-gen. The residual-RC-op metric + eval↔LLVM differential-oracle concept noted in HISTORY as future-candidate; not built.
The real burden verification (burden_balance.rs VF-1, burden_delta.rs, the burden-aware oracle) was built by THIS plan (§07A/§13), not by llvm-verif — confirming the consolidation direction.
Why a section: the teardown + foundation-absorption + test-all-fold is plan-content (it defines the verification surface this plan’s terminal green-gate runs against) and is independently-workable (its only dependency is §13’s conformance scripts existing). The supersession of plans/llvm-verification-tooling is a cross-plan state change recorded here per state-discipline.md §5.
Depends on: §13 (the check-burden-conformance.sh cross-walk must exist before 14.4 wires it into test-all). FEEDS §10 (the terminal ./test-all.sh green-gate runs the wired burden verification).
Reference inventory (2026-06-03): nightly-verification.yml (the deletion target); ci.yml:15-16 (the ORI_VERIFY_ARC/ORI_VERIFY_EACH env vars to KEEP); aims-proofs.yml (KEEP — load-bearing proof CI); test-all.sh (the fold target — burden VF-1 already active via ORI_VERIFY_ARC=1 export; check-debug-flags.sh preamble is the slot pattern); aims_snapshots.rs / prop_tests.rs / oracle.rs / builtins/tests.rs (the absorb-by-reference foundations).
Intelligence Reconnaissance
Queries:
scripts/intel-query.sh --human file-symbols "compiler_repo/scripts" --repo ori— sanitizer/alive2 script surface to inspect before deletionscripts/intel-query.sh --human plan-status llvm-verification-tooling— current section statuses for the supersession recordscripts/intel-query.sh --human callers verify_burden_balance --repo ori— confirm the VF-1 gate’s test-all invocation path
Queried: 2026-06-03.
Results summary (≤500 chars) [ori]: the AIMS-specific foundations (snapshots/lattice/oracle/protocol) are in-workspace and run via cargo test --workspace + cargo test -p ori_llvm --test aot — already in test-all, no rewrite. The generic surface (sanitizer/alive2) is CI-only (nightly-verification.yml), not in test-all. The burden VF-1 (verify_burden_balance at postprocess.rs) fires under ORI_VERIFY_ARC=1 which test-all exports — already active; this section names it as a gate + folds the §13 grep cross-walk.
14.1 Delete nightly-verification.yml
-
git rm compiler_repo/.github/workflows/nightly-verification.yml(the entire workflow: sanitizer-smoke + sanitizer-full 4-shard + alive2-build + alive2-curated + alive2-full-sweep). Per CLAUDE.md §Git Safety:git rmonly — nevergit checkout/restore/reset. - KEEP
ci.yml:15-16(ORI_VERIFY_ARC: "1"+ORI_VERIFY_EACH: "1"— the only ci.yml changes llvm-verif §01 made; load-bearing for the burden VF-1 gate). Confirm they remain. - KEEP
aims-proofs.yml(full proof corpus + Lean dual-discharge — load-bearing; cannot fold into test-all’s 150s budget per the CI-disposition decision). - Confirm no other workflow references
nightly-verification.yml(grep -rl nightly-verification compiler_repo/.github/).
Subsection close-out (14.1) per protocol.
14.2 Verify-before-delete the generic sanitizer/Alive2 scripts + corpora
- Inspect each target before removal (per the output-style look-before-deleting rule):
scripts/sanitizer-smoke.sh,scripts/sanitizer-full.sh,scripts/build-rt-asan.sh,scripts/build-alive2.sh,diagnostics/alive2-verify.sh,tests/sanitizer/,tests/alive2/. Confirm each is GENERIC (no AIMS-burden-specific assertion — sanitizers check generated-binary memory errors; Alive2 checks LLVM-pass refinement; neither asserts burden balance or a proven AIMS antecedent). - For any target that the inspection finds carries burden-specific value: RETAIN it (surface the finding in this subsection’s body; do NOT delete). The DEFAULT (user-approved) is deletion; this checkpoint catches a target that contradicts the generic classification.
-
git rmthe confirmed-generic targets. - Honest-acknowledgment (record in body): deleting Alive2 drops generic LLVM-miscompile detection; deleting sanitizers drops generated-binary ASan/UBSan coverage. Both required Clang/nightly-Rust and did NOT run in
test-all.sh(CI-only). The AIMS-relevant memory-safety coverage that REMAINS in test-all: burden VF-1 balance (catches RC imbalance at the source),ORI_CHECK_LEAKS=1zero-leak gate (catches leaks at runtime), dual-execution parity (catches eval↔LLVM divergence). This is a deliberate scope decision, not an oversight.
Subsection close-out (14.2) per protocol.
14.3 Record AIMS-foundation absorption (by reference)
- Confirm the AIMS-specific foundations ship + run in test-all transitively (no rewrite): §03 snapshots (
compiler/oric/tests/aims_snapshots.rs+aims-snapshots/— viacargo test --workspace), §04 lattice props (ori_arc/src/aims/lattice/prop_tests.rs— viacargo test -p ori_arc), §05 contract oracle (ori_arc/src/aims/verify/oracle.rs— viacargo test -p ori_arc; the VF-3 oracle §13.5 extends burden-aware), §06 protocol matrix (ori_arc/src/aims/builtins/tests.rs+ori_ir/builtin_constants/protocol/tests.rs). - Record the absorption table in body (foundation | file | test-all invocation | §13/§10/§11 consumer). §03 snapshots → §11 rebake; §05 oracle → §13.5 burden-aware extension; §04 lattice props → §13 proven_by binding (lattice-axiom proofs); §01 ARC-verify-blocking → §10.1 VF-1 foundation.
- Banned: rewriting any foundation (they ship + pass); the absorption is a reference + ownership-transfer, not a re-implementation.
Subsection close-out (14.3) per protocol.
14.4 Wire run_aims_burden_verification into test-all.sh + mark llvm-verif superseded
- Add a
run_aims_burden_verificationstep tocompiler_repo/test-all.shpreamble (alongsidecheck-debug-flags.sh): runs the §13aims-proof/scripts/check-burden-conformance.sh3-verdict cross-walk (fast grep, sub-second, no cargo build);divergent→ fail test-all,divergent-pending→ pass. - Confirm the burden-balance VF-1 corpus gate surfaces as a named gate: it already fires via the
ORI_VERIFY_ARC=1export (perarc.md §Debugging+ §10.1); surface its count (diagnostics/burden-balance.shdistinct-imbalance count) as a threshold gate — count-based until §09/§10 drive it to zero, then zero-based per §10.1. - Banned: adding the full proof corpus (
check-proofs.sh, ~30min) ordual-discharge.sh(Leanlake build) totest-all.sh— they exceed the 150s test budget and stay inaims-proofs.yml+ lefthook affected-set (per the CI-disposition decision +script-firsttest-timeout discipline). - Mark
plans/llvm-verification-toolingsuperseded (state-discipline.md §5): edit itsindex.md+00-overview.md— status →superseded, HISTORY pointer toplans/aims-burden-tracking/section-14; record §01/§03-§06 absorbed-by-reference, §07-§09 stripped, §10-13 cut. Note the residual-RC-op metric + eval↔LLVM differential-oracle CONCEPT as future-candidate burden work (not built). - TPR checkpoint —
/tpr-reviewcovering 14.1-14.4.
Subsection close-out (14.4) per protocol.
14.R Third Party Review Findings
- None.
14.N Completion Checklist
-
nightly-verification.ymldeleted;ci.ymlenv vars +aims-proofs.ymlkept; no dangling reference (14.1). - Generic sanitizer/Alive2 scripts + corpora removed after per-target verify-before-delete; honest-acknowledgment of dropped coverage recorded; any burden-valued target retained (14.2).
- AIMS-foundation absorption table recorded; foundations confirmed shipped + in test-all; zero rewrites (14.3).
-
run_aims_burden_verificationwired into test-all preamble; VF-1 corpus gate named; full proof corpus stays in CI;plans/llvm-verification-toolingmarked superseded with HISTORY pointer (14.4). - Plan annotation cleanup.
- Plan sync per protocol.
-
/tpr-reviewpassed (final);/impl-hygiene-reviewpassed.
HISTORY
- 2026-06-03 — §14 created via
/create-plan --inline(routing.md §3 case b) — llvm-verification-tooling teardown + AIMS-foundation absorption + test-all fold. User directive (2026-06-03): strip the generic copy-paste verification ofplans/llvm-verification-toolingand consolidate on the burden-specific suite §13 builds, dropping the pointless standalone CI and folding fast checks into test-all. Topology (user-approved): §14depends_on §13, FEEDS §10. Three moves: (1) deletenightly-verification.yml(sanitizers+Alive2 CI) + its scripts/corpora after verify-before-delete; KEEPaims-proofs.yml(the slow proof corpus cannot fit test-all’s 150s budget) + theci.ymlORI_VERIFY_ARC/ORI_VERIFY_EACHenv vars; (2) absorb the AIMS-specific foundations (§03 snapshots / §04 lattice props / §05 contract oracle / §06 protocol matrix) BY REFERENCE — they ship + run in test-all transitively, no rewrite, the §13 suite + §11 rebake consume them; (3) wirerun_aims_burden_verificationinto test-all preamble + markplans/llvm-verification-toolingsuperseded, cutting its not-started §10-13 (differential fuzzing/CI/dashboard/AI-gen) while preserving the residual-RC-op-metric + differential-oracle CONCEPT as future-candidate burden work in HISTORY only. Honest-acknowledgment: generic Alive2 miscompile + ASan/UBSan binary-memory-error coverage is dropped (both were CI-only, required Clang/nightly); the AIMS-relevant coverage retained in test-all is burden VF-1 +ORI_CHECK_LEAKS=1+ dual-execution parity.