Section 06: Test Matrix + Dual-Execution Parity

Status: Not Started

Goal: Integrated verification across the four facets. Per CLAUDE.md §Fix Completeness: matrix tests + semantic pins + debug+release parity + interpreter+LLVM parity + zero leaks. This section is the final gate before close-out.

Context: Per-section tests (§02.3, §03.3, §04.3, §05.3) cover each facet in isolation. This section adds:

  1. Cross-facet interaction tests (catch bugs that only surface when two facets combine).
  2. Sweep verification (dual-exec, ORI_CHECK_LEAKS, debug+release parity across the full new-test set).
  3. Whole-suite regression check.

The four facets together change the mono-collection dispatch path significantly; integration testing catches the interaction bugs that per-section tests miss.

Reference implementations:

  • Rust rustc_mir_transform test corpus uses cross-pass test fixtures: a single source file exercises multiple optimizations simultaneously.
  • Swift swift/test/SILOptimizer/ includes “diagonal” tests covering interactions between SIL passes.

Depends on: §02, §03, §04, §05 (all four facets must have shipped tests before integration testing makes sense).

Intelligence Reconnaissance

Queries planned:

  • scripts/intel-query.sh --human file-symbols "compiler/ori_llvm/tests/aot" --repo ori — full inventory of AOT test corpus to identify potential cross-facet test surfaces.
  • scripts/intel-query.sh --human file-symbols "compiler_repo/diagnostics" --repo ori — verify dual-exec-verify.sh + ORI_CHECK_LEAKS infrastructure surface.

Results summary (≤500 chars, recorded 2026-05-14) [ori]: TO BE POPULATED at section start by running the queries above. Scaffold authored 2026-05-14 — queries deferred to execution time when the four facet sections’ test surfaces are landed.


06.1 Cross-facet interaction tests

File(s): compiler_repo/compiler/ori_llvm/tests/aot/mono_cross_facet.rs (new file)

Each test exercises ≥2 of the four facets simultaneously:

| Interaction | Tests |
|---|---|
| §02 (import) ∩ §03 (inherent method on generic) | test_imported_inherent_method_box_int: impl<T> Box<T> defined in module A, called as Box<int>.unwrap() from module B |
| §02 (import) ∩ §04 (complex generic arg) | test_imported_method_on_option_list: Option<T>::map defined in module A, called as Option<[int]>.map(...) from module B |
| §02 (import) ∩ §05 (builtin Apply) | test_imported_generic_with_cast: generic function in module A uses xs.len() as int internally, called from module B |
| §03 ∩ §04 | test_inherent_method_on_complex_generic: impl<T> Wrapper<T> on Wrapper<[int]> |
| §03 ∩ §05 | test_inherent_method_using_cast: impl<T> Box<T> { @len_as_int (self) -> int = self.inner.len() as int } |
| §04 ∩ §05 | test_option_complex_arg_with_cast: Option<[int]>.unwrap().len() as int |
| §02 ∩ §03 ∩ §04 | test_imported_inherent_method_on_complex_generic: full diagonal |
| All four facets | test_full_diagonal_imported_inherent_complex_cast |

  • Author all 8 cross-facet tests.
  • Each test carries both a positive pin (the Ok behavior) and a negative pin (an assertion that would fail if the fix from any of the cited sections were reverted).
  • Subsection close-out (06.1) — MANDATORY before §06.2:
    • All 8 cross-facet tests pass.
    • Run /improve-tooling retrospectively on §06.1.
    • Run compiler_repo/diagnostics/repo-hygiene.sh --check.
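
As a rough aid for the "author all 8 tests" item, the sketch below reports which of the eight planned test names are not yet defined in a given Rust test file. The test names are copied from the interaction table; the function name missing_cross_facet_tests and the assumption that each test is declared as `fn <name>(` are illustrative and not part of the plan.

```shell
# POSIX sh sketch. Names come from the interaction table in this section.
CROSS_FACET_TESTS="
test_imported_inherent_method_box_int
test_imported_method_on_option_list
test_imported_generic_with_cast
test_inherent_method_on_complex_generic
test_inherent_method_using_cast
test_option_complex_arg_with_cast
test_imported_inherent_method_on_complex_generic
test_full_diagonal_imported_inherent_complex_cast
"

# Hypothetical helper: prints each planned test that has no `fn <name>(`
# declaration in the file given as $1. Returns 1 if any test is missing.
missing_cross_facet_tests() {
  file="$1"
  missing=0
  for name in $CROSS_FACET_TESTS; do
    # Requiring "fn" directly before the name keeps a short name from
    # matching inside a longer one.
    if ! grep -Eq "fn[[:space:]]+${name}[[:space:]]*\(" "$file"; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

A possible invocation, once the new file exists, would be `missing_cross_facet_tests compiler_repo/compiler/ori_llvm/tests/aot/mono_cross_facet.rs`. This only confirms that the tests are declared; whether they pass is still decided by the test run itself.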

06.2 Dual-execution parity sweep

File(s): N/A (read-only verification)

Per CLAUDE.md §Fix Completeness: every new or modified test MUST have interpreter+LLVM parity.

  • Enumerate every test added by §02.4 (4 + 1 negative), §03.3 (6 + 1), §04.3 (9 + 1), §05.3 (7 + 1), §06.1 (8). Total = 38 tests.
  • Run timeout 150 diagnostics/dual-exec-verify.sh --json | jq '.per_test[] | select(.parity_status != "match")'. Result MUST be empty.
  • If any test reports parity divergence: STOP. The divergence is a fix-completeness bug per CLAUDE.md §Fix Completeness. Pull into scope, fix at the affected backend (eval or LLVM), re-run.
  • Subsection close-out (06.2) — MANDATORY before §06.3:
    • dual-exec-verify.sh empty divergence list.
    • If any divergences surfaced and got fixed: HISTORY block in this section’s body documents the divergence + cure + commit SHA.
    • Run /improve-tooling retrospectively on §06.2.
    • Run compiler_repo/diagnostics/repo-hygiene.sh --check.
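
The "result MUST be empty" rule and the 38-test total can be turned into a mechanical gate. The following is a minimal sketch, assuming the jq filter above prints nothing when every test matches; the helper names require_no_divergence and require_test_count are hypothetical.

```shell
# POSIX sh sketch. Per-section counts are taken from the enumeration item:
# §02.4 (4+1), §03.3 (6+1), §04.3 (9+1), §05.3 (7+1), §06.1 (8).
EXPECTED_TESTS=$(( (4 + 1) + (6 + 1) + (9 + 1) + (7 + 1) + 8 ))

# Hypothetical helper: reads the jq filter output on stdin and fails if it
# contains any non-blank line, i.e. if any divergence record was emitted.
require_no_divergence() {
  divergences="$(grep -v '^[[:space:]]*$' || true)"
  if [ -n "$divergences" ]; then
    echo "parity divergence detected; STOP and fix before continuing:" >&2
    printf '%s\n' "$divergences" >&2
    return 1
  fi
  echo "parity sweep clean"
}

# Hypothetical helper: reads enumerated test names on stdin (one per line)
# and fails if the count differs from EXPECTED_TESTS.
require_test_count() {
  actual="$(grep -c . || true)"
  if [ "$actual" -ne "$EXPECTED_TESTS" ]; then
    echo "enumerated $actual tests, expected $EXPECTED_TESTS" >&2
    return 1
  fi
  echo "enumerated $actual tests"
}
```

One way to wire it in is `timeout 150 diagnostics/dual-exec-verify.sh --json | jq -c '.per_test[] | select(.parity_status != "match")' | require_no_divergence`. When run under bash, `set -o pipefail` is worth enabling first, so that a verifier run that fails outright is not mistaken for an empty divergence list.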

06.3 Leak verification under ORI_CHECK_LEAKS

File(s): N/A (read-only verification)

  • Run each new test with ORI_CHECK_LEAKS=1 set:
    ORI_CHECK_LEAKS=1 cargo test --release -p ori_llvm --test aot --no-fail-fast
  • Parse the output for LEAK detected markers; MUST be zero.
  • If any leak surfaces: STOP. Pull into scope, fix the leaking allocation site, re-run. New mono-instance registration paths could create new alloc/dealloc imbalances if §02/§03/§04/§05’s emission changes mishandle RC.
  • Subsection close-out (06.3) — MANDATORY before §06.4:
    • Zero leaks reported.
    • Run /improve-tooling retrospectively on §06.3.
    • Run compiler_repo/diagnostics/repo-hygiene.sh --check.
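
Parsing the output for leak markers can be scripted rather than done by eye. The sketch below assumes the run's combined output has been captured to a log file and that leaks are reported on lines containing the literal text "LEAK detected", as stated above; the helper names and the log file name are hypothetical.

```shell
# POSIX sh sketch.

# Hypothetical helper: prints the number of lines in log file $1 that
# contain the "LEAK detected" marker (0 when there are none).
count_leak_markers() {
  grep -c 'LEAK detected' "$1" || true
}

# Hypothetical helper: fails, listing the offending lines with their line
# numbers, if the log contains any leak marker.
require_zero_leaks() {
  if [ ! -r "$1" ]; then
    echo "cannot read log file: $1" >&2
    return 2
  fi
  leaks="$(count_leak_markers "$1")"
  if [ "$leaks" -ne 0 ]; then
    echo "$leaks leak marker(s) in $1; STOP and fix the allocation site" >&2
    grep -n 'LEAK detected' "$1" >&2
    return 1
  fi
  echo "zero leaks reported"
}
```

For example, the command above could be run with `2>&1 | tee leaks.log` appended, followed by `require_zero_leaks leaks.log`. The match is case-sensitive, so the pattern would need adjusting if the runtime prints the marker differently.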

06.4 Whole-suite regression check

File(s): N/A (read-only verification)

  • Run timeout 150 ./test-all.sh — whole-suite green.
  • Run timeout 150 ./clippy-all.sh — zero new warnings.
  • Run timeout 150 cargo st — spec tests green.
  • Verify no #[ignore] annotations introduced anywhere in compiler_repo/compiler/ori_llvm/tests/aot/ by this plan (the test_generic_method_on_generic_type un-ignore from §03.2 should be the only delta).
  • Verify no #skip annotations introduced in compiler_repo/tests/spec/ related to this plan’s work.
  • state.sh refresh --dispositions-only && state.sh show --json | jq '.test_dispositions.totals.untracked' returns 0.
  • Subsection close-out (06.4), status: complete:
    • All checks above pass.
    • Run /improve-tooling retrospectively on §06.4.
    • Run compiler_repo/diagnostics/repo-hygiene.sh --check.
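
The two "no new annotations" checks can be approximated by scanning a unified diff for added lines. This is a sketch under assumptions: the annotations appear literally as #[ignore (Rust tests) or #skip (spec tests), and the diff is taken against whatever base commit the plan started from. The helper names are hypothetical.

```shell
# POSIX sh sketch.

# Hypothetical helper: reads a unified diff on stdin and prints the added
# lines (leading "+", excluding "+++" file headers) that introduce an
# #[ignore or #skip annotation. Removed lines are not reported, so the
# §03.2 un-ignore does not count against the check.
added_skip_annotations() {
  grep '^+' | grep -v '^+++ ' | grep -E '#\[ignore|#skip' || true
}

# Hypothetical helper: fails if the diff on stdin adds any such annotation.
require_no_new_skips() {
  found="$(added_skip_annotations)"
  if [ -n "$found" ]; then
    echo "new skip annotations introduced:" >&2
    printf '%s\n' "$found" >&2
    return 1
  fi
  echo "no new #[ignore] or #skip annotations"
}
```

A possible use is `git diff <base> -- compiler_repo/compiler/ori_llvm/tests/aot/ compiler_repo/tests/spec/ | require_no_new_skips`, where `<base>` stands for the plan's starting commit and is left unspecified here. The scan is textual, so a match inside a comment or string would also be flagged and needs a manual look.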

06.R Third Party Review Findings

Populated by /tpr-review at §06.N (this section’s TPR catches integration bugs across §02–§05).


06.N Completion Checklist

  • All 06.1, 06.2, 06.3, 06.4 subsections status: complete.
  • /tpr-review clean across reviewer set.
  • /impl-hygiene-review clean after TPR.
  • python -m scripts.plan_corpus check plans/aot-mono-completeness/section-06-test-matrix.md exit 0.
  • Section frontmatter flipped to status: complete, reviewed: true.