Section 06: Test Matrix + Dual-Execution Parity

Status: Not Started

Goal: Integrated verification across the four facets. Per CLAUDE.md §Fix Completeness: matrix tests + semantic pins + debug+release parity + interpreter+LLVM parity + zero leaks. This section is the final gate before close-out.

Context: Per-section tests (§02.3, §03.3, §04.3, §05.3) cover each facet in isolation. This section adds:

Cross-facet interaction tests (catch bugs that only surface when two facets combine).
Sweep verification (dual-exec, ORI_CHECK_LEAKS, debug+release parity across the full new-test set).
Whole-suite regression check.

The four facets together change the mono-collection dispatch path significantly; integration testing catches interaction bugs that per-section tests miss.

Reference implementations:

Rust: the rustc_mir_transform test corpus uses cross-pass test fixtures, where a single source file exercises multiple optimizations simultaneously.
Swift: swift/test/SILOptimizer/ includes “diagonal” tests covering interactions between SIL passes.

Depends on: §02, §03, §04, §05 (all four facets must have shipped tests before integration testing makes sense).

Intelligence Reconnaissance

Queries planned:

scripts/intel-query.sh --human file-symbols "compiler/ori_llvm/tests/aot" --repo ori — full inventory of AOT test corpus to identify potential cross-facet test surfaces.
scripts/intel-query.sh --human file-symbols "compiler_repo/diagnostics" --repo ori — verify dual-exec-verify.sh + ORI_CHECK_LEAKS infrastructure surface.

Results summary (≤500 chars, recorded 2026-05-14) [ori]: TO BE POPULATED at section start by running the queries above. Scaffold authored 2026-05-14 — queries deferred to execution time when the four facet sections’ test surfaces are landed.

06.1 Cross-facet interaction tests

File(s): compiler_repo/compiler/ori_llvm/tests/aot/mono_cross_facet.rs (new file)

Each test exercises ≥2 of the four facets simultaneously:

Interaction	Tests
§02 (import) ∩ §03 (inherent method on generic)	`test_imported_inherent_method_box_int` — `impl<T> Box<T>` defined in module A, called as `Box<int>.unwrap()` from module B
§02 (import) ∩ §04 (complex generic arg)	`test_imported_method_on_option_list` — `Option<T>::map` defined in module A, called as `Option<[int]>.map(...)` from module B
§02 (import) ∩ §05 (builtin Apply)	`test_imported_generic_with_cast` — generic function in module A uses `xs.len() as int` internally, called from module B
§03 ∩ §04	`test_inherent_method_on_complex_generic` — `impl<T> Wrapper<T>` on `Wrapper<[int]>`
§03 ∩ §05	`test_inherent_method_using_cast` — `impl<T> Box<T> { @len_as_int (self) -> int = self.inner.len() as int }`
§04 ∩ §05	`test_option_complex_arg_with_cast` — `Option<[int]>.unwrap().len() as int`
§02 ∩ §03 ∩ §04	`test_imported_inherent_method_on_complex_generic` — full diagonal
All four facets	`test_full_diagonal_imported_inherent_complex_cast`
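
As a cross-check on the table above, the following minimal sketch enumerates every combination of two or more facets and reports which ones have no dedicated test. The test names and facet sets are copied from the table; the `MATRIX` dictionary and the `uncovered` helper are illustrative names, not part of the repository.

```python
from itertools import combinations

FACETS = ("§02", "§03", "§04", "§05")

# Facet sets copied from the interaction table, keyed by test name.
MATRIX = {
    "test_imported_inherent_method_box_int": {"§02", "§03"},
    "test_imported_method_on_option_list": {"§02", "§04"},
    "test_imported_generic_with_cast": {"§02", "§05"},
    "test_inherent_method_on_complex_generic": {"§03", "§04"},
    "test_inherent_method_using_cast": {"§03", "§05"},
    "test_option_complex_arg_with_cast": {"§04", "§05"},
    "test_imported_inherent_method_on_complex_generic": {"§02", "§03", "§04"},
    "test_full_diagonal_imported_inherent_complex_cast": set(FACETS),
}


def uncovered(size):
    """Return facet combinations of the given size that no test targets exactly."""
    covered = {frozenset(facets) for facets in MATRIX.values()}
    return [c for c in combinations(FACETS, size) if frozenset(c) not in covered]
```

Under this reading, `uncovered(2)` and `uncovered(4)` are empty, while `uncovered(3)` lists the three triples that include §05. Those triples are still exercised jointly by the four-facet test; whether they need dedicated tests is not stated in this section.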

Author all 8 cross-facet tests.
Each test carries both a positive pin (the expected Ok behavior) AND a negative pin (an assertion that would fail if the fix from any of the cited sections were reverted).
Subsection close-out (06.1) — MANDATORY before §06.2:
- All 8 cross-facet tests pass.
- Run /improve-tooling retrospectively on §06.1.
- Run compiler_repo/diagnostics/repo-hygiene.sh --check.

06.2 Dual-execution parity sweep

File(s): N/A (read-only verification)

Per CLAUDE.md §Fix Completeness: every new or modified test MUST have interpreter+LLVM parity.

Enumerate every test added by §02.4 (4 + 1 negative), §03.3 (6 + 1), §04.3 (9 + 1), §05.3 (7 + 1), §06.1 (8). Total = 38 tests.
Run timeout 150 diagnostics/dual-exec-verify.sh --json | jq '.per_test[] | select(.parity_status != "match")'. Result MUST be empty.
If any test reports parity divergence: STOP. The divergence is a fix-completeness bug per CLAUDE.md §Fix Completeness. Pull into scope, fix at the affected backend (eval or LLVM), re-run.
Subsection close-out (06.2) — MANDATORY before §06.3:
- dual-exec-verify.sh empty divergence list.
- If any divergences surfaced and got fixed: HISTORY block in this section’s body documents the divergence + cure + commit SHA.
- Run /improve-tooling retrospectively on §06.2.
- Run compiler_repo/diagnostics/repo-hygiene.sh --check.
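
The tally and the jq filter in this subsection can be sketched in Python as follows. This is an illustration under assumptions: only the `per_test` and `parity_status` keys and the `"match"` value come from the command above; the `name` field and the sample payload are hypothetical, and the real report may carry a different shape.

```python
import json

# Per-section counts as enumerated above: (positive tests, negative pins).
NEW_TESTS = {
    "§02.4": (4, 1),
    "§03.3": (6, 1),
    "§04.3": (9, 1),
    "§05.3": (7, 1),
    "§06.1": (8, 0),
}


def expected_total(counts=NEW_TESTS):
    """Sum positive tests and negative pins across all sections."""
    return sum(pos + neg for pos, neg in counts.values())


def divergent(report_json):
    """Mirror of: jq '.per_test[] | select(.parity_status != "match")'."""
    report = json.loads(report_json)
    # A missing parity_status counts as divergent, as it does in jq (null != "match").
    return [t for t in report["per_test"] if t.get("parity_status") != "match"]


# Hypothetical payload for illustration; the "name" field is an assumption.
SAMPLE = json.dumps({"per_test": [
    {"name": "test_a", "parity_status": "match"},
    {"name": "test_b", "parity_status": "mismatch"},
]})
```

Here `expected_total()` reproduces the 38 stated above, and `divergent(SAMPLE)` returns only the `test_b` entry. The gate passes when the returned list is empty.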

06.3 Leak verification under ORI_CHECK_LEAKS

File(s): N/A (read-only verification)

Run each new test with ORI_CHECK_LEAKS=1 set:

ORI_CHECK_LEAKS=1 cargo test --release -p ori_llvm --test aot --no-fail-fast

Parse the output for LEAK detected markers; the count MUST be zero.
If any leak surfaces: STOP. Pull into scope, fix the leaking allocation site, re-run. New mono-instance registration paths could create new alloc/dealloc imbalances if §02/§03/§04/§05’s emission changes mishandle RC.
Subsection close-out (06.3) — MANDATORY before §06.4:
- Zero leaks reported.
- Run /improve-tooling retrospectively on §06.3.
- Run compiler_repo/diagnostics/repo-hygiene.sh --check.
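
One way to make the marker check mechanical is a small filter over the captured test output. The sketch below assumes only that a leak report contains the literal text `LEAK detected`, as stated above; the rest of each sample line is hypothetical, since the exact report format is not given in this section.

```python
LEAK_MARKER = "LEAK detected"


def leak_lines(output):
    """Return every line of the captured output that contains the leak marker."""
    return [line for line in output.splitlines() if LEAK_MARKER in line]


# Hypothetical captured output; only the marker text comes from the plan.
SAMPLE_OUTPUT = "\n".join([
    "test test_a ... ok",
    "LEAK detected: (details elided)",
    "test test_b ... ok",
])
```

With the cargo command's output saved to a file, `leak_lines(open(path).read())` should return an empty list for the gate to pass. For `SAMPLE_OUTPUT` it returns the single marker line.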

06.4 Whole-suite regression check

File(s): N/A (read-only verification)

Run timeout 150 ./test-all.sh — whole-suite green.
Run timeout 150 ./clippy-all.sh — zero new warnings.
Run timeout 150 cargo st — spec tests green.
Verify no #[ignore] annotations introduced anywhere in compiler_repo/compiler/ori_llvm/tests/aot/ by this plan (the test_generic_method_on_generic_type un-ignore from §03.2 should be the only delta).
Verify no #skip annotations introduced in compiler_repo/tests/spec/ related to this plan’s work.
Run state.sh refresh --dispositions-only && state.sh show --json | jq '.test_dispositions.totals.untracked'; the result MUST be 0.
Subsection close-out (06.4) — status: complete:
- All checks above pass.
- Run /improve-tooling retrospectively on §06.4.
- Run compiler_repo/diagnostics/repo-hygiene.sh --check.
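
The two annotation checks above amount to scanning the plan's diff for added or removed markers. A minimal sketch, assuming the diff is available as unified-diff text (for example from `git diff`); the `ignore_delta` helper and the file name in the sample are hypothetical.

```python
def ignore_delta(diff_text, marker="#[ignore"):
    """Count marker occurrences on added and removed lines of a unified diff."""
    added = removed = 0
    for line in diff_text.splitlines():
        # Skip file headers. A removed line whose content starts with "--"
        # would also be skipped; acceptable for a sketch over Rust sources.
        if line.startswith(("+++", "---")):
            continue
        if line.startswith("+") and marker in line:
            added += 1
        elif line.startswith("-") and marker in line:
            removed += 1
    return added, removed


# Hypothetical diff fragment showing the expected §03.2 un-ignore.
SAMPLE_DIFF = "\n".join([
    "--- a/tests/aot/example.rs",
    "+++ b/tests/aot/example.rs",
    "@@ -10,3 +10,2 @@",
    " #[test]",
    "-#[ignore]",
    " fn test_generic_method_on_generic_type() {",
])
```

For the AOT directory the expected result is `(0, 1)`: no `#[ignore]` added and the single removal from §03.2. Calling the same helper with `marker="#skip"` on the spec-test diff should report zero additions.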

06.R Third Party Review Findings

Populated by /tpr-review at §06.N (this section’s TPR catches integration bugs across §02–§05).

06.N Completion Checklist

All 06.1, 06.2, 06.3, 06.4 subsections status: complete.
/tpr-review clean across reviewer set.
/impl-hygiene-review clean after TPR.
python -m scripts.plan_corpus check plans/aot-mono-completeness/section-06-test-matrix.md exit 0.
Section frontmatter flipped to status: complete, reviewed: true.
