Section 11: Verification
Context: This section runs after all fixes are implemented. It validates the system as a whole — not individual fixes in isolation. The code journey programs are the integration test suite, and dual-execution verification is the correctness oracle.
Depends on: All other sections (01-10).
Completed: 2026-03-03
11.1 Re-run All 12 Code Journeys
Run /code-journey with all 12 journey programs and verify results.
- Journey 1 (arithmetic): Eval = 33, AOT = 33
- Journey 2 (branching): Eval = 17, AOT = 17
- Journey 3 (recursion): Eval = 61, AOT = 61
- Journey 4 (structs): Eval = 57, AOT = 57
- Journey 5 (closures): Eval = 27, AOT = 27 (was CRASH)
- Journey 6 (sum types): Eval = 41, AOT = 41
- Journey 7 (loops): Eval = 30, AOT = 30
- Journey 8 (generics): Eval = 57, AOT = 57
- Journey 9 (strings): Eval = 13, AOT = 13
- Journey 10 (lists): Eval = 33, AOT = 33
- Journey 11 (derived traits): Eval = 33, AOT = 33 (was 18)
- Journey 12 (Option/match): Eval = 33, AOT = 33 (was 144)
All 12 journeys produce identical results on both eval and AOT paths. Detailed LLVM deep scrutiny results in plans/code-journeys/journey*-results.md.
11.2 Dual-Execution Verification
Run diagnostics/dual-exec-verify.sh on all spec tests.
- Run:
diagnostics/dual-exec-verify.sh --verbose - Result: 56 verified, 7 AOT compile fail, 1 both fail, 2 behavioral mismatches
- Mismatches triaged — both are pre-existing COW runtime bugs, not LLVM codegen issues:
tests/benchmarks/cow/macro/graph_bfs.ori— double-free in nested map/list COW (documented in MEMORY.md)tests/benchmarks/cow/macro/file_pipeline.ori— misaligned pointer in COW runtime (same subsystem)
Neither mismatch is related to the LLVM codegen fixes plan. Both exist in the COW macro benchmark suite (compiler/ori_rt/src/) and were present before this plan began.
11.3 Memory Safety Verification
Run Valgrind on programs that exercise ARC lifecycle.
- Run:
diagnostics/valgrind-aot.shon standard test suite — 7/7 pass, 0 errors - Run Valgrind on Journey 9 (strings) — 0 errors, 0 allocs (SSO), 0 leaks
- Run Valgrind on Journey 10 (lists) — 0 errors, 6 allocs/6 frees, 0 leaks
- Result: 0 invalid reads, 0 invalid writes, 0 leaks (definitely lost)
11.4 Test Matrix
Verify each finding is resolved by its corresponding test.
CRITICAL Findings
| ID | Description | Test | Status |
|---|---|---|---|
| C1 | Mixed closures crash AOT | Journey 5 returns 27 | [x] FIXED |
| C2 | List indexing crashes AOT | Journey 10 returns 33 | [x] FIXED |
| C3 | Payload sum type $eq not generated | Journey 11 returns 33 | [x] FIXED |
| C4 | Option match tag inversion | Journey 12 returns 33 | [x] FIXED |
HIGH Findings
| ID | Description | Test | Status |
|---|---|---|---|
| H1 | Empty landing pads for all calls | Journey 3 IR has no empty landing pads; invoke only for live ARC values | [x] FIXED |
| H2 | Unsound nounwind on runtime calls | Audited all declare statements; only non-panicking functions marked nounwind | [x] FIXED |
| H3 | Missing noalias on non-aliasing params | Conservative noalias on proven non-aliasing params; f(a: xs, b: xs) test passes | [x] FIXED |
MEDIUM Findings
| ID | Description | Test | Status |
|---|---|---|---|
| M1 | Prelude overhead | 10ms check time, <1% of 320ms build; Salsa caching effective | [x] ASSESSED |
| M2 | No checked arithmetic (wrapping overflow) | sadd/ssub/smul.with.overflow intrinsics + panic on overflow | [x] FIXED |
| M3 | Dead br label after calls | SimplifyCFG merges at O1+; no codegen change needed | [x] DEFERRED |
| M4 | No tail call optimization | LLVM O2 converts tail recursion to loop via phi nodes | [x] DEFERRED |
| M5 | Hardcoded align 4 on i64 loads | All alignment DataLayout-derived; verified x86-64 and aarch64 | [x] FIXED |
| M6 | Full struct load for partial access | LLVM O2 DCE eliminates unused field loads | [x] DEFERRED |
| M7 | Verbose variant construction | insertvalue chain (2-3 instructions); no alloca roundtrip | [x] FIXED |
| M8 | Identical match arms not deduped | MergeIdenticalBlocks at O2 deduplicates | [x] DEFERRED |
| M9 | Range overflow for ..=INT_MAX | Runtime step-sign dispatch (sle/sge/panic); constant-folds for literal steps | [x] FIXED |
| M10 | Inconsistent nounwind on main | Monomorphization propagation inside fixed-point loop | [x] FIXED |
| M11 | Orphaned landing pads | dead_unwind detection; 0 orphaned blocks | [x] FIXED |
| M12 | Duplicate drop functions | get_or_declare deduplication; 1 drop per unique type | [x] FIXED |
| M13 | Unnecessary Option tuple in iterator | O2 eliminates insertvalue/extractvalue round-trip | [x] DEFERRED |
| M14 | None loads uninitialized payload | insertvalue doesn’t read memory; zero-init allocas for unit variants | [x] FIXED |
LOW Findings
| ID | Description | Test | Status |
|---|---|---|---|
| L1 | Canonicalizer expansion | 0-25% expansion, <1ms overhead; negligible | [x] ASSESSED |
| L2 | Prelude decision trees | 4 trees, negligible impact | [x] ASSESSED |
| L3 | branch+phi instead of select | InstCombine converts to select/intrinsics at O2 | [x] DEFERRED |
| L4 | Single-predecessor phi | SimplifyCFG eliminates at O1+ | [x] DEFERRED |
| L5 | Range struct materialization | LLVM SROA eliminates; constants propagated | [x] DEFERRED |
| L6 | Duplicate loop computation | CSE eliminates duplicate i+1 at O2 | [x] DEFERRED |
| L7 | Dead phi at loop exit | DCE eliminates unused phi nodes at O2 | [x] DEFERRED |
Summary: 18/28 fixed in codegen, 7/28 deferred to LLVM optimization passes (O1+/O2), 3/28 assessed with documented rationale. All 28/28 resolved.
11.5 Completion Checklist
- All 12 journeys: correct results in both eval and AOT
-
diagnostics/dual-exec-verify.sh— 2 mismatches, both pre-existing COW runtime bugs (not codegen) -
diagnostics/valgrind-aot.sh— 0 errors (7/7 pass + J9 + J10) -
./test-all.sh— green (11,956 passed, 0 failed) -
./clippy-all.sh— green - All 4 CRITICAL findings fixed (C1-C4)
- All 3 HIGH findings fixed (H1-H3)
- All 14 MEDIUM findings fixed or assessed with documented rationale
- All 7 LOW findings assessed with documented rationale
- Test matrix (11.4) fully checked — 28/28 resolved
-
opt-21 -passes=verifyon generated IR for all 12 journeys — 0 errors - Target matrix: x86-64 verified via native execution; aarch64 verified via unit tests (
load_alignment_aarch64inir_builder/tests.rs) and DataLayout-derived alignment codegen - Note: aarch64 IR-only verification catches alignment and codegen structural issues but does NOT cover runtime ARC/panic behavior differences on ARM hardware. If native aarch64 execution becomes available, re-run dual-exec + valgrind on that target before considering Section 11 fully closed.
Exit Criteria: 28/28 findings resolved (fixed, assessed, or deferred with rationale). 12/12 journeys correct. 0 new codegen-related dual-exec mismatches. 0 valgrind errors. opt -passes=verify clean on all journey IR. Full test suite green. ALL MET.