Section 05: ARC IR-Level Verification
Status: Not Started
Goal: Add a verification pass to ori_arc that checks structural invariants of the ARC IR before LLVM lowering. This catches bugs at the level where all information is available (Construct = +1, RcInc = +1, RcDec = -1, type info, ownership annotations) — eliminating the need for heroic recovery at the LLVM IR level.
Context: The current verification happens at two levels: (1) the Rust verify/ module walks live inkwell objects after LLVM lowering, and (2) the Python tools parse textual LLVM IR. Both suffer from information loss — by the time we’re looking at LLVM IR, Construct has become ori_rc_alloc + GEP + store sequences, and ownership annotations are gone. Lean 4’s Checker.lean validates their LCNF IR before lowering, where everything is explicit. We should do the same.
ARC IR structure (from compiler/ori_arc/src/ir/): ArcFunction contains params: Vec<ArcParam>, blocks: Vec<ArcBlock>, var_types: Vec<Idx>. Each ArcBlock has id: ArcBlockId, params: Vec<(ArcVarId, Idx)>, body: Vec<ArcInstr> (note: body, not instructions), and terminator: ArcTerminator.
Note: This section is Rust code in ori_arc, not Python. It’s included in this plan because it’s the architecturally correct solution to the false positive problem, even though it’s independent of the Python tooling.
Reference implementations:
- Lean 4
Compiler/IR/Checker.lean: Validates variable scope/uniqueness, type consistency, jump target validity, external function existence. ~200 lines, runs in O(n). - Rust
rustc_mir_transform/src/check_alignment.rs: Modular pass,run_pass()+is_enabled()pattern. - Koka
Core/CheckFBIP.hs: Property verification (FBIP compliance) on top of a working RC system. Produces warnings, not errors.
Depends on: Nothing (independent of Python tooling).
05.1 Structural Invariants
File(s): compiler/ori_arc/src/verify/mod.rs (new module) + compiler/ori_arc/src/verify/tests.rs
The module structure:
verify/mod.rs— publiccheck_function()entry point + structural checksverify/tests.rs— test cases (sibling tests pattern per CLAUDE.md)
Must also add pub(crate) mod verify; to compiler/ori_arc/src/lib.rs (between mod pipeline; and pub mod rc_elim;, alphabetically). Use pub(crate) since verification is internal — external consumers should not call it directly. If needed for cargo test -p ori_arc, make check_function pub.
Verify structural properties of ArcFunction that must hold after lowering from CanExpr:
- Variable scope: Every
ArcVarIdused in an instruction must be defined (by a preceding instruction, as a function parameter, or as a block parameter) before use. No use-before-def.fn check_variable_scope(func: &ArcFunction) -> Vec<VerifyError> { // func.params is Vec<ArcParam>, each has a .var: ArcVarId field let mut defined: FxHashSet<ArcVarId> = func.params.iter().map(|p| p.var).collect(); let mut errors = Vec::new(); for block in &func.blocks { // Block params are also definitions (phi-like values from Jump args) for (var, _ty) in &block.params { defined.insert(*var); } // ArcBlock uses .body, not .instructions for instr in &block.body { for used in instr.used_vars() { if !defined.contains(&used) { errors.push(VerifyError::UseBeforeDef { var: used, block: block.id }); } } if let Some(dst) = instr.defined_var() { defined.insert(dst); } } // Terminators also use variables (e.g., Branch condition, Return value). for used in block.terminator.used_vars() { if !defined.contains(&used) { errors.push(VerifyError::UseBeforeDef { var: used, block: block.id }); } } // Invoke defines a dst in the normal successor — handled via // that block's param definitions (Jump args carry it). } errors } // NOTE: This flat-set approach is SOUND for SSA-form IR because each // ArcVarId is defined exactly once across the entire function. A use // in block B that references a definition in block A is valid if A // dominates B. Since we're checking ALL definitions globally (not // per-block), a non-dominating definition would still pass. For a // stricter check, use the dominator tree — but that's Phase 2 work.
NOTE (known limitation): The flat-set approach is an over-approximation. It will not catch use-before-def where the definition exists but does not dominate the use (e.g., variable defined in one branch, used in the other branch’s successor). Since
ori_arcalready has aDominatorTreeingraph/, a future Phase 2 enhancement can use it for precise checking. Document this limitation in the code with a// TODO(verify): ...comment.
-
Block connectivity: Every block referenced in a terminator (Jump, Branch, Switch, Invoke) must exist in the function. No dangling block references. The
ArcTerminatorenum has these block-referencing variants:Jump { target }— 1 targetBranch { then_block, else_block }— 2 targetsSwitch { cases, default }— N+1 targetsInvoke { normal, unwind }— 2 targetsReturn,Resume,Unreachable— 0 targets
-
Terminator presence: Every block must end with exactly one terminator. No blocks without terminators, no terminators in the middle of a block.
-
Type consistency: If
var_reprsmaps a variable to aValueRepr, RC operations on that variable must be type-compatible (e.g.,RcInc/RcDecshould never operate on variables withValueRepr::Scalar— those have no RC header). Note: useValueRepr(per-variable representation fromvar_reprs), notArcClass(higher-level type classification fromclassify/).
05.2 RC Balance by Construction
File(s): compiler/ori_arc/src/verify/mod.rs
Verify RC balance properties after the RC insertion pass:
-
No RC on scalars: After
compute_var_reprs(),RcInc/RcDecmust never operate on variables withValueRepr::Scalarinvar_reprs. This would indicate a misclassification bug described in.claude/rules/arc.md. -
Ownership consistency: If a parameter has
ownership == Ownership::Borrowed(available directly onArcParam), noRcDecshould target its.varwithin the function body (borrowed values are not owned, so must not be freed). Implementation:let borrowed_params: FxHashSet<ArcVarId> = func.params.iter() .filter(|p| p.ownership == Ownership::Borrowed) .map(|p| p.var) .collect(); // Then for each RcDec { var, .. } in all blocks: // if borrowed_params.contains(&var) → VerifyError::DecOnBorrowed -
Invoke dst scoping:
ArcTerminator::Invoke { dst, .. }defines a variable that should be visible in thenormalsuccessor block but NOT in theunwindblock. The scope checker must adddsttodefinedonly for the normal successor path, not globally. This is a subtlety that the flat-set approach (see 05.1) does NOT handle correctly — it would adddstglobally. For correctness, either:- Accept the flat-set approach with a documented over-approximation (no false positives, possible false negatives), OR
- Use per-block defined sets with dominator-tree propagation (more complex, fully correct)
-
Return value ownership: If a function returns an
Ownedvalue, there must be a path from allocation/inc to the return without a matching dec. -
No double-dec: After RC elimination, no variable should have more
RcDecoperations thanRcInc + allocationoperations on any execution path.
05.3 Pipeline Integration
File(s): compiler/ori_arc/src/pipeline.rs (where run_arc_pipeline() lives; lib.rs just re-exports it)
Integrate the verifier into run_arc_pipeline() as a debug-only check:
-
Add
verify_arc_function()call after RC insertion and after RC elimination:// In run_arc_pipeline() (pipeline.rs): // ... existing passes ... crate::rc_insert::insert_rc_ops_with_ownership(&mut func, ...); #[cfg(debug_assertions)] { let errors = verify::check_function(&func); if !errors.is_empty() { // Log errors but don't panic — this is diagnostic, not blocking for e in &errors { tracing::warn!("ARC IR verification: {e:?}"); } } } crate::rc_elim::eliminate_rc_ops_dataflow(&mut func, ...); // ... verify again after elimination ... -
Gate behind
debug_assertions(zero cost in release builds) -
Add
ORI_VERIFY_ARC=1environment variable for explicit opt-in in release builds (follows existingORI_AUDIT_CODEGENpattern — acceptable for diagnostic tooling, not production passes) -
Add tests in
compiler/ori_arc/src/verify/tests.rs
05.4 Completion Checklist
-
compiler/ori_arc/src/verify/mod.rsexists with structural checks -
pub(crate) mod verify;added tocompiler/ori_arc/src/lib.rs - Variable scope, block connectivity, terminator presence verified
- RC-on-scalar detection works (catches
Scalarmisclassification) - Ownership consistency checked (no dec on borrowed params)
- Integrated into
run_arc_pipeline()underdebug_assertions - All existing tests pass:
cargo test -p ori_arc - Verifier produces zero warnings on all 12 journey programs
-
./test-all.shgreen
Exit Criteria: ORI_VERIFY_ARC=1 cargo test -p ori_llvm runs the ARC IR verifier on all AOT test programs with zero warnings. Any genuine RC bugs are caught at the ARC IR level before they become mysterious imbalances at the LLVM IR level.