Section 03: KnownSafe Nested Pair Elimination

Status: Not Started Goal: Add a post-emission optimization pass that tracks physical refcount positivity per variable and eliminates inner RcInc/RcDec pairs that are nested within outer pairs guaranteeing the refcount stays positive. This is the LLVM KnownSafe optimization adapted to Ori’s typed ARC IR.

Context: When a variable has refcount known to be > 1 (because an outer RcInc is in scope and no intervening operation could decrement it to zero), inner RcInc/RcDec pairs on the same variable are provably redundant — the refcount can never hit zero regardless of any intervening side effects. LLVM’s ARC optimizer uses the KnownSafe flag on PtrState for this: once a retain is seen and no release has occurred, the refcount is “known positive” and nested pairs can be eliminated.

Critical design choice (from Codex): This is NOT about AIMS Cardinality. Cardinality::Once means “used once more in the future” — that’s demand, not proof of positive refcount right now. KnownSafe is a physical fact about the runtime refcount at a specific program point, computed post-emission from the actual RcInc/RcDec instructions in the IR.

Reference implementations:

LLVM PtrState.h:55-95: RRInfo::KnownSafe — “KnownSafe is true when the refcount is known to be positive, either because we’re inside a retain/release pair on the same pointer, or because nested retain/release pairs exist.”
LLVM ObjCARCOpts.cpp:1812-1960: PairUpRetainsAndReleases() uses KnownSafeTD (top-down from retain) and KnownSafeBU (bottom-up from release) — both must be true for elimination.
Swift RefCountState.h:55-186: KnownSafe + LatticeState enum (None | Decremented | MightBeUsed | MightBeDecremented).

Depends on: Section 01 (statistics to measure elimination), Section 02 (selective barriers change which RC ops reach this pass).

03.1 Define Physical Refcount State Model

File(s): compiler/ori_arc/src/aims/knownsafe/mod.rs (new module)

Define the per-variable physical refcount state tracked during post-emission analysis. This is distinct from the AIMS lattice — it tracks the physical refcount state of already-emitted RC operations.

Create compiler/ori_arc/src/aims/knownsafe/ directory with mod.rs
Add pub mod knownsafe; to compiler/ori_arc/src/aims/mod.rs (alongside the existing pub mod verify; etc.) and add a //! doc line to the module doc header describing the new module.

Define PhysicalRcState:

/// Physical refcount state for a single variable at a program point.
///
/// Tracks whether the refcount is known to be positive (> 0) based on
/// the emitted RcInc/RcDec instructions. This is a runtime fact, not
/// a semantic demand (unlike AIMS Cardinality).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PhysicalRcState {
    /// Refcount state is unknown (conservative).
    Unknown,
    /// Refcount is known to be positive (> 0) because we're inside
    /// an RcInc bracket that hasn't been decremented.
    Positive,
    /// Refcount may have been decremented (an RcDec was seen without
    /// a subsequent RcInc to restore positivity).
    MaybeDecremented,
}

Define VarRcInfo:

/// Per-variable refcount tracking info at a program point.
#[derive(Clone, Debug)]
pub struct VarRcInfo {
    /// Current physical refcount state.
    pub state: PhysicalRcState,
    /// Whether inner RcInc/RcDec pairs can be safely eliminated.
    /// True when `state == Positive` — the refcount can never hit zero
    /// regardless of inner operations.
    pub known_safe: bool,
    /// Net refcount delta since last "reset" point (function entry or
    /// control flow merge). Used to track nesting depth.
    pub net_delta: i32,
}

Define monotonic merge for control flow joins:

impl VarRcInfo {
    /// Conservative merge at control flow join points.
    /// KnownSafe is only true if ALL predecessors agree.
    pub fn merge(&self, other: &VarRcInfo) -> VarRcInfo {
        VarRcInfo {
            state: match (self.state, other.state) {
                (PhysicalRcState::Positive, PhysicalRcState::Positive) => PhysicalRcState::Positive,
                _ => PhysicalRcState::Unknown,
            },
            known_safe: self.known_safe && other.known_safe,
            net_delta: self.net_delta.min(other.net_delta),
        }
    }
}

Unit tests for PhysicalRcState and VarRcInfo::merge().
/tpr-review passed — independent review found no critical or major issues (or all findings triaged)
/impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.
Subsection close-out (03.1) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-03.1 retrospective — build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 03.1: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
/sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.

03.2 Implement Forward Refcount Tracking

File(s): compiler/ori_arc/src/aims/knownsafe/analysis.rs (new file)

Walk the post-emission ARC IR forward (top-down through instructions), tracking VarRcInfo per variable per block. Note: despite Clang’s “bottom-up” terminology for its release-scanning pass, this analysis walks instructions forward to track refcount positivity state.

Define KnownSafeAnalysis:

pub struct KnownSafeAnalysis {
    /// Per-block, per-variable refcount info at block exit.
    block_states: FxHashMap<ArcBlockId, FxHashMap<ArcVarId, VarRcInfo>>,
}

Implement forward walk per block (uses Section 02’s callee_may_observe_rc()).

Cross-module dependency: callee_may_observe_rc() is defined in aims/emit_rc/coalesce/mod.rs (Section 02). The KnownSafe analysis needs to call it from aims/knownsafe/analysis.rs. Either:

(a) Make callee_may_observe_rc() pub(crate) and re-export via aims/emit_rc/mod.rs, or
(b) Extract it to a shared location (e.g., aims/emit_rc/barrier_query.rs) that both coalesce and knownsafe can import. Option (a) is simpler. Add re-export: pub(crate) use coalesce::callee_may_observe_rc; in aims/emit_rc/mod.rs.

fn analyze_block(
    &mut self,
    func: &ArcFunction,
    block_idx: usize,
    contracts: &FxHashMap<Name, MemoryContract>,
) {
    let block = &func.blocks[block_idx];
    let mut var_states: FxHashMap<ArcVarId, VarRcInfo> = self.entry_state(block_idx);
    
    for instr in &block.body {
        match instr {
            ArcInstr::RcInc { var, count, .. } => {
                let info = var_states.entry(*var).or_default();
                // Note: `count` can be > 1 (batched by coalescing).
                info.net_delta += *count as i32;
                if info.net_delta > 0 {
                    info.state = PhysicalRcState::Positive;
                    info.known_safe = true;
                }
            }
            ArcInstr::RcDec { var, .. } => {
                // Note: RcDec has no count field — each RcDec is 1.
                // Multiple decs are emitted as separate instructions.
                let info = var_states.entry(*var).or_default();
                info.net_delta -= 1;
                if info.net_delta <= 0 {
                    info.state = PhysicalRcState::MaybeDecremented;
                    info.known_safe = false;
                }
            }
            // Calls: use Section 02's callee_may_observe_rc() to determine
            // which variables' KnownSafe survives across the call.
            ArcInstr::Apply { func, args, .. } => {
                let contract = contracts.get(func);
                for (var, info) in var_states.iter_mut() {
                    // If the variable is passed as an arg to this call,
                    // check if the callee can observe its RC.
                    let is_arg = args.iter().enumerate().any(|(idx, &a)| {
                        a == *var && callee_may_observe_rc(contract, idx)
                    });
                    if is_arg {
                        info.known_safe = false;
                        info.state = PhysicalRcState::Unknown;
                    }
                    // Variables NOT passed as args, or passed to Borrowed
                    // params, retain their KnownSafe status.
                }
            }
            ArcInstr::ApplyIndirect { .. } => {
                // No contract for indirect calls — conservative reset.
                for info in var_states.values_mut() {
                    info.known_safe = false;
                    info.state = PhysicalRcState::Unknown;
                }
            }
            _ => {}
        }
    }
    
    self.block_states.insert(ArcBlockId(block_idx), var_states);
}

Handle control flow merges using VarRcInfo::merge().

Fixpoint convergence: For loops (back-edges), the analysis must iterate until the state stabilizes. Use a worklist algorithm: add all blocks to the worklist initially, and re-add a block’s successors when its exit state changes.
Thread contracts: &FxHashMap<Name, MemoryContract> into KnownSafeAnalysis — needed by callee_may_observe_rc() in the Apply handler. This is the same contracts map from AimsPipelineConfig.contracts.
Unit tests: simple linear block, diamond CFG, loop (verifies fixpoint convergence).
TPR checkpoint — /tpr-review covering 03.1–03.2 implementation work
/tpr-review passed — independent review found no critical or major issues (or all findings triaged)
/impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.
Subsection close-out (03.2) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-03.2 retrospective — build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 03.2: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
/sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.

03.3 Implement KnownSafe Pair Elimination Pass

File(s): compiler/ori_arc/src/aims/knownsafe/eliminate.rs (new file)

Use the analysis results to eliminate redundant inner RC pairs.

Define eliminate_knownsafe_pairs():

/// Remove RcInc/RcDec pairs where KnownSafe is true for the variable.
///
/// A pair is eliminable when:
/// 1. An RcInc(x) is followed by RcDec(x) in the same block
/// 2. x.known_safe is true at the RcInc point (refcount known > 0)
/// 3. No instruction between them can make x's refcount hit zero
///
/// Condition 3 is guaranteed by KnownSafe — if the refcount is known
/// positive, it cannot hit zero regardless of intervening operations.
pub fn eliminate_knownsafe_pairs(
    func: &mut ArcFunction,
    analysis: &KnownSafeAnalysis,
) -> usize  // returns number of pairs eliminated

Elimination algorithm:
1. Walk each block forward
2. At each RcInc(x), check if x.known_safe at this point
3. If yes, scan forward for matching RcDec(x) in same block
4. If found with no intervening use that requires positive refcount, eliminate both
5. Track eliminated count for statistics
Unit tests (TDD: write BEFORE implementing eliminate_knownsafe_pairs() — tests must fail initially, then pass after implementation):
- Nested RcInc(x); RcInc(x); use(x); RcDec(x); RcDec(x) → inner pair eliminated
- RcInc(x); call(); RcDec(x) → NOT eliminated (call may observe)
- RcInc(x); RcInc(x); call(); RcDec(x); RcDec(x) with call barrier reset → inner pair NOT eliminated
- Two separate variables, only one known_safe → correct variable’s pair eliminated

Semantic pin: Test that creates nested inc/dec pattern, verifies inner pair is eliminated, and the resulting program still runs correctly with ORI_CHECK_LEAKS=1.

Negative pin: Test that a non-nested pattern (RcInc(x); call_that_may_dec(x); RcDec(x)) is NOT eliminated — the call breaks the positivity guarantee. Assert knownsafe_pairs_eliminated == 0 for this case. A second negative pin: test that RcInc(x); RcDec(x) at nesting depth 1 (no outer bracket) is NOT eliminated — only INNER pairs within a known-positive bracket are safe to remove.

TPR checkpoint — /tpr-review covering 03.3 elimination pass implementation
/tpr-review passed — independent review found no critical or major issues (or all findings triaged)
/impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.
Subsection close-out (03.3) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-03.3 retrospective — build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 03.3: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
/sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.

03.4 Integration and Matrix Testing

File(s): compiler/ori_arc/src/pipeline/aims_pipeline.rs, tests

Integrate the KnownSafe pass into the AIMS pipeline and verify correctness.

03.R Third Party Review Findings

None.

03.N Completion Checklist

Exit Criteria: ORI_LOG=ori_arc=info ori build shows knownsafe_pairs_eliminated > 0 for programs with nested function calls on heap-typed values. All existing tests pass. ORI_CHECK_LEAKS=1 reports zero leaks. Dual-exec parity maintained.