Section 08: Escape Analysis & Stack Promotion

Context: Ori’s ARC system (ori_arc) already has liveness analysis and borrow inference. The missing piece is escape analysis — determining whether a reference-counted value’s pointer is ever stored in a location that outlives the current function (return value, closure capture, global, or another heap object’s field).

The ori_arc pipeline currently uses liveness-based analysis for RC insertion, but this is conservative: it treats all heap-allocated values as potentially escaping. True escape analysis would identify values that can be stack-promoted.

Reference implementations:

- Go cmd/compile/internal/escape/: Connection graph-based escape analysis that tracks dataflow from allocations to escape points
- Swift lib/SILOptimizer/Transforms/StackPromotion.cpp: Walks SIL to check if alloc_ref escapes
- Java HotSpot macro.cpp: Scalar replacement, which replaces a heap object with its individual fields on the stack
- Lean4 Borrow.lean: Parameter ownership inference that implicitly identifies non-escaping borrows

Depends on: §02 (triviality classification helps escape analysis — trivial values don’t need escape tracking).

Risk warning (VERY HIGH COMPLEXITY): This is the largest (~1,500 lines) and most dangerous section. Key risks:

- Interprocedural connection-graph analysis (§08.2) requires whole-module fixed-point iteration that interacts with ori_arc’s existing borrow inference.
- Stack promotion (§08.3) changes allocation semantics: a bug means use-after-free. It requires the most thorough Valgrind testing of any section.
- Bump allocation (§08.4) adds a new runtime allocation scheme to ori_rt that must integrate with existing COW, slice, and RC infrastructure.
- §08.5 touches the run_arc_pipeline() / AIMS integration seam, the most sensitive function in the compiler.

Recommended approach: Implement §08.1 (intraprocedural) first as a standalone pass. Ship it, measure, and verify with Valgrind before attempting §08.2 (interprocedural) or §08.4 (bump allocation).

08.1 Intraprocedural Escape Analysis

File(s): compiler/ori_repr/src/escape/mod.rs, compiler/ori_repr/src/escape/intraprocedural.rs

Start with per-function analysis (no cross-function information). This catches the most common patterns: temporary collections, intermediate strings, local structs.

Define escape lattice:

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    /// Value definitely does not escape the function
    NoEscape,
    /// Value escapes to a callee but callee doesn't retain it (borrow)
    ArgEscape,
    /// Value escapes the function (returned, stored in global, captured by closure)
    GlobalEscape,
}
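
Because the variants are declared in increasing order of severity, the derived `Ord` already provides the lattice join through `max`. The following is a minimal sketch of that idea; the `join` and `join_all` helpers are illustrative names and are not claimed to exist in ori_repr.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

impl EscapeState {
    /// Lattice join: the more pessimistic of the two states.
    /// Relies on the derived `Ord`, which follows declaration order.
    pub fn join(self, other: EscapeState) -> EscapeState {
        self.max(other)
    }
}

/// Joins the states of every use of a value (hypothetical helper).
/// A value with no recorded uses stays at `NoEscape`.
pub fn join_all(states: &[EscapeState]) -> EscapeState {
    states
        .iter()
        .copied()
        .fold(EscapeState::NoEscape, EscapeState::join)
}
```

One consequence of this design is that reordering the enum variants silently changes the analysis, so the declaration order is worth pinning with a test.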

Implement connection graph:

pub struct ConnectionGraph {
    /// Node per allocation site + parameter + return
    nodes: Vec<CgNode>,
    /// Edges: PointsTo (field → object), Deferred (alias)
    edges: Vec<CgEdge>,
}

pub enum CgNode {
    /// Allocation site (heap object creation)
    Alloc { id: AllocId, escape: EscapeState },
    /// Function parameter (escape state depends on callers)
    Param { index: usize, escape: EscapeState },
    /// Function return (always GlobalEscape)
    Return,
    /// Phantom node for unknown destinations
    Unknown,
}

pub enum CgEdge {
    /// a.field points to b
    PointsTo { from: NodeId, field: u32, to: NodeId },
    /// a defers to b (alias — same object)
    Deferred { from: NodeId, to: NodeId },
}
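
The propagation step needs a way to read and raise a node's state. The sketch below shows one possible shape for those accessors. It assumes that `NodeId` is an index into `nodes`, uses `u32` in place of `AllocId`, and pins `Unknown` at `GlobalEscape` as a conservative choice; none of these details are confirmed by the plan.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

/// Assumption: a node id is an index into `ConnectionGraph::nodes`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(pub usize);

#[derive(Debug)]
pub enum CgNode {
    Alloc { id: u32, escape: EscapeState }, // `u32` stands in for AllocId
    Param { index: usize, escape: EscapeState },
    Return,
    Unknown,
}

#[derive(Debug, Default)]
pub struct ConnectionGraph {
    nodes: Vec<CgNode>,
}

impl ConnectionGraph {
    pub fn add_node(&mut self, node: CgNode) -> NodeId {
        self.nodes.push(node);
        NodeId(self.nodes.len() - 1)
    }

    /// Current state of a node. `Return` and `Unknown` carry no stored
    /// state and always report `GlobalEscape`.
    pub fn escape(&self, id: NodeId) -> EscapeState {
        match &self.nodes[id.0] {
            CgNode::Alloc { escape, .. } | CgNode::Param { escape, .. } => *escape,
            CgNode::Return | CgNode::Unknown => EscapeState::GlobalEscape,
        }
    }

    /// Raises a node's state to at least `state` and never lowers it.
    /// Returns true when the stored state changed.
    pub fn set_escape(&mut self, id: NodeId, state: EscapeState) -> bool {
        match &mut self.nodes[id.0] {
            CgNode::Alloc { escape, .. } | CgNode::Param { escape, .. } => {
                if state > *escape {
                    *escape = state;
                    true
                } else {
                    false
                }
            }
            CgNode::Return | CgNode::Unknown => false,
        }
    }
}
```

Making `set_escape` monotonic (raise only) is what guarantees that a fixed-point loop over the edges terminates.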

Implement escape propagation:

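pub fn analyze_escapes(func: &ArcFunction, pool: &Pool) -> EscapeInfo {
    let mut graph = build_connection_graph(func, pool);

    // Fixed-point: raise escape states along edges until nothing changes.
    // Terminates because the lattice is finite and states only move upward.
    let mut changed = true;
    while changed {
        changed = false;
        // Index loop: `set_escape` needs `&mut graph`, so `graph.edges`
        // cannot stay borrowed across the loop body (assumes NodeId: Copy).
        for i in 0..graph.edges.len() {
            let (from, to, symmetric) = match graph.edges[i] {
                // a.field points to b: if a escapes, b is reachable from
                // the escape point, so b escapes at least as far as a.
                CgEdge::PointsTo { from, to, .. } => (from, to, false),
                // Aliases name the same object, so they share one state.
                CgEdge::Deferred { from, to } => (from, to, true),
            };
            let (from_escape, to_escape) = (graph.escape(from), graph.escape(to));
            if from_escape > to_escape {
                graph.set_escape(to, from_escape);
                changed = true;
            } else if symmetric && to_escape > from_escape {
                graph.set_escape(from, to_escape);
                changed = true;
            }
        }
    }

    EscapeInfo::from_graph(graph)
}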

Escape sources (what causes GlobalEscape):
- Value is returned from function
- Value is stored in a mutable reference parameter
- Value is captured by a closure that escapes
- Value is stored in a global variable
- Value is passed as an owned parameter to an unknown function
- Value is stored in a heap object’s field (that object escapes)
Non-escape sinks (what keeps a value below GlobalEscape):
- Value is only read (not stored)
- Value is passed as a borrowed parameter to a known function (ArgEscape at most, per the lattice above)
- Value is consumed (last use) within the function
- Value is used in pattern matching then discarded
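
The two lists above can be read as a mapping from each kind of use to an escape state, with a value's overall state being the maximum over its uses. The sketch below encodes that reading. `UseKind` and both functions are hypothetical names, and a borrowed pass is mapped to `ArgEscape` to follow the lattice definition.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

/// Hypothetical classification of a single use of a value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UseKind {
    Read,
    PatternMatch,
    Consume,
    PassBorrowedToKnownFn,
    PassOwnedToUnknownFn,
    Return,
    StoreToMutRefParam,
    StoreToGlobal,
    CaptureByEscapingClosure,
    /// Stored into a field of another heap object whose own state is `container`.
    StoreToField { container: EscapeState },
}

/// Escape state contributed by one use.
pub fn classify_use(kind: UseKind) -> EscapeState {
    match kind {
        UseKind::Read | UseKind::PatternMatch | UseKind::Consume => EscapeState::NoEscape,
        UseKind::PassBorrowedToKnownFn => EscapeState::ArgEscape,
        // The stored value escapes exactly as far as its container does.
        UseKind::StoreToField { container } => container,
        UseKind::PassOwnedToUnknownFn
        | UseKind::Return
        | UseKind::StoreToMutRefParam
        | UseKind::StoreToGlobal
        | UseKind::CaptureByEscapingClosure => EscapeState::GlobalEscape,
    }
}

/// Escape state of a value: the worst state over all of its uses.
pub fn classify_value(uses: &[UseKind]) -> EscapeState {
    uses.iter()
        .map(|&u| classify_use(u))
        .max()
        .unwrap_or(EscapeState::NoEscape)
}
```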
/tpr-review passed — independent review found no critical or major issues (or all findings triaged)
/impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.
Subsection close-out (08.1) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-08.1 retrospective — build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.1: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
/sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.

08.2 Interprocedural Escape Analysis

File(s): compiler/ori_repr/src/escape/interprocedural.rs

Cross-function escape analysis uses function summaries to track which parameters escape.

Define function escape summaries:

pub struct FunctionEscapeSummary {
    /// Which parameters escape?
    pub param_escapes: Vec<EscapeState>,
    /// Does the return value contain any input parameters?
    pub return_aliases: Vec<usize>, // param indices
}

Compute summaries bottom-up through the call graph:
- Leaf functions (no callees): direct analysis
- Non-leaf functions: use callee summaries to refine escape states
- Recursive functions: conservative (assume all params escape) then refine
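
The bottom-up order can be sketched as a post-order walk of the call graph. In this simplified model, `FnFacts` is a hypothetical stand-in for the per-function results of §08.1, functions are keyed by name, and recursive or unknown callees are treated as `GlobalEscape`. The later refinement step for recursive functions is not shown.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

/// Hypothetical per-function facts from the intraprocedural pass.
pub struct FnFacts {
    /// Per-parameter state from local uses only (returns, stores, ...).
    pub local: Vec<EscapeState>,
    /// `(param, callee, callee_param)`: `param` is passed to `callee`.
    pub forwards: Vec<(usize, &'static str, usize)>,
}

pub type Facts = HashMap<&'static str, FnFacts>;
pub type Summaries = HashMap<&'static str, Vec<EscapeState>>;

fn visit(
    name: &'static str,
    facts: &Facts,
    done: &mut Summaries,
    active: &mut HashSet<&'static str>,
) {
    if done.contains_key(name) {
        return;
    }
    let f = match facts.get(name) {
        Some(f) => f,
        None => return,
    };
    active.insert(name);
    let mut states = f.local.clone();
    for &(param, callee, callee_param) in &f.forwards {
        // Callees still on the DFS stack (recursion) and callees without
        // facts (unknown functions) are assumed to let the value escape.
        let through_call = if active.contains(callee) || !facts.contains_key(callee) {
            EscapeState::GlobalEscape
        } else {
            // Summarize the callee first: this is the bottom-up order.
            visit(callee, facts, done, active);
            done.get(callee)
                .and_then(|s| s.get(callee_param))
                .copied()
                .unwrap_or(EscapeState::GlobalEscape)
        };
        states[param] = states[param].max(through_call);
    }
    active.remove(name);
    done.insert(name, states);
}

/// Computes a per-parameter escape summary for every function in `facts`.
pub fn summarize(facts: &Facts) -> Summaries {
    let mut done = Summaries::new();
    let mut active = HashSet::new();
    for &name in facts.keys() {
        visit(name, facts, &mut done, &mut active);
    }
    done
}
```

Every intermediate summary in this scheme is sound because unknown information is always replaced by the most pessimistic state, which is what makes it safe to ship before any refinement exists.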

Apply summaries at call sites:

// At call site: f(x, y, z)
// If f's summary says param 0 doesn't escape:
//   → x does NOT escape through this call
// If f's summary says param 1 escapes:
//   → y DOES escape through this call
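
A call-site lookup along these lines might look as follows. This is a sketch only: `escape_through_call` and `result_may_alias` are hypothetical helper names, and a missing summary (unknown callee) is treated as the worst case.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

pub struct FunctionEscapeSummary {
    pub param_escapes: Vec<EscapeState>,
    pub return_aliases: Vec<usize>,
}

/// State an argument picks up by being passed at position `arg_index`.
/// `None` means the callee is unknown, so the argument is assumed to escape.
pub fn escape_through_call(
    summary: Option<&FunctionEscapeSummary>,
    arg_index: usize,
) -> EscapeState {
    summary
        .and_then(|s| s.param_escapes.get(arg_index))
        .copied()
        .unwrap_or(EscapeState::GlobalEscape)
}

/// Whether the call result may alias the argument at `arg_index`.
/// If it may, the argument must escape at least as far as the result does.
pub fn result_may_alias(summary: Option<&FunctionEscapeSummary>, arg_index: usize) -> bool {
    summary.map_or(true, |s| s.return_aliases.contains(&arg_index))
}
```

The `return_aliases` check matters because a parameter that the callee does not store can still leave the caller through the returned value.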

Integration with ori_arc borrow inference:
- The borrow inference in ori_arc::borrow already computes per-parameter ownership (Borrowed/Owned)
- Borrowed ≈ ArgEscape (callee uses but doesn’t retain)
- Owned ≈ GlobalEscape (callee may retain)
- Unify these two analyses to avoid redundant computation
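
The correspondence above could be expressed as a small conversion, so that an existing borrow-inference result seeds the escape summary instead of being recomputed. The `Ownership` enum below is a stand-in for whatever type ori_arc::borrow actually exposes, and the function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

/// Stand-in for the per-parameter result of borrow inference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ownership {
    Borrowed,
    Owned,
}

/// Borrowed parameters are used but not retained; owned ones may be retained.
pub fn ownership_to_escape(ownership: Ownership) -> EscapeState {
    match ownership {
        Ownership::Borrowed => EscapeState::ArgEscape,
        Ownership::Owned => EscapeState::GlobalEscape,
    }
}

/// Builds the `param_escapes` vector of a summary from borrow-inference output.
pub fn param_escapes_from_ownership(params: &[Ownership]) -> Vec<EscapeState> {
    params.iter().map(|&o| ownership_to_escape(o)).collect()
}
```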
/tpr-review passed — independent review found no critical or major issues (or all findings triaged)
/impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.
Subsection close-out (08.2) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-08.2 retrospective — build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.2: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
/sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.

08.3 Stack Promotion Codegen

File(s): compiler/ori_llvm/src/codegen/arc_emitter/construction.rs (where ori_rc_alloc calls are emitted — replace with alloca for non-escaping values), compiler/ori_llvm/src/codegen/arc_emitter/alloc.rs (new file — allocation strategy dispatch)

When escape analysis marks an allocation as NoEscape, generate stack allocation instead of heap.

Replace ori_rc_alloc with alloca:

; Before (heap):
%ptr = call ptr @ori_rc_alloc(i64 24, i64 8)
; ... use ptr ...
call void @ori_rc_dec(ptr %ptr, ptr @_ori_drop$42)

; After (stack):
%ptr = alloca [24 x i8], align 8
; ... use ptr ...
; No rc_dec needed — stack memory is freed automatically

Eliminate all RC operations for stack-promoted values:
- No ori_rc_inc (refcount is meaningless on stack)
- No ori_rc_dec (no free needed)
- No drop function call (fields are dropped individually at function exit)
Handle non-trivial fields in stack-promoted values:
- If the struct has RC’d fields (e.g., struct { name: str, age: int }):
  - The struct itself is on the stack (no RC header)
  - The str field still has its own RC (it’s a separate heap allocation)
  - At function exit, ori_rc_dec the str field (but not the struct)
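
The field handling described above amounts to selecting which fields of a stack-promoted struct still need a release at function exit. A minimal sketch follows; `Field`, `FieldKind`, and `fields_to_release` are hypothetical names used for illustration.

```rust
/// Whether a field owns a separate reference-counted heap allocation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldKind {
    Trivial,
    RefCounted,
}

pub struct Field {
    pub name: &'static str,
    pub kind: FieldKind,
}

/// Names of the fields that need an `ori_rc_dec` at function exit when the
/// enclosing struct is stack-promoted. The struct itself needs none.
pub fn fields_to_release(fields: &[Field]) -> Vec<&'static str> {
    fields
        .iter()
        .filter(|f| f.kind == FieldKind::RefCounted)
        .map(|f| f.name)
        .collect()
}
```

For `struct { name: str, age: int }` this selects only `name`, matching the example in the list above.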
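Lifetime extension for stack-promoted values:
- If the value is live across a function call, the alloca must dominate the call
- An alloca in the entry block dominates every use and stays valid until the function returns; an alloca emitted inside a loop body would grow the stack on every iteration
- Use llvm.lifetime.start / llvm.lifetime.end intrinsics to mark the precise live range so LLVM can reuse stack slots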
/tpr-review passed — independent review found no critical or major issues (or all findings triaged)
/impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.
Subsection close-out (08.3) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-08.3 retrospective — build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.3: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
/sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.

08.4 Bump Allocation for Non-Escaping Dynamic Values

File(s): compiler/ori_repr/src/escape/bump.rs, compiler/ori_llvm/src/codegen/arc_emitter/alloc.rs

Stack promotion (§08.3) works for fixed-size values, but not for dynamic-size collections (lists, maps, strings with unknown length). These still need heap allocation — but they don’t need malloc/free if they don’t escape.

Strategy: Emit a function-local bump allocator for non-escaping dynamic values. A bump allocation is a pointer increment plus a capacity check, which is typically cheaper than a general-purpose malloc/free pair. The entire bump region is freed at function return (single free or stack unwinding).

This directly closes the “custom allocators” gap with Zig: Zig lets programmers manually pass arena allocators. Ori does it automatically based on escape analysis.

Define bump allocation decision:

pub enum AllocStrategy {
    /// Standard heap allocation via ori_rc_alloc
    Heap,
    /// Stack allocation via alloca (fixed-size, NoEscape)
    Stack,
    /// Bump allocation from function-local arena (dynamic-size, NoEscape)
    Bump,
}

Select strategy based on escape state and size:
- NoEscape + fixed-size → Stack (§08.3)
- NoEscape + dynamic-size → Bump
- ArgEscape or GlobalEscape → Heap
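
The selection rules above translate directly into a small decision function. In this sketch, `select_strategy` and the `fixed_size` flag are illustrative; how the real pass learns whether a size is fixed is not specified here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AllocStrategy {
    Heap,
    Stack,
    Bump,
}

/// Picks an allocation strategy from the escape state and whether the
/// allocation size is known at compile time.
pub fn select_strategy(escape: EscapeState, fixed_size: bool) -> AllocStrategy {
    match (escape, fixed_size) {
        (EscapeState::NoEscape, true) => AllocStrategy::Stack,
        (EscapeState::NoEscape, false) => AllocStrategy::Bump,
        // Anything visible outside the function keeps its RC header.
        (EscapeState::ArgEscape, _) | (EscapeState::GlobalEscape, _) => AllocStrategy::Heap,
    }
}
```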

Emit bump allocator prologue/epilogue in LLVM IR:

; Function prologue — allocate bump region
%bump.base = call ptr @ori_bump_alloc(i64 4096)  ; initial 4KB region
%bump.ptr = alloca ptr                            ; current bump pointer
store ptr %bump.base, ptr %bump.ptr

; Bump allocation (instead of ori_rc_alloc):
%current = load ptr, ptr %bump.ptr
%next = getelementptr i8, ptr %current, i64 %size
store ptr %next, ptr %bump.ptr
; %current is the allocated pointer — no RC header needed

; Function epilogue — free entire region
call void @ori_bump_free(ptr %bump.base)

Handle growth: if bump region is exhausted, allocate a new linked region. The ori_bump_alloc / ori_bump_free functions in ori_rt manage a linked list of regions.
No RC operations: Bump-allocated values have no refcount header. All RC inc/dec/is_unique operations are elided (same as stack-promoted values).
Interaction with COW: Bump-allocated collections are always unique (no sharing possible), so all COW checks are StaticUnique — fast path only. This combines with VSO §07 for maximum effect.
Unit tests:
- Function with temporary list (unknown size from input) → bump-allocated, no ori_rc_alloc
- Function returning list → standard heap allocation (escapes)
- Bump region growth: function allocating >4KB of temporaries → linked regions, single cleanup
/tpr-review passed — independent review found no critical or major issues (or all findings triaged)
/impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.
Subsection close-out (08.4) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-08.4 retrospective — build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.4: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
/sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.

08.5 Escape-Aware ARC Pipeline Integration

File(s): compiler/ori_arc/src/pipeline/aims_pipeline.rs (AimsPipelineConfig), compiler/ori_arc/src/aims/emit_rc/ (RC emission), compiler/ori_arc/src/lib.rs (pipeline entry)

Feed escape information into the ARC pipeline so it can skip RC operations.

Call site warning: run_arc_pipeline() is currently invoked directly from compiler/ori_arc/src/tests.rs and compiler/ori_llvm/src/codegen/function_compiler/define_phase.rs; run_arc_pipeline_all() in compiler/ori_arc/src/pipeline/mod.rs is the internal batch wrapper that must stay in sync. If this section changes the public signature, update every direct caller in the same commit.

Add EscapeInfo to run_arc_pipeline() parameters:
```
// Current signature (ori_arc/src/pipeline/mod.rs):
pub fn run_arc_pipeline(
    func: &mut ArcFunction,
    classifier: &dyn ArcClassification,
    sigs: &FxHashMap<Name, AnnotatedSig>,
    pool: &Pool,
    interner: &ori_ir::StringInterner,
    uniqueness_summaries: &FxHashMap<Name, UniquenessSummary>,
    aims_contracts: &FxHashMap<Name, MemoryContract>,
    verify_arc: bool,
    // escape_info: &EscapeInfo,  // NEW — adds a 9th parameter
) -> Vec<ArcProblem> { ... }
```
WARNING: This function already has 8 parameters, well past the 3-4 parameter guideline. Adding a 9th is not acceptable. The correct approach is to bundle escape info into an existing config struct. Options:
- (a) Extend AimsPipelineConfig: The AIMS pipeline already has AimsPipelineConfig (in ori_arc/src/pipeline/aims_pipeline.rs:43). Add escape_info: Option<&EscapeInfo> to it. AimsPipelineConfig is currently pub(crate), but that is not a blocker if the config remains constructed and consumed entirely inside ori_arc; only a cross-crate construction path would require a visibility change.
- (b) Pass via ReprPlan: Since ReprPlan already stores escape info (§01.2), pass &ReprPlan instead of &EscapeInfo directly.
- Recommendation: option (b) — ReprPlan is the natural carrier for all representation decisions including escape info. This also avoids the visibility issue with AimsPipelineConfig.
In AimsPipelineConfig (or the AIMS emit_rc path in aims/emit_rc/):
- Skip RcInc/RcDec emission for variables whose allocation site is NoEscape in EscapeInfo
- Stack-promoted allocations have no RC header → all RC ops are no-ops
In aims/emit_reuse/ (reset/reuse detection):
- Stack-promoted values are always uniquely owned → always eligible for in-place reuse
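
The RC-skipping rule can be pictured as a filter over emitted RC operations. The sketch below is illustrative only: `VarId`, `RcOp`, the shape of `EscapeInfo`, and `elide_rc_ops` are all assumptions, since the plan does not show the real types. It also ignores RC'd fields inside a promoted value, which still need their own decrement.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct VarId(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RcOp {
    Inc(VarId),
    Dec(VarId),
}

/// Hypothetical shape: the set of variables whose allocation site is NoEscape.
#[derive(Debug, Default)]
pub struct EscapeInfo {
    no_escape: HashSet<VarId>,
}

impl EscapeInfo {
    pub fn mark_no_escape(&mut self, var: VarId) {
        self.no_escape.insert(var);
    }

    pub fn is_no_escape(&self, var: VarId) -> bool {
        self.no_escape.contains(&var)
    }
}

/// Drops every RC operation on a variable whose allocation does not escape.
pub fn elide_rc_ops(ops: &[RcOp], info: &EscapeInfo) -> Vec<RcOp> {
    ops.iter()
        .copied()
        .filter(|op| {
            let var = match *op {
                RcOp::Inc(v) | RcOp::Dec(v) => v,
            };
            !info.is_no_escape(var)
        })
        .collect()
}
```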
/tpr-review passed — independent review found no critical or major issues (or all findings triaged)
/impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.
Subsection close-out (08.5) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-08.5 retrospective — build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.5: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
/sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.

08.6 Completion Checklist

Test matrix for §08 (write failing tests FIRST, verify they fail, then implement):

| Allocation pattern | Expected escape state | Expected allocation strategy | Semantic pin |
|---|---|---|---|
| `let x = Point { x: 1, y: 2 }; x.x + x.y` | `NoEscape` | Stack (alloca) | Yes: zero `ori_rc_alloc` |
| `let list = [1, 2, 3]; len(list)` | `NoEscape` | Bump (dynamic) | Yes: zero `ori_rc_alloc` |
| `let s = "hello"; print(s)` where `print` borrows | `ArgEscape` | Heap (normal) | Yes: 1 `ori_rc_alloc` |
| `fn make_point() -> Point { Point { x: 1, y: 2 } }` | `GlobalEscape` | Heap | Yes: caller owns result |
| `let closure = \|x\| x + 1` (no captures) | `ArgEscape` | Heap | Test captures correctly |
| `chan.send(value)` | `GlobalEscape` | Heap | Yes: thread boundary |
| Recursive struct `Node { value: int, next: Option<Node> }` | `GlobalEscape` | Heap | Yes: recursive |
| `let pair = (1, 2); pair.0` | `NoEscape` | Stack | Yes: tuple on stack |

Exit Criteria: A function that creates a temporary list, computes its length, and returns the length generates zero ori_rc_alloc/ori_rc_dec calls in LLVM IR, verified by grep -c "ori_rc" function.ll printing 0 (grep exits with status 1 when nothing matches, so check the printed count rather than the exit code). A function with a dynamic-size temporary collection uses ori_bump_alloc instead of ori_rc_alloc. Valgrind reports 0 heap leaks (bump regions properly freed).
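
The grep-based check could also be pinned from a Rust test. The helper below is a sketch of that idea; `count_matching_lines` is a hypothetical name that mirrors the line-counting behaviour of `grep -c` on IR text that the test has already obtained.

```rust
/// Counts the lines of `ir` that contain `needle`, like `grep -c`.
pub fn count_matching_lines(ir: &str, needle: &str) -> usize {
    ir.lines().filter(|line| line.contains(needle)).count()
}
```

A test would then assert that the count is zero for the promoted function and non-zero for a control function that returns its list.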

08.R Third Party Review Findings

[TPR-08-001][major] section-08-escape-analysis.md:223-277 — Bump allocation (§08.4) is a separate runtime subsystem (~500+ LOC across 3 crates) embedded as a 56-line subsection. §08.4 proposes new runtime functions (ori_bump_alloc, ori_bump_free), linked-list region growth, LLVM prologue/epilogue emission, COW interaction (“always StaticUnique”), and integration with existing RC infrastructure. This requires coordinated changes to ori_rt, ori_repr, and ori_llvm. The plan rates §08 as “VERY HIGH COMPLEXITY” and recommends shipping §08.1 first — but §08.4 is not separated into its own section or formally deferred. Action: Extract §08.4 into a standalone section (§08b) with its own completion checklist, exit criteria, and line estimates, OR explicitly defer bump allocation to a future plan and remove it from §08’s exit criteria. The phased recommendation at line 54 is correct but should be formalized as a section boundary.