Section 08: Escape Analysis & Stack Promotion
Context: Ori’s ARC system (ori_arc) already has liveness analysis and borrow inference. The missing piece is escape analysis — determining whether a reference-counted value’s pointer is ever stored in a location that outlives the current function (return value, closure capture, global, or another heap object’s field).
The ori_arc pipeline currently uses liveness-based analysis for RC insertion, but this is conservative: it treats all heap-allocated values as potentially escaping. True escape analysis would identify values that can be stack-promoted.
Reference implementations:
- Go cmd/compile/internal/escape/: Connection graph-based escape analysis; tracks dataflow from allocations to escape points
- Swift lib/SILOptimizer/Transforms/StackPromotion.cpp: Walks SIL to check if alloc_ref escapes
- Java HotSpot macro.cpp: Scalar replacement; replaces heap object with individual fields on stack
- Lean4 Borrow.lean: Parameter ownership inference that implicitly identifies non-escaping borrows
Depends on: §02 (triviality classification helps escape analysis — trivial values don’t need escape tracking).
Risk warning (VERY HIGH COMPLEXITY): This is the largest (~1,500 lines) and most dangerous section. Key risks:
- Connection graph escape analysis is interprocedural: it requires whole-module fixed-point iteration that interacts with ori_arc’s existing borrow inference.
- Stack promotion (§08.3) changes allocation semantics: a bug means use-after-free. Requires the most thorough Valgrind testing of any section.
- Bump allocation (§08.4) adds a new runtime allocation scheme to ori_rt that must integrate with existing COW, slice, and RC infrastructure.
- §08.5 touches the run_arc_pipeline() / AIMS integration seam, the most sensitive function in the compiler.
Recommended approach: Implement §08.1 (intraprocedural) first as a standalone pass. Ship it, measure, and verify with Valgrind before attempting §08.2 (interprocedural) or §08.4 (bump allocation).
08.1 Intraprocedural Escape Analysis
File(s): compiler/ori_repr/src/escape/mod.rs, compiler/ori_repr/src/escape/intraprocedural.rs
Start with per-function analysis (no cross-function information). This catches the most common patterns: temporary collections, intermediate strings, local structs.
- Define escape lattice:

    #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
    pub enum EscapeState {
        /// Value definitely does not escape the function
        NoEscape,
        /// Value escapes to a callee but callee doesn't retain it (borrow)
        ArgEscape,
        /// Value escapes the function (returned, stored in global, captured by closure)
        GlobalEscape,
    }

- Implement connection graph:

    pub struct ConnectionGraph {
        /// Node per allocation site + parameter + return
        nodes: Vec<CgNode>,
        /// Edges: PointsTo (field → object), Deferred (alias)
        edges: Vec<CgEdge>,
    }

    pub enum CgNode {
        /// Allocation site (heap object creation)
        Alloc { id: AllocId, escape: EscapeState },
        /// Function parameter (escape state depends on callers)
        Param { index: usize, escape: EscapeState },
        /// Function return (always GlobalEscape)
        Return,
        /// Phantom node for unknown destinations
        Unknown,
    }

    pub enum CgEdge {
        /// a.field points to b
        PointsTo { from: NodeId, field: u32, to: NodeId },
        /// a defers to b (alias: same object)
        Deferred { from: NodeId, to: NodeId },
    }

- Implement escape propagation:

    pub fn analyze_escapes(func: &ArcFunction, pool: &Pool) -> EscapeInfo {
        let mut graph = build_connection_graph(func, pool);
        // Fixed-point: propagate escape states through edges
        let mut changed = true;
        while changed {
            changed = false;
            // Index loop (NodeId assumed Copy) so set_escape can borrow the graph mutably
            for i in 0..graph.edges.len() {
                let (from, to, is_alias) = match graph.edges[i] {
                    CgEdge::PointsTo { from, to, .. } => (from, to, false),
                    CgEdge::Deferred { from, to } => (from, to, true),
                };
                let merged = graph.escape(from).max(graph.escape(to));
                // If an object escapes, whatever its fields point to escapes too
                if merged > graph.escape(to) {
                    graph.set_escape(to, merged);
                    changed = true;
                }
                // Aliases name the same object, so they share one state
                if is_alias && merged > graph.escape(from) {
                    graph.set_escape(from, merged);
                    changed = true;
                }
            }
        }
        EscapeInfo::from_graph(graph)
    }
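The fixed-point step can be tried in isolation on a toy graph. The sketch below is not the ori_repr implementation: node indices, the `Edge` type, and the `demo` scenario are hypothetical, and it assumes that escape flows from an object to whatever its fields point to while alias edges are symmetric.

```rust
// Toy model: nodes are indices into `states`; NoEscape < ArgEscape < GlobalEscape.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

// Hypothetical edge type for this sketch only.
#[derive(Debug, Clone, Copy)]
pub enum Edge {
    /// A field of `from` holds a pointer to `to`
    PointsTo { from: usize, to: usize },
    /// `from` and `to` name the same object
    Alias { from: usize, to: usize },
}

/// Raise states along edges until nothing changes (states only ever go up,
/// so the loop terminates).
pub fn propagate(states: &mut [EscapeState], edges: &[Edge]) {
    let mut changed = true;
    while changed {
        changed = false;
        for edge in edges {
            let (from, to, symmetric) = match *edge {
                Edge::PointsTo { from, to } => (from, to, false),
                Edge::Alias { from, to } => (from, to, true),
            };
            let merged = states[from].max(states[to]);
            if merged > states[to] {
                states[to] = merged;
                changed = true;
            }
            if symmetric && merged > states[from] {
                states[from] = merged;
                changed = true;
            }
        }
    }
}

/// Node 0: return slot, node 1: a list that is returned,
/// node 2: an element stored in the list, node 3: an unrelated temporary.
pub fn demo() -> Vec<EscapeState> {
    use EscapeState::*;
    let mut states = vec![GlobalEscape, NoEscape, NoEscape, NoEscape];
    let edges = [
        Edge::PointsTo { from: 1, to: 2 },
        Edge::Alias { from: 0, to: 1 },
    ];
    propagate(&mut states, &edges);
    states
}
```

In this scenario the returned list and its element both end up GlobalEscape, while the unrelated temporary stays NoEscape and remains a stack-promotion candidate.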
- Escape sources (what causes GlobalEscape):
  - Value is returned from function
  - Value is stored in a mutable reference parameter
  - Value is captured by a closure that escapes
  - Value is stored in a global variable
  - Value is passed as an owned parameter to an unknown function
  - Value is stored in a heap object’s field (and that object escapes)
- Non-escape sinks (what keeps NoEscape):
  - Value is only read (not stored)
  - Value is passed as a borrowed parameter to a known function (ArgEscape at most)
  - Value is consumed (last use) within the function
  - Value is used in pattern matching then discarded
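The two lists above amount to a mapping from each kind of use to a lattice element, with a value's state being the maximum over all of its uses. A minimal sketch of that mapping follows; the `Use` enum and its variant names are hypothetical, and treating a borrow by an unknown callee as GlobalEscape is a conservative assumption that the lists do not spell out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

/// Hypothetical summary of how one instruction uses a value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Use {
    Read,
    PatternMatch,
    LastUse,
    PassBorrowed { callee_known: bool },
    PassOwnedToUnknown,
    Return,
    StoreGlobal,
    StoreThroughMutParam,
    CaptureByClosure { closure_escapes: bool },
    StoreInField { container_escapes: bool },
}

/// Map a single use to the escape state it forces.
pub fn classify(u: Use) -> EscapeState {
    use EscapeState::*;
    match u {
        Use::Read | Use::PatternMatch | Use::LastUse => NoEscape,
        Use::PassBorrowed { callee_known: true } => ArgEscape,
        // Assumption: without a known callee, a borrow cannot be trusted
        Use::PassBorrowed { callee_known: false } => GlobalEscape,
        Use::CaptureByClosure { closure_escapes: false }
        | Use::StoreInField { container_escapes: false } => NoEscape,
        Use::PassOwnedToUnknown
        | Use::Return
        | Use::StoreGlobal
        | Use::StoreThroughMutParam
        | Use::CaptureByClosure { closure_escapes: true }
        | Use::StoreInField { container_escapes: true } => GlobalEscape,
    }
}

/// A value's state is the join (maximum) over all of its uses.
pub fn escape_of(uses: &[Use]) -> EscapeState {
    uses.iter()
        .copied()
        .map(classify)
        .max()
        .unwrap_or(EscapeState::NoEscape)
}
```

In the connection graph approach, the container and closure flags would come out of propagation rather than being known up front; they are inputs here only to keep the sketch self-contained.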
- /tpr-review passed: independent review found no critical or major issues (or all findings triaged)
- /impl-hygiene-review passed: hygiene review clean. MUST run AFTER /tpr-review is clean.
- Subsection close-out (08.1): MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (“build(diagnostics): ... — surfaced by section-08.1 retrospective”; build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.1: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
- /sync-claude section-close doc sync: verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
- Repo hygiene check: run diagnostics/repo-hygiene.sh --check and clean any detected temp files.
08.2 Interprocedural Escape Analysis
File(s): compiler/ori_repr/src/escape/interprocedural.rs
Cross-function escape analysis uses function summaries to track which parameters escape.
- Define function escape summaries:

    pub struct FunctionEscapeSummary {
        /// Which parameters escape?
        pub param_escapes: Vec<EscapeState>,
        /// Does the return value contain any input parameters?
        pub return_aliases: Vec<usize>, // param indices
    }
- Compute summaries bottom-up through the call graph:
  - Leaf functions (no callees): direct analysis
  - Non-leaf functions: use callee summaries to refine escape states
  - Recursive functions: conservative (assume all params escape) then refine
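One way to picture the bottom-up pass is shown below. It is a sketch under stated assumptions rather than the planned implementation: `ParamUse`, `FuncFacts`, and the function names in `demo` are hypothetical, functions are assumed to arrive already sorted callees-first, and the later refinement step for recursive functions is omitted, so a call whose summary is not yet available simply counts as GlobalEscape.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

/// One way a parameter is used inside a function body (hypothetical input format).
pub enum ParamUse {
    /// State already decided by the intraprocedural pass
    Local(EscapeState),
    /// Forwarded as argument `arg` of a call to `callee`
    PassedTo { callee: &'static str, arg: usize },
}

pub struct FuncFacts {
    pub name: &'static str,
    /// uses[i] lists every use of parameter i
    pub uses: Vec<Vec<ParamUse>>,
}

/// `order` must list callees before callers.
pub fn summarize(order: &[FuncFacts]) -> HashMap<&'static str, Vec<EscapeState>> {
    let mut summaries: HashMap<&'static str, Vec<EscapeState>> = HashMap::new();
    for f in order {
        let params: Vec<EscapeState> = f
            .uses
            .iter()
            .map(|uses| {
                uses.iter()
                    .map(|u| match u {
                        ParamUse::Local(state) => *state,
                        ParamUse::PassedTo { callee, arg } => summaries
                            .get(callee)
                            .and_then(|s| s.get(*arg).copied())
                            // No summary yet (recursion or unknown callee): stay conservative
                            .unwrap_or(EscapeState::GlobalEscape),
                    })
                    .max()
                    .unwrap_or(EscapeState::NoEscape)
            })
            .collect();
        summaries.insert(f.name, params);
    }
    summaries
}

pub fn demo() -> HashMap<&'static str, Vec<EscapeState>> {
    use EscapeState::*;
    let len = FuncFacts { name: "len", uses: vec![vec![ParamUse::Local(ArgEscape)]] };
    let register = FuncFacts { name: "register", uses: vec![vec![ParamUse::Local(GlobalEscape)]] };
    let wrapper = FuncFacts {
        name: "wrapper",
        uses: vec![
            vec![ParamUse::PassedTo { callee: "len", arg: 0 }],
            vec![ParamUse::PassedTo { callee: "register", arg: 0 }],
        ],
    };
    // Self-recursive: its own summary does not exist yet when it is analyzed
    let walk = FuncFacts { name: "walk", uses: vec![vec![ParamUse::PassedTo { callee: "walk", arg: 0 }]] };
    summarize(&[len, register, wrapper, walk])
}
```

Here `wrapper` inherits ArgEscape and GlobalEscape from its callees, while the self-recursive `walk` stays at GlobalEscape until some refinement pass revisits it.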
- Apply summaries at call sites:

    // At call site: f(x, y, z)
    // If f's summary says param 0 doesn't escape:
    //   → x does NOT escape through this call
    // If f's summary says param 1 escapes:
    //   → y DOES escape through this call
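The call-site rule can be written as a small function over the summary struct. This is a sketch: `apply_at_call` and its `result_state` parameter are hypothetical, and it additionally assumes that an argument listed in return_aliases escapes at least as far as the call's result does, and that a callee without a summary may retain everything.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

pub struct FunctionEscapeSummary {
    /// Escape state of each parameter inside the callee
    pub param_escapes: Vec<EscapeState>,
    /// Indices of parameters the return value may alias
    pub return_aliases: Vec<usize>,
}

/// New state of each argument after accounting for one call.
/// `result_state` is the escape state of the call's result in the caller.
pub fn apply_at_call(
    summary: Option<&FunctionEscapeSummary>,
    arg_states: &[EscapeState],
    result_state: EscapeState,
) -> Vec<EscapeState> {
    arg_states
        .iter()
        .enumerate()
        .map(|(i, &current)| match summary {
            // Unknown callee: assume it may retain every argument
            None => EscapeState::GlobalEscape,
            Some(s) => {
                let via_param = s
                    .param_escapes
                    .get(i)
                    .copied()
                    .unwrap_or(EscapeState::GlobalEscape);
                let via_return = if s.return_aliases.contains(&i) {
                    result_state
                } else {
                    EscapeState::NoEscape
                };
                current.max(via_param).max(via_return)
            }
        })
        .collect()
}

/// f(x, y, z) where f keeps y, returns something aliasing z, and the caller returns the result.
pub fn demo() -> Vec<EscapeState> {
    use EscapeState::*;
    let f = FunctionEscapeSummary {
        param_escapes: vec![NoEscape, GlobalEscape, NoEscape],
        return_aliases: vec![2],
    };
    apply_at_call(Some(&f), &[NoEscape, NoEscape, NoEscape], GlobalEscape)
}
```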
- Integration with ori_arc borrow inference:
  - The borrow inference in ori_arc::borrow already computes per-parameter ownership (Borrowed/Owned)
  - Borrowed ≈ ArgEscape (callee uses but doesn’t retain)
  - Owned ≈ GlobalEscape (callee may retain)
  - Unify these two analyses to avoid redundant computation
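The correspondence between the two analyses can be stated as a pair of conversions. The `Ownership` enum below is a hypothetical stand-in for whatever ori_arc::borrow actually exposes, and folding NoEscape into Borrowed is an assumption, since the two-valued ownership result has no direct counterpart for it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

/// Hypothetical mirror of the per-parameter result of borrow inference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ownership {
    Borrowed,
    Owned,
}

/// Borrowed ≈ ArgEscape, Owned ≈ GlobalEscape.
pub fn escape_for(ownership: Ownership) -> EscapeState {
    match ownership {
        Ownership::Borrowed => EscapeState::ArgEscape,
        Ownership::Owned => EscapeState::GlobalEscape,
    }
}

/// Reverse direction; NoEscape is folded into Borrowed (assumption).
pub fn ownership_for(escape: EscapeState) -> Ownership {
    match escape {
        EscapeState::GlobalEscape => Ownership::Owned,
        EscapeState::NoEscape | EscapeState::ArgEscape => Ownership::Borrowed,
    }
}
```

Because the escape lattice has one more element than the ownership result, a unified analysis would most naturally store the escape state and derive ownership from it.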
- /tpr-review passed: independent review found no critical or major issues (or all findings triaged)
- /impl-hygiene-review passed: hygiene review clean. MUST run AFTER /tpr-review is clean.
- Subsection close-out (08.2): MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (“build(diagnostics): ... — surfaced by section-08.2 retrospective”; build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.2: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
- /sync-claude section-close doc sync: verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
- Repo hygiene check: run diagnostics/repo-hygiene.sh --check and clean any detected temp files.
08.3 Stack Promotion Codegen
File(s): compiler/ori_llvm/src/codegen/arc_emitter/construction.rs (where ori_rc_alloc calls are emitted — replace with alloca for non-escaping values), compiler/ori_llvm/src/codegen/arc_emitter/alloc.rs (new file — allocation strategy dispatch)
When escape analysis marks an allocation as NoEscape, generate stack allocation instead of heap.
- Replace ori_rc_alloc with alloca:

    ; Before (heap):
    %ptr = call ptr @ori_rc_alloc(i64 24, i64 8)
    ; ... use ptr ...
    call void @ori_rc_dec(ptr %ptr, ptr @_ori_drop$42)

    ; After (stack):
    %ptr = alloca [24 x i8], align 8
    ; ... use ptr ...
    ; No rc_dec needed: stack memory is freed automatically
- Eliminate all RC operations for stack-promoted values:
  - No ori_rc_inc (refcount is meaningless on stack)
  - No ori_rc_dec (no free needed)
  - No drop function call (fields are dropped individually at function exit)
- Handle non-trivial fields in stack-promoted values:
  - If the struct has RC’d fields (e.g., struct { name: str, age: int }):
    - The struct itself is on the stack (no RC header)
    - The str field still has its own RC (it’s a separate heap allocation)
    - At function exit, ori_rc_dec the str field (but not the struct)
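The field rule above reduces to a filter over the struct layout: only reference-counted fields produce a decrement at function exit, and the struct itself produces none. A minimal sketch, with `FieldKind`, `Field`, and `exit_decrements` as hypothetical names rather than ori_llvm types:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldKind {
    /// Plain value such as int; nothing to release
    Scalar,
    /// Separate heap allocation with its own refcount, such as str
    RefCounted,
}

/// Hypothetical layout entry for one struct field.
pub struct Field {
    pub name: &'static str,
    pub kind: FieldKind,
}

/// Names of the fields that need an ori_rc_dec when a stack-promoted struct
/// goes out of scope. The struct itself never appears in the result.
pub fn exit_decrements(fields: &[Field]) -> Vec<&'static str> {
    fields
        .iter()
        .filter(|f| f.kind == FieldKind::RefCounted)
        .map(|f| f.name)
        .collect()
}

/// struct { name: str, age: int }
pub fn demo() -> Vec<&'static str> {
    exit_decrements(&[
        Field { name: "name", kind: FieldKind::RefCounted },
        Field { name: "age", kind: FieldKind::Scalar },
    ])
}
```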
- Lifetime extension for stack-promoted values:
  - If the value is live across a function call, the alloca must dominate the call
  - LLVM’s alloca in the entry block is lifetime-safe
  - Use llvm.lifetime.start/llvm.lifetime.end intrinsics for precise scoping
- /tpr-review passed: independent review found no critical or major issues (or all findings triaged)
- /impl-hygiene-review passed: hygiene review clean. MUST run AFTER /tpr-review is clean.
- Subsection close-out (08.3): MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (“build(diagnostics): ... — surfaced by section-08.3 retrospective”; build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.3: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
- /sync-claude section-close doc sync: verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
- Repo hygiene check: run diagnostics/repo-hygiene.sh --check and clean any detected temp files.
08.4 Bump Allocation for Non-Escaping Dynamic Values
File(s): compiler/ori_repr/src/escape/bump.rs, compiler/ori_llvm/src/codegen/arc_emitter/alloc.rs
Stack promotion (§08.3) works for fixed-size values, but not for dynamic-size collections (lists, maps, strings with unknown length). These still need heap allocation — but they don’t need malloc/free if they don’t escape.
Strategy: Emit a function-local bump allocator for non-escaping dynamic values. Bump allocation is a pointer increment — faster than any general-purpose allocator. The entire bump region is freed at function return (single free or stack unwinding).
This directly closes the “custom allocators” gap with Zig: Zig lets programmers manually pass arena allocators. Ori does it automatically based on escape analysis.
- Define bump allocation decision:

    pub enum AllocStrategy {
        /// Standard heap allocation via ori_rc_alloc
        Heap,
        /// Stack allocation via alloca (fixed-size, NoEscape)
        Stack,
        /// Bump allocation from function-local arena (dynamic-size, NoEscape)
        Bump,
    }
- Select strategy based on escape state and size:
  - NoEscape + fixed-size → Stack (§08.3)
  - NoEscape + dynamic-size → Bump
  - ArgEscape or GlobalEscape → Heap
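The selection rules above fit in a single match. In this sketch the function name `select_strategy` and the use of `Option<u64>` to represent a statically known size are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AllocStrategy {
    Heap,
    Stack,
    Bump,
}

/// `fixed_size` is Some(bytes) when the size is known at compile time,
/// None for dynamically sized values (assumed encoding).
pub fn select_strategy(escape: EscapeState, fixed_size: Option<u64>) -> AllocStrategy {
    match (escape, fixed_size) {
        (EscapeState::NoEscape, Some(_)) => AllocStrategy::Stack,
        (EscapeState::NoEscape, None) => AllocStrategy::Bump,
        // ArgEscape and GlobalEscape both keep the normal RC allocation
        (EscapeState::ArgEscape, _) | (EscapeState::GlobalEscape, _) => AllocStrategy::Heap,
    }
}
```

Spelling out both escaping arms instead of using a wildcard means that adding a lattice element later forces this match to be revisited.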
- Emit bump allocator prologue/epilogue in LLVM IR:

    ; Function prologue: allocate bump region
    %bump.base = call ptr @ori_bump_alloc(i64 4096) ; initial 4KB region
    %bump.ptr = alloca ptr                           ; current bump pointer
    store ptr %bump.base, ptr %bump.ptr

    ; Bump allocation (instead of ori_rc_alloc):
    %current = load ptr, ptr %bump.ptr
    %next = getelementptr i8, ptr %current, i64 %size
    store ptr %next, ptr %bump.ptr
    ; %current is the allocated pointer; no RC header needed

    ; Function epilogue: free entire region
    call void @ori_bump_free(ptr %bump.base)
- Handle growth: if bump region is exhausted, allocate a new linked region. The ori_bump_alloc/ori_bump_free functions in ori_rt manage a linked list of regions.
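Region growth can be modeled without any unsafe code by handing out (region, offset) pairs instead of raw pointers. The `BumpArena` type below is a toy model, not the ori_rt design: it keeps regions in a Vec rather than a linked list, and alignment is relative to the region start, so a real runtime would also have to align each region's base address.

```rust
/// Toy arena: each region is a byte buffer, allocation bumps an offset.
pub struct BumpArena {
    regions: Vec<Vec<u8>>,
    /// Bump offset within the last region
    offset: usize,
    region_size: usize,
}

impl BumpArena {
    pub fn new(region_size: usize) -> Self {
        BumpArena {
            regions: vec![vec![0u8; region_size]],
            offset: 0,
            region_size,
        }
    }

    /// Reserve `size` bytes aligned to `align` (a power of two).
    /// Returns (region index, byte offset inside that region).
    pub fn alloc(&mut self, size: usize, align: usize) -> (usize, usize) {
        debug_assert!(align.is_power_of_two());
        let aligned = (self.offset + align - 1) & !(align - 1);
        let capacity = self.regions.last().map_or(0, |r| r.len());
        if aligned + size > capacity {
            // Current region exhausted: chain a new one, oversized if needed
            self.regions.push(vec![0u8; self.region_size.max(size)]);
            self.offset = size;
            return (self.regions.len() - 1, 0);
        }
        self.offset = aligned + size;
        (self.regions.len() - 1, aligned)
    }

    pub fn region_count(&self) -> usize {
        self.regions.len()
    }
}
```

Dropping the arena releases every region at once, which corresponds to the single cleanup at function return described in the strategy paragraph.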
- No RC operations: Bump-allocated values have no refcount header. All RC inc/dec/is_unique operations are elided (same as stack-promoted values).
- Interaction with COW: Bump-allocated collections are always unique (no sharing possible), so all COW checks are StaticUnique (fast path only). This combines with VSO §07 for maximum effect.
- Unit tests:
  - Function with temporary list (unknown size from input) → bump-allocated, no ori_rc_alloc
  - Function returning list → standard heap allocation (escapes)
  - Bump region growth: function allocating >4KB of temporaries → linked regions, single cleanup
- /tpr-review passed: independent review found no critical or major issues (or all findings triaged)
- /impl-hygiene-review passed: hygiene review clean. MUST run AFTER /tpr-review is clean.
- Subsection close-out (08.4): MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (“build(diagnostics): ... — surfaced by section-08.4 retrospective”; build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.4: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
- /sync-claude section-close doc sync: verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
- Repo hygiene check: run diagnostics/repo-hygiene.sh --check and clean any detected temp files.
08.5 Escape-Aware ARC Pipeline Integration
File(s): compiler/ori_arc/src/pipeline/aims_pipeline.rs (AimsPipelineConfig), compiler/ori_arc/src/aims/emit_rc/ (RC emission), compiler/ori_arc/src/lib.rs (pipeline entry)
Feed escape information into the ARC pipeline so it can skip RC operations.
Call site warning: run_arc_pipeline() is currently invoked directly from compiler/ori_arc/src/tests.rs and compiler/ori_llvm/src/codegen/function_compiler/define_phase.rs; run_arc_pipeline_all() in compiler/ori_arc/src/pipeline/mod.rs is the internal batch wrapper that must stay in sync. If this section changes the public signature, update every direct caller in the same commit.
- Add EscapeInfo to run_arc_pipeline() parameters:

    // Current signature (ori_arc/src/pipeline/mod.rs):
    pub fn run_arc_pipeline(
        func: &mut ArcFunction,
        classifier: &dyn ArcClassification,
        sigs: &FxHashMap<Name, AnnotatedSig>,
        pool: &Pool,
        interner: &ori_ir::StringInterner,
        uniqueness_summaries: &FxHashMap<Name, UniquenessSummary>,
        aims_contracts: &FxHashMap<Name, MemoryContract>,
        verify_arc: bool,
        // escape_info: &EscapeInfo, // NEW: adds a 9th parameter
    ) -> Vec<ArcProblem> { ... }

  WARNING: This function already has 8 parameters, well past the 3-4 parameter guideline. Adding a 9th is not acceptable. The correct approach is to bundle escape info into an existing config struct. Options:
  - (a) Extend AimsPipelineConfig: The AIMS pipeline already has AimsPipelineConfig (in ori_arc/src/pipeline/aims_pipeline.rs:43). Add escape_info: Option<&EscapeInfo> to it. AimsPipelineConfig is currently pub(crate), but that is not a blocker if the config remains constructed and consumed entirely inside ori_arc; only a cross-crate construction path would require a visibility change.
  - (b) Pass via ReprPlan: Since ReprPlan already stores escape info (§01.2), pass &ReprPlan instead of &EscapeInfo directly.
  - Recommendation: option (b). ReprPlan is the natural carrier for all representation decisions including escape info. This also avoids the visibility issue with AimsPipelineConfig.
- In AimsPipelineConfig (or the AIMS emit_rc path in aims/emit_rc/):
  - Skip RcInc/RcDec emission for variables whose allocation site is NoEscape in EscapeInfo
  - Stack-promoted allocations have no RC header → all RC ops are no-ops
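The RC-skipping rule can be sketched as a filter over the operations the emitter would otherwise produce. `VarId`, `RcOp`, and `EscapePlan` are hypothetical stand-ins (the real carrier would be ReprPlan or AimsPipelineConfig, whose shapes are not shown here), and treating a variable with no recorded state as escaping is a conservative assumption.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EscapeState {
    NoEscape,
    ArgEscape,
    GlobalEscape,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct VarId(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RcOp {
    Inc(VarId),
    Dec(VarId),
}

/// Hypothetical stand-in for the escape portion of the representation plan.
pub struct EscapePlan {
    pub by_var: HashMap<VarId, EscapeState>,
}

impl EscapePlan {
    /// Variables without an entry are treated as escaping.
    pub fn needs_rc(&self, var: VarId) -> bool {
        self.by_var
            .get(&var)
            .copied()
            .unwrap_or(EscapeState::GlobalEscape)
            != EscapeState::NoEscape
    }
}

/// Keep only the RC operations whose variable still needs a refcount.
pub fn filter_rc_ops(ops: &[RcOp], plan: &EscapePlan) -> Vec<RcOp> {
    ops.iter()
        .copied()
        .filter(|op| match op {
            RcOp::Inc(v) | RcOp::Dec(v) => plan.needs_rc(*v),
        })
        .collect()
}

pub fn demo() -> Vec<RcOp> {
    let mut by_var = HashMap::new();
    by_var.insert(VarId(0), EscapeState::NoEscape);
    by_var.insert(VarId(1), EscapeState::GlobalEscape);
    let plan = EscapePlan { by_var };
    let ops = [
        RcOp::Inc(VarId(0)),
        RcOp::Inc(VarId(1)),
        RcOp::Dec(VarId(0)),
        RcOp::Dec(VarId(1)),
        RcOp::Dec(VarId(2)),
    ];
    filter_rc_ops(&ops, &plan)
}
```

Passing one plan object into the emitter, rather than a separate escape argument, is in line with the recommendation to avoid growing the run_arc_pipeline() parameter list.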
- In aims/emit_reuse/ (reset/reuse detection):
  - Stack-promoted values are always uniquely owned → always eligible for in-place reuse
- /tpr-review passed: independent review found no critical or major issues (or all findings triaged)
- /impl-hygiene-review passed: hygiene review clean. MUST run AFTER /tpr-review is clean.
- Subsection close-out (08.5): MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (“build(diagnostics): ... — surfaced by section-08.5 retrospective”; build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 08.5: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
- /sync-claude section-close doc sync: verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
- Repo hygiene check: run diagnostics/repo-hygiene.sh --check and clean any detected temp files.
08.6 Completion Checklist
Test matrix for §08 (write failing tests FIRST, verify they fail, then implement):
| Allocation pattern | Expected escape state | Expected allocation strategy | Semantic pin |
|---|---|---|---|
| let x = Point { x: 1, y: 2 }; x.x + x.y | NoEscape | Stack (alloca) | Yes: zero ori_rc_alloc |
| let list = [1, 2, 3]; len(list) | NoEscape | Bump (dynamic) | Yes: zero ori_rc_alloc |
| let s = "hello"; print(s) where print borrows | ArgEscape | Heap (normal) | Yes: 1 ori_rc_alloc |
| fn make_point() -> Point { Point { x: 1, y: 2 } } | GlobalEscape | Heap | Yes: caller owns result |
| let closure = \|x\| x + 1 (no captures) | ArgEscape | Heap | Test captures correctly |
| chan.send(value) | GlobalEscape | Heap | Yes: thread boundary |
| Recursive struct Node { value: int, next: Option<Node> } | GlobalEscape | Heap | Yes: recursive |
| let pair = (1, 2); pair.0 | NoEscape | Stack | Yes: tuple on stack |
- let list = [1, 2, 3]; len(list): list is stack-promoted (no heap alloc)
- let s = str(42); print(s): string is stack-promoted if print borrows
- let point = Point { x: 1, y: 2 }; point.x + point.y: struct on stack (no RC header)
- Dynamic-size non-escaping collection → bump-allocated (no ori_rc_alloc)
- Bump-allocated values have no RC operations and all COW checks are StaticUnique
- Values returned from functions are NOT stack/bump-promoted (correctly identified as escaping)
- Closures that capture values correctly mark those values as escaping
- ./test-all.sh green
- ./clippy-all.sh green
- ./diagnostics/valgrind-aot.sh clean (no use-after-free from premature stack deallocation or bump region reuse)
- ./diagnostics/dual-exec-verify.sh passes (eval and AOT produce identical results)
- Zero ori_rc_alloc calls for functions that only use non-escaping values
- /tpr-review passed: independent Codex review found no critical or major issues (or all findings triaged)
- /impl-hygiene-review passed: implementation hygiene review clean (phase boundaries, SSOT, algorithmic DRY, naming). MUST run AFTER /tpr-review is clean.
- /improve-tooling retrospective completed: MANDATORY at section close, after both reviews are clean. Reflect on the section’s debugging journey (which diagnostics/ scripts you ran, which command sequences you repeated, where you added ad-hoc dbg!/tracing calls, where output was hard to interpret) and identify any tool/log/diagnostic improvement that would have made this section materially easier OR that would help the next section touching this area. Implement every accepted improvement NOW (zero deferral) and commit each via SEPARATE /commit-push. The retrospective is mandatory even when nothing felt painful; that is exactly when blind spots accumulate. See .claude/skills/improve-tooling/SKILL.md “Retrospective Mode” for the full protocol.
Exit Criteria: A function that creates a temporary list, computes its length, and returns the length generates ZERO ori_rc_alloc/ori_rc_dec calls in LLVM IR. Verified by grep -c "ori_rc" function.ll returning 0. A function with a dynamic-size temporary collection uses ori_bump_alloc instead of ori_rc_alloc. Valgrind reports 0 heap leaks (bump regions properly freed).
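The grep-based exit check counts lines, not occurrences, and the same count is easy to reproduce inside a test harness. The sketch below assumes nothing about the real compiler output: both IR fragments are hand-written illustrations, and @ori_list_len and @drop_list are made-up symbol names.

```rust
/// Number of lines that mention `needle`, the same quantity `grep -c` reports.
pub fn count_matching_lines(ir: &str, needle: &str) -> usize {
    ir.lines().filter(|line| line.contains(needle)).count()
}

// Hypothetical IR for a function whose temporary list was promoted.
pub const PROMOTED_IR: &str = "\
%list = alloca [24 x i8], align 8
%len = call i64 @ori_list_len(ptr %list)
ret i64 %len
";

// Hypothetical IR for the same function without promotion.
pub const HEAP_IR: &str = "\
%list = call ptr @ori_rc_alloc(i64 24, i64 8)
%len = call i64 @ori_list_len(ptr %list)
call void @ori_rc_dec(ptr %list, ptr @drop_list)
ret i64 %len
";
```

One practical detail when scripting the check: grep exits with status 1 when the count is 0, so a shell wrapper running under set -e needs to tolerate that exit code.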
08.R Third Party Review Findings
- [TPR-08-001][major] section-08-escape-analysis.md:223-277: Bump allocation (§08.4) is a separate runtime subsystem (~500+ LOC across 3 crates) embedded as a 56-line subsection. §08.4 proposes new runtime functions (ori_bump_alloc, ori_bump_free), linked-list region growth, LLVM prologue/epilogue emission, COW interaction (“always StaticUnique”), and integration with existing RC infrastructure. This requires coordinated changes to ori_rt, ori_repr, and ori_llvm. The plan rates §08 as “VERY HIGH COMPLEXITY” and recommends shipping §08.1 first, but §08.4 is not separated into its own section or formally deferred. Action: Extract §08.4 into a standalone section (§08b) with its own completion checklist, exit criteria, and line estimates, OR explicitly defer bump allocation to a future plan and remove it from §08’s exit criteria. The phased recommendation at line 54 is correct but should be formalized as a section boundary.