ARC IR
Why a Separate IR for Memory Management?
Compilers use intermediate representations to bridge the gap between what programmers write and what machines execute. Each IR is designed for a specific kind of analysis, and the canonical IR that enters the ARC system — a typed expression tree with implicit control flow — is the wrong shape for reference counting analysis.
Reference counting decisions depend on control flow: when does a variable’s last use occur? Does a value flow down both branches of an if/else? Is a value captured by a closure, and does it therefore outlive its declaring scope? Answering these questions requires explicit control flow — basic blocks, terminators, and a control flow graph over which dataflow equations can be solved.
The classical solution, used by virtually every optimizing compiler, is to lower the high-level AST into a basic-block IR where control flow is explicit. LLVM has LLVM IR. Rust has MIR. GHC has STG and then Cmm. Lean 4 has LCNF (Lean Compiler Normal Form). Swift has SIL (Swift Intermediate Language). Each of these IRs shares the same fundamental structure: functions contain blocks, blocks contain sequential instructions and a terminator, and values are referenced by SSA-like variable IDs.
Ori’s ARC IR follows this pattern. It is lowered from the canonical IR (sugar-free expression tree) into basic blocks with explicit jumps, branches, and switches. Variables are SSA-like — each binding creates a fresh ArcVarId, and mutable variables are handled through block parameters at control flow merge points (the same mechanism as LLVM’s phi nodes, but using block arguments instead). This structure makes liveness analysis, borrow inference, RC insertion, and reuse detection straightforward applications of standard dataflow algorithms.
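To make the shape concrete, here is a hedged sketch of what a small if/else might lower to. The textual syntax is invented for exposition (the IR is an in-memory data structure, not a text format); only the block/terminator/block-parameter structure reflects what the section describes.

```
fn abs_diff(a, b):
  b0:                         ; entry block
    c = PrimOp lt(a, b)
    Branch c, then: b1, else: b2
  b1:
    d1 = PrimOp sub(b, a)
    Jump b3(d1)               ; pass this branch's value as a block argument
  b2:
    d2 = PrimOp sub(a, b)
    Jump b3(d2)
  b3(d):                      ; block parameter: the phi-like merge point
    Return d
```

Each branch produces its own fresh variable (d1, d2), and the merge block b3 receives the surviving value as a block parameter rather than via a phi node.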
What Makes ARC IR Distinctive
ARC IR is not a general-purpose low-level IR like LLVM IR — it is a domain-specific IR for reference counting analysis. Every design decision serves this purpose:
- Type classification is embedded: every variable carries its ArcClass (Scalar, DefiniteRef, PossibleRef), so RC passes never query the type system
- Ownership annotations are first-class: function parameters carry Ownership, call arguments carry ArgOwnership
- RC operations are explicit instructions: RcInc and RcDec are instructions with embedded RcStrategy, so the LLVM emitter pattern-matches directly without type queries
- Reuse is a first-class concept: Reset/Reuse instructions model in-place allocation reuse as IR constructs, not ad-hoc pattern matching
- Value representation is pre-computed: each variable’s ValueRepr (Scalar, RcPointer, Aggregate, FatValue) is computed once and stored in a parallel array
Function Structure
ArcFunction
A complete function body in ARC IR. Contains everything needed for all ARC analysis passes:
| Field | Type | Purpose |
|---|---|---|
name | Name | Interned function name |
params | Vec<ArcParam> | Parameters with ownership annotations |
return_type | Idx | Return type (Pool index) |
blocks | Vec<ArcBlock> | Basic blocks in definition order |
entry | ArcBlockId | Entry block ID |
var_types | Vec<Idx> | Type per variable, indexed by ArcVarId |
var_reprs | Vec<ValueRepr> | Machine representation per variable (populated by compute_var_reprs) |
cow_annotations | CowAnnotations | Per-COW operation mode (from uniqueness analysis) |
drop_hints | DropHints | RcDec targets proven unique (skip full drop path) |
is_fbip | bool | Whether #fbip annotation is present for reuse enforcement |
num_captures | usize | Leading parameters that are closure captures (0 for top-level functions) |
The var_reprs field starts empty after lowering and is populated by compute_var_reprs at the beginning of the ARC pipeline. This means the lowering phase does not need to know about machine-level representation — it works purely at the type level.
ArcBlock
A basic block: optional parameters for control flow merges, a sequential instruction body, and a terminator.
| Field | Type | Purpose |
|---|---|---|
id | ArcBlockId | Block identifier |
params | Vec<(ArcVarId, Idx)> | Block parameters (phi-node equivalent) |
body | Vec<ArcInstr> | Instructions in execution order |
terminator | ArcTerminator | How control exits this block |
Block parameters are ARC IR’s phi-node mechanism. When a mutable variable diverges across branches (if/else, match arms), the merge block receives both versions as block parameters, and each predecessor’s Jump terminator passes its version as an argument. This is the same approach used by MLIR and Swift SIL — structurally simpler than LLVM’s phi nodes and easier to manipulate during ARC passes.
ID Types
Both ArcVarId and ArcBlockId are #[repr(transparent)] newtypes over u32, allocated sequentially starting from 0. They provide raw() -> u32 and index() -> usize for indexing into parallel arrays (var_types, var_reprs). The newtype discipline prevents accidentally using a block ID as a variable ID or vice versa.
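A minimal sketch of the newtype discipline, assuming a typical derive list (the exact derives are an assumption, not confirmed by the source):

```rust
// Hedged sketch of the ID newtypes; derives and visibility are assumptions.
#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct ArcVarId(u32);

#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct ArcBlockId(u32);

impl ArcVarId {
    pub fn raw(self) -> u32 { self.0 }
    pub fn index(self) -> usize { self.0 as usize }
}

fn main() {
    let v = ArcVarId(3);
    // Index into a parallel array such as var_types:
    let var_types = ["int", "str", "bool", "[int]"];
    assert_eq!(var_types[v.index()], "[int]");
    // Passing an ArcBlockId where an ArcVarId is expected fails to
    // compile -- the whole point of the newtype discipline.
    println!("{}", v.raw());
}
```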
ArcParam
A function parameter annotated with ownership:
| Field | Type | Purpose |
|---|---|---|
var | ArcVarId | Variable ID for this parameter |
ty | Idx | Type in the Pool |
ownership | Ownership | Owned or Borrowed (refined by borrow inference) |
All parameters start as Owned during lowering and may be refined to Borrowed by borrow inference. This “start conservative, refine” approach ensures correctness — an Owned parameter that should be Borrowed wastes RC operations but is correct, while a Borrowed parameter that should be Owned would be a use-after-free.
Instruction Set
ARC IR has 14 instruction variants, organized by purpose:
flowchart TB
subgraph Core["Core Operations"]
Let["Let — bind value"]
Apply["Apply — direct call"]
ApplyInd["ApplyIndirect — closure call"]
Partial["PartialApply — capture"]
Project["Project — field extract"]
Construct["Construct — build value"]
Select["Select — conditional value"]
end
subgraph RC["Reference Counting"]
RcInc["RcInc — increment"]
RcDec["RcDec — decrement + drop"]
IsShared["IsShared — test RC > 1"]
end
subgraph Mutation["In-Place Mutation"]
Set["Set — field write"]
SetTag["SetTag — enum tag write"]
end
subgraph Reuse["Allocation Reuse"]
Reset["Reset — mark for reuse"]
ReuseI["Reuse — construct with token"]
end
classDef native fill:#5c3a1e,stroke:#f59e0b,color:#fef3c7
class Let,Apply,ApplyInd,Partial,Project,Construct,Select,RcInc,RcDec,IsShared,Set,SetTag,Reset,ReuseI native
Core Operations
| Instruction | Purpose |
|---|---|
Let { dst, ty, value } | Bind an ArcValue (variable reference, literal, or primitive operation) to a new variable |
Apply { dst, ty, func, args, arg_ownership } | Direct function call — func is a Name, resolved at link time |
ApplyIndirect { dst, ty, closure, args } | Call through a closure’s fat pointer ({fn_ptr, env_ptr}) |
PartialApply { dst, ty, func, args } | Capture arguments into a closure environment |
Project { dst, ty, value, field } | Extract field field (by index) from a struct, enum payload, or tuple |
Construct { dst, ty, ctor, args } | Build a composite value — structs, enum variants, tuples, collections, closures |
Select { dst, ty, cond, true_val, false_val } | Conditional value selection (maps directly to LLVM select) |
Value-producing instructions return Some(dst) from defined_var(). The ArcValue type used in Let has three variants: Var(ArcVarId) for referencing an existing variable, Literal(LitValue) for constants, and PrimOp { op, args } for primitive arithmetic and comparison operations.
The CtorKind enum on Construct distinguishes: Struct(Name), EnumVariant { enum_name, variant }, Tuple, ListLiteral, MapLiteral, SetLiteral, and Closure { func }.
Reference Counting Operations
| Instruction | Purpose |
|---|---|
RcInc { var, count, strategy } | Increment reference count. count allows batched increments (e.g., a variable used in 3 places gets RcInc { count: 2 } — one use is “free”). strategy determines how to find the RC header. |
RcDec { var, strategy } | Decrement reference count. If the count reaches zero, invoke the drop function to free the allocation. |
IsShared { dst, var } | Test whether var’s reference count is greater than 1. Returns a boolean. Used as a guard before in-place mutation (COW) and reuse expansion. |
These instructions are not present after lowering — they are inserted by the RC insertion pass (Perceus algorithm) based on liveness analysis and ownership information. The RcStrategy embedded in each instruction tells the LLVM emitter exactly how to emit the operation, with no further type queries needed.
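As a hedged sketch of what the insertion pass produces (textual syntax invented for exposition), consider an Owned list parameter consumed twice — per the batching rule above, two consuming uses need only one increment:

```
; before RC insertion -- xs is an Owned parameter of type [int]
b0:
  p = Construct Tuple(xs, xs)     ; xs consumed by both tuple slots
  Return p

; after RC insertion -- two consuming uses, one is "free"
b0:
  RcInc xs, count: 1
  p = Construct Tuple(xs, xs)
  Return p
```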
In-Place Mutation
| Instruction | Purpose |
|---|---|
Set { base, field, value } | Write a value to a field of a struct or tuple. Only valid when the base is uniquely owned (guarded by IsShared). |
SetTag { base, tag } | Update the discriminant tag of an enum. Only valid when uniquely owned. |
These instructions support COW (copy-on-write) semantics: before mutating, the compiler inserts an IsShared check. If the value is shared, the slow path copies it; if unique, Set/SetTag mutate in place.
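This is the same discipline Rust exposes through Rc::make_mut, which makes for a runnable analogy — illustrative only, since Ori's runtime uses its own RC headers rather than Rust's Rc:

```rust
use std::rc::Rc;

// Analogy: Rc::make_mut performs the same IsShared-then-copy dance
// the compiler emits around Set/SetTag.
fn main() {
    let mut a = Rc::new(vec![1, 2, 3]);
    let b = Rc::clone(&a);          // refcount is now 2: "shared"
    Rc::make_mut(&mut a).push(4);   // shared -> clone first, mutate the copy
    assert_eq!(*a, vec![1, 2, 3, 4]);
    assert_eq!(*b, vec![1, 2, 3]);  // the other reference is untouched

    let mut c = Rc::new(vec![9]);
    Rc::make_mut(&mut c).push(8);   // unique -> mutate in place, no copy
    assert_eq!(*c, vec![9, 8]);
    println!("ok");
}
```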
Allocation Reuse
| Instruction | Purpose |
|---|---|
Reset { var, token } | Mark a value for potential in-place reuse. The token is an ArcVarId used to track the reuse opportunity. |
Reuse { token, dst, ty, ctor, args } | Construct a new value using the memory from a Reset token. If the reset value was uniquely owned, this is a zero-allocation construction. |
Reset/Reuse are intermediate instructions — they are inserted by the reset/reuse detection pass and then expanded by the expansion pass into IsShared guards with fast-path (in-place) and slow-path (fresh allocation) branches. After expansion, no Reset or Reuse instructions remain in the IR.
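A hedged sketch of what expansion might produce (invented syntax; the slow-path decrement and fast-path field handling are extrapolated from the guard structure described above, not confirmed details):

```
; before expansion
  Reset old, token: t
  n = Reuse t, Cons(x, rest)

; after expansion -- no Reset/Reuse remains
  s = IsShared old
  Branch s, then: b_slow, else: b_fast
b_fast:
  ; old is unique: release its RC-containing fields, then
  ; construct Cons(x, rest) directly into old's memory
  n1 = ...in-place construction...
  Jump b_join(n1)
b_slow:
  RcDec old
  n2 = Construct Cons(x, rest)   ; fresh allocation
  Jump b_join(n2)
b_join(n):
  ...
```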
Terminators
Every block ends with exactly one terminator:
| Terminator | Purpose |
|---|---|
Return { value } | Return from the function |
Jump { target, args } | Unconditional jump, passing values as block parameters |
Branch { cond, then_block, else_block } | Conditional branch on a boolean |
Switch { scrutinee, cases, default } | Multi-way branch on an integer discriminant (from pattern match compilation) |
Invoke { dst, ty, func, args, arg_ownership, normal, unwind } | Function call with unwinding support — success continues to normal, panic transfers to unwind |
Resume | Resume unwinding after cleanup (re-raise a caught panic) |
Unreachable | Marks a block as provably unreachable (used after exhaustive match) |
Terminators provide used_vars(), uses_var(target), and substitute_var(old, new) for liveness analysis, reuse expansion, and RC identity propagation.
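A minimal sketch of how substitute_var might look on a few terminator variants — the variant shapes follow the table above, but derives, field handling, and the helper itself are assumptions:

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ArcVarId(u32);

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ArcBlockId(u32);

// Subset of the terminator enum; the real type has more variants.
#[derive(Debug, PartialEq)]
enum ArcTerminator {
    Return { value: ArcVarId },
    Jump { target: ArcBlockId, args: Vec<ArcVarId> },
    Branch { cond: ArcVarId, then_block: ArcBlockId, else_block: ArcBlockId },
}

impl ArcTerminator {
    // Rewrite every use of `old` to `new` -- the shape of the helper
    // that reuse expansion and RC identity propagation rely on.
    fn substitute_var(&mut self, old: ArcVarId, new: ArcVarId) {
        let subst = |v: &mut ArcVarId| if *v == old { *v = new };
        match self {
            ArcTerminator::Return { value } => subst(value),
            ArcTerminator::Jump { args, .. } => args.iter_mut().for_each(subst),
            ArcTerminator::Branch { cond, .. } => subst(cond),
        }
    }
}

fn main() {
    let mut t = ArcTerminator::Jump {
        target: ArcBlockId(2),
        args: vec![ArcVarId(0), ArcVarId(1)],
    };
    t.substitute_var(ArcVarId(1), ArcVarId(7));
    assert_eq!(
        t,
        ArcTerminator::Jump { target: ArcBlockId(2), args: vec![ArcVarId(0), ArcVarId(7)] }
    );
    println!("ok");
}
```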
Type Classification
The three-way ArcClass classification is the foundation of all RC decisions. Every downstream pass — borrow inference, RC insertion, reset/reuse, elimination — consults the classification to determine whether a variable needs RC operations at all.
flowchart TB
Type["Type (Idx)"] --> Classify{"ArcClassifier"}
Classify -->|"int, float, bool,
char, byte, void,
Duration, Size,
Ordering"| Scalar["Scalar
No RC needed"]
Classify -->|"str, [T], {K:V},
Set, Channel,
(P) -> R"| Definite["DefiniteRef
Always needs RC"]
Classify -->|"Unresolved
type variable"| Possible["PossibleRef
Conservative fallback"]
classDef native fill:#5c3a1e,stroke:#f59e0b,color:#fef3c7
class Type,Classify,Scalar,Definite,Possible native
Classification is monomorphized — it operates on concrete types after type parameter substitution. PossibleRef should never appear after monomorphization; encountering it post-mono is a compiler bug.
Misclassification is catastrophic:
- Classifying a DefiniteRef as Scalar means missing RC operations → use-after-free or memory leak
- Classifying a Scalar as DefiniteRef means unnecessary RC operations → performance bug (but correct)
Transitive Classification
Compound types are classified by their children. A tuple (int, str) contains str (DefiniteRef), so the tuple itself is DefiniteRef — it must participate in RC because decrementing it must propagate to its str field. A tuple (int, bool) is Scalar — no fields need RC, so the tuple doesn’t either.
The rule is conservative: if any child is DefiniteRef, the compound is DefiniteRef. If any child is PossibleRef (and none is DefiniteRef), the compound is PossibleRef. Only if all children are Scalar is the compound Scalar.
Recursive types are detected via a cycle-detection set — if classification encounters an Idx already being classified, the type requires heap indirection and is DefiniteRef.
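The conservative join above can be sketched as a small fold over the children's classes (function name and signature are illustrative, not the compiler's actual API):

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ArcClass { Scalar, DefiniteRef, PossibleRef }

// Hedged sketch: the compound takes the "most ref-like" class of its
// children -- DefiniteRef dominates PossibleRef, which dominates Scalar.
fn classify_compound(children: &[ArcClass]) -> ArcClass {
    let mut acc = ArcClass::Scalar;
    for &c in children {
        acc = match (acc, c) {
            (ArcClass::DefiniteRef, _) | (_, ArcClass::DefiniteRef) => ArcClass::DefiniteRef,
            (ArcClass::PossibleRef, _) | (_, ArcClass::PossibleRef) => ArcClass::PossibleRef,
            _ => ArcClass::Scalar,
        };
    }
    acc
}

fn main() {
    use ArcClass::*;
    assert_eq!(classify_compound(&[Scalar, DefiniteRef]), DefiniteRef); // (int, str)
    assert_eq!(classify_compound(&[Scalar, Scalar]), Scalar);           // (int, bool)
    assert_eq!(classify_compound(&[Scalar, PossibleRef]), PossibleRef);
    println!("ok");
}
```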
ArcClassifier
The ArcClassifier wraps a Pool reference with a memoization cache and cycle-detection set. It provides a fast path for pre-interned primitives (Pool indices 0-11): int through ordering are classified by raw index without hash map lookup, producing instant Scalar classification. str (index 11) is the exception — it classifies as DefiniteRef.
Iterators and DoubleEndedIterator are classified as Scalar despite being heap-allocated, because they are runtime-managed (allocated with Box::new, not ori_rc_alloc) and lack RC headers.
Value Representation
ValueRepr bridges the type-level ArcClass to machine-level layout. Computed once per variable by compute_var_reprs at the start of the pipeline, then stored in the var_reprs parallel array.
| Repr | Layout | Examples |
|---|---|---|
Scalar | Register-width, no RC | int, float, bool, char, byte, void |
RcPointer | Single heap pointer | [T], {K: V}, Set<T>, Channel<T> |
Aggregate | Multi-field value | Tuples, structs, enums, Option<T>, Result<T, E> |
FatValue | Two-word: pointer + metadata | str (ptr + len), closures (fn_ptr + env_ptr) |
The derivation depends on both ArcClass and the Pool tag: Scalar classes always produce Scalar repr. For ref-containing classes, Tag::Str and Tag::Function produce FatValue, Tag::Tuple/Struct/Enum/Result/Option produce Aggregate, and collection types produce RcPointer.
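The derivation reads naturally as a match over the (ArcClass, Tag) pair. A hedged sketch, with a much-reduced Tag enum (the real Pool tag set is larger, and the function name is illustrative):

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ArcClass { Scalar, DefiniteRef, PossibleRef }

// Illustrative subset of the Pool's type tags.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Tag { Int, Str, Function, Tuple, Struct, Enum, Result, Option, List, Map, Set, Channel }

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ValueRepr { Scalar, RcPointer, Aggregate, FatValue }

// Hedged sketch of the derivation rules described above.
fn value_repr(class: ArcClass, tag: Tag) -> ValueRepr {
    match (class, tag) {
        (ArcClass::Scalar, _) => ValueRepr::Scalar,
        (_, Tag::Str | Tag::Function) => ValueRepr::FatValue,
        (_, Tag::Tuple | Tag::Struct | Tag::Enum | Tag::Result | Tag::Option) => {
            ValueRepr::Aggregate
        }
        _ => ValueRepr::RcPointer, // collections: [T], {K: V}, Set<T>, Channel<T>
    }
}

fn main() {
    assert_eq!(value_repr(ArcClass::Scalar, Tag::Int), ValueRepr::Scalar);
    assert_eq!(value_repr(ArcClass::DefiniteRef, Tag::Str), ValueRepr::FatValue);
    assert_eq!(value_repr(ArcClass::DefiniteRef, Tag::Tuple), ValueRepr::Aggregate);
    assert_eq!(value_repr(ArcClass::DefiniteRef, Tag::List), ValueRepr::RcPointer);
    println!("ok");
}
```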
RC Strategy
RcStrategy refines ValueRepr specifically for RC operations — it encodes exactly how to find and manipulate the reference count for a given value shape. It is computed during RC insertion and embedded in the RcInc/RcDec instructions themselves, so the LLVM emitter pattern-matches directly without any Pool queries at emission time.
| Strategy | How It Works |
|---|---|
HeapPointer | Single pointer — call ori_rc_inc(ptr) / ori_rc_dec(ptr, drop_fn) directly |
FatPointer | Two-word value (e.g., str) — extract field 1 (data pointer), then inc/dec |
Closure | Fat pointer with nullable env — extract env_ptr, null-check (bare functions have no env), then inc/dec |
AggregateFields | Multi-field struct/tuple — traverse RC-containing fields and inc/dec each recursively |
InlineEnum | Stack-allocated enum — tag-switch, then per-variant field handling |
The key design insight: RcStrategy is computed once and never recomputed. It is the final answer to “how do you manage RC for this variable?” — all downstream consumers (LLVM emitter, reuse expansion, drop generation) read it directly from the instruction.
ARC IR vs Canonical IR
| Property | Canonical IR | ARC IR |
|---|---|---|
| Control flow | Implicit (nested expressions) | Explicit basic blocks with terminators |
| Names | Scoped lexical names (Name) | SSA variables (ArcVarId) |
| Merge points | Expression nesting | Block parameters on Jump |
| Mutable variables | Rebinding in scope | Fresh ArcVarId per assignment; merge via block params |
| Function calls | Nested expression | Apply (direct), ApplyIndirect (closure), Invoke (may-unwind) |
| RC operations | None (implicit in value semantics) | Explicit RcInc/RcDec with strategy |
| Reuse | None | Reset/Reuse intermediates |
| Types | Per-expression via arena | Parallel var_types array indexed by ArcVarId |
| Ownership | None | Per-parameter Ownership, per-argument ArgOwnership |
Prior Art
Lean 4’s LCNF (Lean Compiler Normal Form) is the closest analog to Ori’s ARC IR. LCNF is a basic-block IR with explicit let bindings, function applications, projections, and constructors — essentially the same instruction set as ARC IR. Lean’s RC operations (inc, dec, reset, reuse) are also explicit instructions. The key structural difference is that Lean’s IR is expression-oriented with explicit join points, while Ori uses block parameters for phi-like merges.
Swift’s SIL (Swift Intermediate Language) is a similar basic-block IR with explicit strong_retain and strong_release instructions. SIL is more complex than ARC IR because it supports Swift’s full ownership model (borrowing, move-only types, coroutines), while ARC IR is focused specifically on RC optimization for value semantics. SIL’s type classification is implicit in its ownership rules, while ARC IR’s ArcClass is an explicit three-way enum.
Rust’s MIR (Mid-level IR) shares the basic-block structure but has no RC instructions — Rust uses ownership and drop glue instead. MIR uses Place and Rvalue instead of SSA variables, making it more suited to Rust’s borrow-checker semantics. ARC IR’s SSA-like structure is simpler for the dataflow analyses it needs.
Design Tradeoffs
Block parameters vs phi nodes. ARC IR uses block parameters on Jump terminators for merge points, rather than LLVM-style phi nodes at the start of blocks. Block parameters are easier to manipulate during ARC passes (adding a parameter doesn’t require updating phi nodes in other blocks) and map naturally to function arguments during lowering. The tradeoff is that some LLVM optimizations are designed around phi nodes, so the ARC→LLVM translation must convert between the two representations.
Embedded strategy vs late binding. The RcStrategy is computed during RC insertion and embedded in every RcInc/RcDec instruction. The alternative — computing strategy at emission time from the variable’s type — would keep the IR simpler but force the LLVM emitter to query the Pool for every RC operation. Embedding the strategy makes the emitter a tight, stateless loop over instructions, at the cost of a slightly larger IR.
Three-way classification vs two-way. The PossibleRef class exists for pre-monomorphization conservative analysis. A simpler two-way system (Scalar vs Ref) would eliminate the third case but require monomorphization to complete before any ARC analysis can begin. The three-way system allows ARC analysis to start earlier, though in practice monomorphization happens first and PossibleRef rarely appears.
Domain-specific IR vs reusing LLVM IR. ARC analysis could theoretically be done on LLVM IR directly, using LLVM’s own SSA variables and basic blocks. But LLVM IR lacks the semantic information that ARC analysis needs — constructors, projections, ownership annotations, reuse tokens. A domain-specific IR makes these concepts first-class, at the cost of an additional lowering step (CanExpr → ARC IR → LLVM IR instead of CanExpr → LLVM IR directly).