ARC IR

Why a Separate IR for Memory Management?

Compilers use intermediate representations to bridge the gap between what programmers write and what machines execute. Each IR is designed for a specific kind of analysis, and the canonical IR that enters the ARC system — a typed expression tree with implicit control flow — is the wrong shape for reference counting analysis.

Reference counting decisions depend on control flow: when does a variable’s last use occur? Does a value flow down both branches of an if/else? Is a value captured by a closure and thus outlives its declaring scope? Answering these questions requires explicit control flow — basic blocks, terminators, and a control flow graph where dataflow equations can be solved.

The classical solution, used by virtually every optimizing compiler, is to lower the high-level AST into a basic-block IR where control flow is explicit. LLVM has LLVM IR. Rust has MIR. GHC has STG and then Cmm. Lean 4 has LCNF (Lean Compiler Normal Form). Swift has SIL (Swift Intermediate Language). Each of these IRs shares the same fundamental structure: functions contain blocks, blocks contain sequential instructions and a terminator, and values are referenced by SSA-like variable IDs.

Ori’s ARC IR follows this pattern. It is lowered from the canonical IR (sugar-free expression tree) into basic blocks with explicit jumps, branches, and switches. Variables are SSA-like — each binding creates a fresh ArcVarId, and mutable variables are handled through block parameters at control flow merge points (the same mechanism as LLVM’s phi nodes, but using block arguments instead). This structure makes liveness analysis, borrow inference, RC insertion, and reuse detection straightforward applications of standard dataflow algorithms.

What Makes ARC IR Distinctive

ARC IR is not a general-purpose low-level IR like LLVM IR — it is a domain-specific IR for reference counting analysis. Every design decision serves this purpose:

  • Type classification is embedded: every variable carries its ArcClass (Scalar, DefiniteRef, PossibleRef), so RC passes never query the type system
  • Ownership annotations are first-class: function parameters carry Ownership, call arguments carry ArgOwnership
  • RC operations are explicit instructions: RcInc and RcDec are instructions with embedded RcStrategy, so the LLVM emitter pattern-matches directly without type queries
  • Reuse is a first-class concept: Reset/Reuse instructions model in-place allocation reuse as IR constructs, not ad-hoc pattern matching
  • Value representation is pre-computed: each variable’s ValueRepr (Scalar, RcPointer, Aggregate, FatValue) is computed once and stored in a parallel array

Function Structure

ArcFunction

A complete function body in ARC IR. Contains everything needed for all ARC analysis passes:

| Field | Type | Purpose |
|---|---|---|
| name | Name | Interned function name |
| params | Vec<ArcParam> | Parameters with ownership annotations |
| return_type | Idx | Return type (Pool index) |
| blocks | Vec<ArcBlock> | Basic blocks in definition order |
| entry | ArcBlockId | Entry block ID |
| var_types | Vec<Idx> | Type per variable, indexed by ArcVarId |
| var_reprs | Vec<ValueRepr> | Machine representation per variable (populated by compute_var_reprs) |
| cow_annotations | CowAnnotations | Per-COW operation mode (from uniqueness analysis) |
| drop_hints | DropHints | RcDec targets proven unique (skip full drop path) |
| is_fbip | bool | Whether #fbip annotation is present for reuse enforcement |
| num_captures | usize | Leading parameters that are closure captures (0 for top-level functions) |

The var_reprs field starts empty after lowering and is populated by compute_var_reprs at the beginning of the ARC pipeline. This means the lowering phase does not need to know about machine-level representation — it works purely at the type level.
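The two-phase pattern can be sketched as follows. This is a hypothetical reconstruction, not the real implementation: the names ArcFunction, ValueRepr, and compute_var_reprs follow the text, but the type encoding and match arms are invented stand-ins.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueRepr { Scalar, RcPointer, Aggregate, FatValue }

struct ArcFunction {
    var_types: Vec<u32>,       // Pool indices (stand-in encoding)
    var_reprs: Vec<ValueRepr>, // empty until compute_var_reprs runs
}

// Fill the parallel array: one repr per variable, indexed by ArcVarId.
fn compute_var_reprs(f: &mut ArcFunction) {
    f.var_reprs = f.var_types.iter().map(|&ty| match ty {
        0..=10 => ValueRepr::Scalar, // pretend 0..=10 are scalar primitives
        11 => ValueRepr::FatValue,   // pretend index 11 is `str`
        _ => ValueRepr::RcPointer,   // everything else: single heap pointer
    }).collect();
}

fn main() {
    let mut f = ArcFunction { var_types: vec![0, 11, 42], var_reprs: vec![] };
    compute_var_reprs(&mut f);
    assert_eq!(f.var_reprs.len(), f.var_types.len());
    println!("{:?}", f.var_reprs);
}
```

The point of the sketch is the shape, not the classification rules: lowering produces var_types, the first ARC pass derives var_reprs from it, and every later pass indexes both arrays by ArcVarId.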

ArcBlock

A basic block: optional parameters for control flow merges, a sequential instruction body, and a terminator.

| Field | Type | Purpose |
|---|---|---|
| id | ArcBlockId | Block identifier |
| params | Vec<(ArcVarId, Idx)> | Block parameters (phi-node equivalent) |
| body | Vec<ArcInstr> | Instructions in execution order |
| terminator | ArcTerminator | How control exits this block |

Block parameters are ARC IR’s phi-node mechanism. When a mutable variable diverges across branches (if/else, match arms), the merge block receives both versions as block parameters, and each predecessor’s Jump terminator passes its version as an argument. This is the same approach used by MLIR and Swift SIL — structurally simpler than LLVM’s phi nodes and easier to manipulate during ARC passes.
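As a concrete illustration, a variable reassigned in one branch might lower as follows. The textual syntax here is invented for exposition (ARC IR is an in-memory structure, not a text format):

```
bb0():                      // count = 0
    %0 = 0
    branch %cond, bb1, bb2
bb1():                      // then-branch: count = 1
    %1 = 1
    jump bb3(%1)
bb2():                      // else-branch: count unchanged
    jump bb3(%0)
bb3(%2):                    // %2 merges both versions of `count`
    return %2
```

Each predecessor's Jump passes its version of the variable as an argument, and the merge block binds the result as the block parameter %2.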

ID Types

Both ArcVarId and ArcBlockId are #[repr(transparent)] newtypes over u32, allocated sequentially starting from 0. They provide raw() -> u32 and index() -> usize for indexing into parallel arrays (var_types, var_reprs). The newtype discipline prevents accidentally using a block ID as a variable ID or vice versa.
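A minimal sketch of this newtype discipline, assuming derive lists and method bodies that the text does not specify:

```rust
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ArcVarId(u32);

#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ArcBlockId(u32);

impl ArcVarId {
    fn raw(self) -> u32 { self.0 }
    fn index(self) -> usize { self.0 as usize }
}

impl ArcBlockId {
    fn raw(self) -> u32 { self.0 }
    fn index(self) -> usize { self.0 as usize }
}

fn main() {
    // Stand-in for the var_types parallel array.
    let var_types = vec!["int", "str", "[int]"];
    let v = ArcVarId(1);
    assert_eq!(var_types[v.index()], "str");
    // Passing an ArcBlockId where an ArcVarId is expected is a compile
    // error, which is exactly what the newtype discipline buys.
}
```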

ArcParam

A function parameter annotated with ownership:

| Field | Type | Purpose |
|---|---|---|
| var | ArcVarId | Variable ID for this parameter |
| ty | Idx | Type in the Pool |
| ownership | Ownership | Owned or Borrowed (refined by borrow inference) |

All parameters start as Owned during lowering and may be refined to Borrowed by borrow inference. This “start conservative, refine” approach ensures correctness — an Owned parameter that should be Borrowed wastes RC operations but is correct, while a Borrowed parameter that should be Owned would be a use-after-free.

Instruction Set

ARC IR has 14 instruction variants, organized by purpose:

flowchart TB
    subgraph Core["Core Operations"]
        Let["Let — bind value"]
        Apply["Apply — direct call"]
        ApplyInd["ApplyIndirect — closure call"]
        Partial["PartialApply — capture"]
        Project["Project — field extract"]
        Construct["Construct — build value"]
        Select["Select — conditional value"]
    end

    subgraph RC["Reference Counting"]
        RcInc["RcInc — increment"]
        RcDec["RcDec — decrement + drop"]
        IsShared["IsShared — test RC > 1"]
    end

    subgraph Mutation["In-Place Mutation"]
        Set["Set — field write"]
        SetTag["SetTag — enum tag write"]
    end

    subgraph Reuse["Allocation Reuse"]
        Reset["Reset — mark for reuse"]
        ReuseI["Reuse — construct with token"]
    end

    classDef native fill:#5c3a1e,stroke:#f59e0b,color:#fef3c7

    class Let,Apply,ApplyInd,Partial,Project,Construct,Select,RcInc,RcDec,IsShared,Set,SetTag,Reset,ReuseI native

Core Operations

| Instruction | Purpose |
|---|---|
| Let { dst, ty, value } | Bind an ArcValue (variable reference, literal, or primitive operation) to a new variable |
| Apply { dst, ty, func, args, arg_ownership } | Direct function call — func is a Name, resolved at link time |
| ApplyIndirect { dst, ty, closure, args } | Call through a closure’s fat pointer ({fn_ptr, env_ptr}) |
| PartialApply { dst, ty, func, args } | Capture arguments into a closure environment |
| Project { dst, ty, value, field } | Extract the field at index field from a struct, enum payload, or tuple |
| Construct { dst, ty, ctor, args } | Build a composite value — structs, enum variants, tuples, collections, closures |
| Select { dst, ty, cond, true_val, false_val } | Conditional value selection (maps directly to LLVM select) |

Value-producing instructions return Some(dst) from defined_var(). The ArcValue type used in Let has three variants: Var(ArcVarId) for referencing an existing variable, Literal(LitValue) for constants, and PrimOp { op, args } for primitive arithmetic and comparison operations.

The CtorKind enum on Construct distinguishes: Struct(Name), EnumVariant { enum_name, variant }, Tuple, ListLiteral, MapLiteral, SetLiteral, and Closure { func }.

Reference Counting Operations

| Instruction | Purpose |
|---|---|
| RcInc { var, count, strategy } | Increment reference count. count allows batched increments (e.g., a variable used in 3 places gets RcInc { count: 2 } — one use is “free”). strategy determines how to find the RC header. |
| RcDec { var, strategy } | Decrement reference count. If the count reaches zero, invoke the drop function to free the allocation. |
| IsShared { dst, var } | Test whether var’s reference count is greater than 1. Returns a boolean. Used as a guard before in-place mutation (COW) and reuse expansion. |

These instructions are absent from freshly lowered IR; they are inserted by the RC insertion pass (Perceus algorithm) based on liveness analysis and ownership information. The RcStrategy embedded in each instruction tells the LLVM emitter exactly how to emit the operation, with no further type queries needed.
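For instance, after RC insertion an owned value consumed at three call sites might look like this (invented textual syntax; consume is a hypothetical function taking an owned argument):

```
bb0(%xs):                                // %xs: [int], Owned
    rc_inc %xs, count=2, HeapPointer     // covers two extra uses;
                                         // the incoming ownership pays for the third
    %a = apply consume(%xs owned)
    %b = apply consume(%xs owned)
    %c = apply consume(%xs owned)
    return %c
```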

In-Place Mutation

| Instruction | Purpose |
|---|---|
| Set { base, field, value } | Write a value to a field of a struct or tuple. Only valid when the base is uniquely owned (guarded by IsShared). |
| SetTag { base, tag } | Update the discriminant tag of an enum. Only valid when uniquely owned. |

These instructions support COW (copy-on-write) semantics: before mutating, the compiler inserts an IsShared check. If the value is shared, the slow path copies it; if unique, Set/SetTag mutate in place.
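A COW field write might expand roughly as follows (invented textual syntax; the copy path is abbreviated):

```
bb0(%p, %v):
    %s = is_shared %p           // RC > 1?
    branch %s, bb_copy, bb_fast
bb_fast():
    set %p.0 = %v               // unique: mutate in place
    jump bb_join(%p)
bb_copy():
    %q = ...                    // clone %p, then write the field into the copy
    jump bb_join(%q)
bb_join(%r):
    return %r
```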

Allocation Reuse

| Instruction | Purpose |
|---|---|
| Reset { var, token } | Mark a value for potential in-place reuse. The token is an ArcVarId used to track the reuse opportunity. |
| Reuse { token, dst, ty, ctor, args } | Construct a new value using the memory from a Reset token. If the reset value was uniquely owned, this is a zero-allocation construction. |

Reset/Reuse are intermediate instructions — they are inserted by the reset/reuse detection pass and then expanded by the expansion pass into IsShared guards with fast-path (in-place) and slow-path (fresh allocation) branches. After expansion, no Reset or Reuse instructions remain in the IR.
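A before/after sketch of this expansion, again in invented textual syntax (the fast path's field decrements are elided for brevity):

```
// Before expansion:
    reset %old -> %tok
    %new = reuse %tok, Cons(%h, %t)

// After expansion (no Reset/Reuse remains):
    %s = is_shared %old
    branch %s, bb_slow, bb_fast
bb_fast():                      // unique: write into %old's memory
    set_tag %old = Cons
    set %old.0 = %h
    set %old.1 = %t
    jump bb_join(%old)
bb_slow():                      // shared: release our reference, allocate fresh
    rc_dec %old
    %n = construct Cons(%h, %t)
    jump bb_join(%n)
bb_join(%new):
    ...
```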

Terminators

Every block ends with exactly one terminator:

| Terminator | Purpose |
|---|---|
| Return { value } | Return from the function |
| Jump { target, args } | Unconditional jump, passing values as block parameters |
| Branch { cond, then_block, else_block } | Conditional branch on a boolean |
| Switch { scrutinee, cases, default } | Multi-way branch on an integer discriminant (from pattern match compilation) |
| Invoke { dst, ty, func, args, arg_ownership, normal, unwind } | Function call with unwinding support — success continues to normal, panic transfers to unwind |
| Resume | Resume unwinding after cleanup (re-raise a caught panic) |
| Unreachable | Marks a block as provably unreachable (used after exhaustive match) |

Terminators provide used_vars(), uses_var(target), and substitute_var(old, new) for liveness analysis, reuse expansion, and RC identity propagation.
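A cut-down sketch of these helpers: only three of the seven terminators appear, and the bodies are assumptions based on the descriptions above, not the real source.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ArcVarId(u32);

enum ArcTerminator {
    Return { value: ArcVarId },
    Jump { target: u32, args: Vec<ArcVarId> },
    Branch { cond: ArcVarId, then_block: u32, else_block: u32 },
}

impl ArcTerminator {
    // Variables this terminator reads: the input to liveness analysis.
    fn used_vars(&self) -> Vec<ArcVarId> {
        match self {
            ArcTerminator::Return { value } => vec![*value],
            ArcTerminator::Jump { args, .. } => args.clone(),
            ArcTerminator::Branch { cond, .. } => vec![*cond],
        }
    }

    // Rewrite `old` to `new`, as reuse expansion and RC identity
    // propagation need to do.
    fn substitute_var(&mut self, old: ArcVarId, new: ArcVarId) {
        match self {
            ArcTerminator::Return { value } if *value == old => *value = new,
            ArcTerminator::Jump { args, .. } => {
                for a in args.iter_mut().filter(|a| **a == old) { *a = new; }
            }
            ArcTerminator::Branch { cond, .. } if *cond == old => *cond = new,
            _ => {}
        }
    }
}

fn main() {
    let mut t = ArcTerminator::Jump { target: 2, args: vec![ArcVarId(0), ArcVarId(1)] };
    t.substitute_var(ArcVarId(0), ArcVarId(7));
    assert_eq!(t.used_vars(), vec![ArcVarId(7), ArcVarId(1)]);
}
```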

Type Classification

The three-way ArcClass classification is the foundation of all RC decisions. Every downstream pass — borrow inference, RC insertion, reset/reuse, elimination — consults the classification to determine whether a variable needs RC operations at all.

flowchart TB
    Type["Type (Idx)"] --> Classify{"ArcClassifier"}

    Classify -->|"int, float, bool,
    char, byte, void,
    Duration, Size,
    Ordering"| Scalar["Scalar
    No RC needed"]

    Classify -->|"str, [T], {K:V},
    Set, Channel,
    (P) -> R"| Definite["DefiniteRef
    Always needs RC"]

    Classify -->|"Unresolved
    type variable"| Possible["PossibleRef
    Conservative fallback"]

    classDef native fill:#5c3a1e,stroke:#f59e0b,color:#fef3c7

    class Type,Classify,Scalar,Definite,Possible native

Classification is monomorphized — it operates on concrete types after type parameter substitution. PossibleRef should never appear after monomorphization; encountering it post-mono is a compiler bug.

Misclassification is catastrophic:

  • Classifying a DefiniteRef as Scalar means missing RC operations → use-after-free or memory leak
  • Classifying a Scalar as DefiniteRef means unnecessary RC operations → performance bug (but correct)

Transitive Classification

Compound types are classified by their children. A tuple (int, str) contains str (DefiniteRef), so the tuple itself is DefiniteRef — it must participate in RC because decrementing it must propagate to its str field. A tuple (int, bool) is Scalar — no fields need RC, so the tuple doesn’t either.

The rule is conservative: if any child is DefiniteRef, the compound is DefiniteRef. If any child is PossibleRef (and none is DefiniteRef), the compound is PossibleRef. Only if all children are Scalar is the compound Scalar.
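This join is a maximum over a totally ordered three-element lattice. A minimal sketch, where encoding the order via derived Ord is an assumption about one possible implementation:

```rust
// Declaration order gives Scalar < PossibleRef < DefiniteRef under
// the derived Ord, matching the conservative join in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ArcClass { Scalar, PossibleRef, DefiniteRef }

// A compound's class is the max of its children's classes.
fn classify_compound(children: &[ArcClass]) -> ArcClass {
    children.iter().copied().max().unwrap_or(ArcClass::Scalar)
}

fn main() {
    use ArcClass::*;
    assert_eq!(classify_compound(&[Scalar, DefiniteRef]), DefiniteRef); // (int, str)
    assert_eq!(classify_compound(&[Scalar, Scalar]), Scalar);           // (int, bool)
    assert_eq!(classify_compound(&[Scalar, PossibleRef]), PossibleRef);
}
```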

Recursive types are detected via a cycle-detection set — if classification encounters an Idx already being classified, the type requires heap indirection and is DefiniteRef.

ArcClassifier

The ArcClassifier wraps a Pool reference with a memoization cache and cycle-detection set. It provides a fast path for pre-interned primitives (Pool indices 0-11): int through ordering are classified by raw index without hash map lookup, producing instant Scalar classification. str (index 11) is the exception — it classifies as DefiniteRef.

Iterators and DoubleEndedIterator are classified as Scalar despite being heap-allocated, because they are runtime-managed (allocated with Box::new, not ori_rc_alloc) and lack RC headers.

Value Representation

ValueRepr bridges the type-level ArcClass to machine-level layout. Computed once per variable by compute_var_reprs at the start of the pipeline, then stored in the var_reprs parallel array.

| Repr | Layout | Examples |
|---|---|---|
| Scalar | Register-width, no RC | int, float, bool, char, byte, void |
| RcPointer | Single heap pointer | [T], {K: V}, Set<T>, Channel<T> |
| Aggregate | Multi-field value | Tuples, structs, enums, Option<T>, Result<T, E> |
| FatValue | Two-word: pointer + metadata | str (ptr + len), closures (fn_ptr + env_ptr) |

The derivation depends on both ArcClass and the Pool tag: Scalar classes always produce Scalar repr. For ref-containing classes, Tag::Str and Tag::Function produce FatValue, Tag::Tuple/Struct/Enum/Result/Option produce Aggregate, and collection types produce RcPointer.
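The derivation rule reads naturally as a two-level match. In this sketch the Tag variant list is a cut-down stand-in for the real Pool tags; only the overall shape follows the text.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ArcClass { Scalar, PossibleRef, DefiniteRef }

// Stand-in subset of Pool tags, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tag { Str, Function, Tuple, Struct, Enum, List, Map, Set, Int, Bool }

#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueRepr { Scalar, RcPointer, Aggregate, FatValue }

fn value_repr(class: ArcClass, tag: Tag) -> ValueRepr {
    match class {
        // Scalar classes always get scalar repr, regardless of tag.
        ArcClass::Scalar => ValueRepr::Scalar,
        // Ref-containing classes dispatch on the Pool tag.
        _ => match tag {
            Tag::Str | Tag::Function => ValueRepr::FatValue,
            Tag::Tuple | Tag::Struct | Tag::Enum => ValueRepr::Aggregate,
            _ => ValueRepr::RcPointer, // collections and other heap types
        },
    }
}

fn main() {
    assert_eq!(value_repr(ArcClass::Scalar, Tag::Int), ValueRepr::Scalar);
    assert_eq!(value_repr(ArcClass::DefiniteRef, Tag::Str), ValueRepr::FatValue);
    assert_eq!(value_repr(ArcClass::DefiniteRef, Tag::List), ValueRepr::RcPointer);
}
```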

RC Strategy

RcStrategy refines ValueRepr specifically for RC operations — it encodes exactly how to find and manipulate the reference count for a given value shape. It is computed during RC insertion and embedded in the RcInc/RcDec instructions themselves, so the LLVM emitter pattern-matches directly without any Pool queries at emission time.

| Strategy | How It Works |
|---|---|
| HeapPointer | Single pointer — call ori_rc_inc(ptr) / ori_rc_dec(ptr, drop_fn) directly |
| FatPointer | Two-word value (e.g., str) — extract field 1 (data pointer), then inc/dec |
| Closure | Fat pointer with nullable env — extract env_ptr, null-check (bare functions have no env), then inc/dec |
| AggregateFields | Multi-field struct/tuple — traverse RC-containing fields and inc/dec each recursively |
| InlineEnum | Stack-allocated enum — tag-switch, then per-variant field handling |

The key design insight: RcStrategy is computed once and never recomputed. It is the final answer to “how do you manage RC for this variable?” — all downstream consumers (LLVM emitter, reuse expansion, drop generation) read it directly from the instruction.
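To make the "tight, stateless loop" concrete, here is a toy emitter skeleton with the actual LLVM emission replaced by descriptive strings. The RcStrategy variant names follow the table above; everything else is invented for illustration.

```rust
#[derive(Debug)]
enum RcStrategy { HeapPointer, FatPointer, Closure, AggregateFields, InlineEnum }

// The strategy is read straight off the instruction: no Pool lookups,
// no type queries, just a match.
fn emit_rc_inc(strategy: &RcStrategy, var: &str) -> String {
    match strategy {
        RcStrategy::HeapPointer => format!("call ori_rc_inc({var})"),
        RcStrategy::FatPointer => format!("call ori_rc_inc(extractvalue {var}, 1)"),
        RcStrategy::Closure => {
            format!("if env_ptr({var}) != null: call ori_rc_inc(env_ptr({var}))")
        }
        RcStrategy::AggregateFields => format!("for each RC field f of {var}: inc f"),
        RcStrategy::InlineEnum => format!("switch on tag of {var}: inc per-variant fields"),
    }
}

fn main() {
    assert_eq!(
        emit_rc_inc(&RcStrategy::HeapPointer, "%x"),
        "call ori_rc_inc(%x)"
    );
    println!("{}", emit_rc_inc(&RcStrategy::Closure, "%f"));
}
```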

ARC IR vs Canonical IR

| Property | Canonical IR | ARC IR |
|---|---|---|
| Control flow | Implicit (nested expressions) | Explicit basic blocks with terminators |
| Names | Scoped lexical names (Name) | SSA variables (ArcVarId) |
| Merge points | Expression nesting | Block parameters on Jump |
| Mutable variables | Rebinding in scope | Fresh ArcVarId per assignment; merge via block params |
| Function calls | Nested expression | Apply (direct), ApplyIndirect (closure), Invoke (may-unwind) |
| RC operations | None (implicit in value semantics) | Explicit RcInc/RcDec with strategy |
| Reuse | None | Reset/Reuse intermediates |
| Types | Per-expression via arena | Parallel var_types array indexed by ArcVarId |
| Ownership | None | Per-parameter Ownership, per-argument ArgOwnership |

Prior Art

Lean 4’s LCNF (Lean Compiler Normal Form) is the closest analog to Ori’s ARC IR. LCNF is a basic-block IR with explicit let bindings, function applications, projections, and constructors — essentially the same instruction set as ARC IR. Lean’s RC operations (inc, dec, reset, reuse) are also explicit instructions. The key structural difference is that Lean’s IR is expression-oriented with explicit join points, while Ori uses block parameters for phi-like merges.

Swift’s SIL (Swift Intermediate Language) is a similar basic-block IR with explicit strong_retain and strong_release instructions. SIL is more complex than ARC IR because it supports Swift’s full ownership model (borrowing, move-only types, coroutines), while ARC IR is focused specifically on RC optimization for value semantics. SIL’s type classification is implicit in its ownership rules, while ARC IR’s ArcClass is an explicit three-way enum.

Rust’s MIR (Mid-level IR) shares the basic-block structure but has no RC instructions — Rust uses ownership and drop glue instead. MIR uses Place and Rvalue instead of SSA variables, making it more suited to Rust’s borrow-checker semantics. ARC IR’s SSA-like structure is simpler for the dataflow analyses it needs.

Design Tradeoffs

Block parameters vs phi nodes. ARC IR uses block parameters on Jump terminators for merge points, rather than LLVM-style phi nodes at the start of blocks. Block parameters are easier to manipulate during ARC passes (adding a parameter doesn’t require updating phi nodes in other blocks) and map naturally to function arguments during lowering. The tradeoff is that some LLVM optimizations are designed around phi nodes, so the ARC→LLVM translation must convert between the two representations.

Embedded strategy vs late binding. The RcStrategy is computed during RC insertion and embedded in every RcInc/RcDec instruction. The alternative — computing strategy at emission time from the variable’s type — would keep the IR simpler but force the LLVM emitter to query the Pool for every RC operation. Embedding the strategy makes the emitter a tight, stateless loop over instructions, at the cost of a slightly larger IR.

Three-way classification vs two-way. The PossibleRef class exists for pre-monomorphization conservative analysis. A simpler two-way system (Scalar vs Ref) would eliminate the third case but require monomorphization to complete before any ARC analysis can begin. The three-way system allows ARC analysis to start earlier, though in practice monomorphization happens first and PossibleRef rarely appears.

Domain-specific IR vs reusing LLVM IR. ARC analysis could theoretically be done on LLVM IR directly, using LLVM’s own SSA variables and basic blocks. But LLVM IR lacks the semantic information that ARC analysis needs — constructors, projections, ownership annotations, reuse tokens. A domain-specific IR makes these concepts first-class, at the cost of an additional lowering step (CanExpr → ARC IR → LLVM IR instead of CanExpr → LLVM IR directly).