Appendix B: Memory Management

Memory management strategies used in the Ori compiler.

Stack Safety

Deeply nested expressions can cause stack overflow in recursive parsing, type-checking, and evaluation. The compiler uses the stacker crate to dynamically grow the stack when needed:

// src/stack.rs
const RED_ZONE: usize = 100 * 1024;     // 100KB minimum remaining
const STACK_PER_RECURSION: usize = 1024 * 1024;  // 1MB growth

#[inline]
pub fn ensure_sufficient_stack<R>(f: impl FnOnce() -> R) -> R {
    stacker::maybe_grow(RED_ZONE, STACK_PER_RECURSION, f)
}

Applied to recursive functions:

// Parser
pub fn parse_expr(&mut self) -> Result<ExprId, ParseError> {
    ensure_sufficient_stack(|| self.parse_expr_inner())
}

// Type checker
pub fn infer_expr(checker: &mut TypeChecker<'_>, expr_id: ExprId) -> Type {
    ensure_sufficient_stack(|| infer_expr_inner(checker, expr_id))
}

// Evaluator
pub fn eval(&mut self, expr_id: ExprId) -> EvalResult {
    ensure_sufficient_stack(|| self.eval_inner(expr_id))
}

This prevents stack overflow on deeply nested code like ((((((((((1)))))))))); on the common path the only cost is a cheap check of the remaining stack, and a new stack segment is allocated only when the red zone is reached.

Arena Allocation

Expressions use arena allocation:

pub struct ExprArena {
    exprs: Vec<Expr>,
}

impl ExprArena {
    pub fn alloc(&mut self, expr: Expr) -> ExprId {
        let id = ExprId(self.exprs.len() as u32);
        self.exprs.push(expr);
        id
    }
}

Benefits:

  • Contiguous memory (cache-friendly)
  • No individual deallocations
  • Simple lifetime management
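
Because children are referenced by ExprId (an index into the arena) rather than by owning pointers, every node is a small fixed-size value and the whole tree sits in one contiguous Vec. A sketch, using a hypothetical Expr shape purely for illustration (the real enum has different variants):

// Hypothetical Expr shape, for illustration only.
pub enum Expr {
    Int(i64),
    Add { lhs: ExprId, rhs: ExprId },
}

// Allocating `1 + 2` fills three adjacent slots in the arena:
//   exprs[0] = Int(1), exprs[1] = Int(2), exprs[2] = Add { lhs: 0, rhs: 1 }
fn build_add(arena: &mut ExprArena) -> ExprId {
    let lhs = arena.alloc(Expr::Int(1));
    let rhs = arena.alloc(Expr::Int(2));
    arena.alloc(Expr::Add { lhs, rhs })
}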

String Interning

All identifiers are interned:

pub struct Interner {
    strings: Vec<String>,
    lookup: HashMap<String, Name>,
}

Memory savings:

  • “foo” appears 100 times → stored once
  • Name is 4 bytes vs String’s 24 bytes plus a separate heap buffer
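
A minimal intern method might look like the sketch below; Name is the 4-byte index type shown under Copy Types, and the resolve accessor is assumed for illustration (the real Interner may avoid the duplicate String allocation):

impl Interner {
    pub fn intern(&mut self, s: &str) -> Name {
        // Hand back the existing handle if the string was seen before.
        if let Some(&name) = self.lookup.get(s) {
            return name;
        }
        // Otherwise store the string once and mint a new 4-byte handle.
        let name = Name(self.strings.len() as u32);
        self.strings.push(s.to_string());
        self.lookup.insert(s.to_string(), name);
        name
    }

    pub fn resolve(&self, name: Name) -> &str {
        &self.strings[name.0 as usize]
    }
}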

Arc for Shared Values

Runtime values use Arc for sharing:

pub enum Value {
    String(Arc<String>),
    List(Arc<Vec<Value>>),
    // ...
}

Why Arc:

  • Closures capture environment by cloning
  • Multiple references to same list
  • Safe concurrent access
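
Cloning such a value copies a pointer and bumps a reference count; the underlying buffer is never deep-copied. A small illustration, assuming Value derives Clone and has an Int variant (both illustrative):

use std::sync::Arc;

fn share_example() {
    let list = Value::List(Arc::new(vec![Value::Int(1), Value::Int(2)]));

    // Capturing the value in a closure environment clones the Arc, not the Vec.
    let captured = list.clone();

    if let (Value::List(a), Value::List(b)) = (&list, &captured) {
        assert!(Arc::ptr_eq(a, b)); // both handles point at the same allocation
    }
}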

SharedRegistry vs SharedMutableRegistry

The compiler uses two registry patterns:

SharedRegistry (Immutable)

For registries that are built completely before use:

pub struct SharedRegistry<T>(Arc<T>);

impl<T> SharedRegistry<T> {
    pub fn new(registry: T) -> Self {
        SharedRegistry(Arc::new(registry))
    }
}

Use when:

  • Registry is fully populated before access
  • No modifications needed after construction
  • Salsa query compatibility required

SharedMutableRegistry (Interior Mutability)

For registries that need modification after dependent structures are built:

use parking_lot::{RwLock, RwLockReadGuard, RwLockWriteGuard};

pub struct SharedMutableRegistry<T>(Arc<RwLock<T>>);

impl<T> SharedMutableRegistry<T> {
    pub fn new(registry: T) -> Self {
        SharedMutableRegistry(Arc::new(RwLock::new(registry)))
    }

    pub fn read(&self) -> RwLockReadGuard<'_, T> {
        self.0.read()
    }

    pub fn write(&self) -> RwLockWriteGuard<'_, T> {
        self.0.write()
    }
}

Use when:

  • Need to add entries after construction
  • Dependent structures (like cached dispatchers) must see updates
  • Acceptable trade-off: RwLock overhead vs rebuilding cost

Example: Method Dispatch Caching

The MethodDispatcher is cached in the Evaluator to avoid rebuilding the resolver chain on every method call. However, load_module() registers new methods after the Evaluator is constructed. Wrapping the registry in SharedMutableRegistry<UserMethodRegistry> lets the cached dispatcher see those later registrations:

// In EvaluatorBuilder::build():
let user_method_registry = SharedMutableRegistry::new(UserMethodRegistry::new());
let method_dispatcher = MethodDispatcher::new(vec![
    Box::new(UserMethodResolver::new(user_method_registry.clone())),
    // ... other resolvers
]);

// In load_module():
self.user_method_registry.write().merge(new_methods);

// In method resolution:
if let Some(method) = self.registry.read().lookup(type_name, method_name) { ... }

This avoids rebuilding the dispatcher (four boxed resolvers) on every method call while still allowing dynamic method registration.

Heap Wrapper

Wraps Arc in a newtype so heap allocation goes through one consistent API:

pub struct Heap<T>(Arc<T>);

impl<T> Heap<T> {
    pub fn new(value: T) -> Self {
        Self(Arc::new(value))
    }
}

Prevents:

  • Accidental bare Arc creation
  • Inconsistent allocation patterns
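
To stay transparent at use sites, the wrapper can forward access to the inner value and make cloning an explicit reference-count bump. A sketch (the real type may expose more):

use std::ops::Deref;
use std::sync::Arc;

impl<T> Deref for Heap<T> {
    type Target = T;

    fn deref(&self) -> &T {
        self.0.as_ref()
    }
}

impl<T> Clone for Heap<T> {
    fn clone(&self) -> Self {
        // Cloning a Heap<T> only increments the Arc's reference count.
        Heap(Arc::clone(&self.0))
    }
}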

Copy Types

Small types are Copy:

#[derive(Clone, Copy)]
pub struct ExprId(u32);

#[derive(Clone, Copy)]
pub struct Name(u32);

#[derive(Clone, Copy)]
pub struct Span { start: u32, end: u32 }

Benefits:

  • No heap allocation
  • Trivial to pass around
  • No lifetime complications
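
These handles are plain 4- or 8-byte values, which can be verified at compile time; a sketch:

use std::mem::size_of;

// Compile-time checks that the handle types stay register-sized.
const _: () = assert!(size_of::<ExprId>() == 4);
const _: () = assert!(size_of::<Name>() == 4);
const _: () = assert!(size_of::<Span>() == 8);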

MethodKey

Type-safe key for method registry lookups:

use std::fmt::{self, Display, Formatter};

#[derive(Clone, Eq, PartialEq, Hash, Debug)]
pub struct MethodKey {
    pub type_name: String,
    pub method_name: String,
}

impl MethodKey {
    pub fn new(type_name: impl Into<String>, method_name: impl Into<String>) -> Self {
        MethodKey {
            type_name: type_name.into(),
            method_name: method_name.into(),
        }
    }

    pub fn from_strs(type_name: &str, method_name: &str) -> Self {
        Self::new(type_name, method_name)
    }
}

impl Display for MethodKey {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(f, "{}::{}", self.type_name, self.method_name)
    }
}

Benefits:

  • Type-safe method lookups (vs tuple of strings)
  • Better error messages (Point::distance instead of ("Point", "distance"))
  • Hashable for use in registries
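
A usage sketch, with a hypothetical MethodDef value type and a plain HashMap standing in for the real registry:

use std::collections::HashMap;

// Placeholder value type, purely for illustration.
struct MethodDef;

fn lookup_example() {
    let mut registry: HashMap<MethodKey, MethodDef> = HashMap::new();
    registry.insert(MethodKey::from_strs("Point", "distance"), MethodDef);

    let key = MethodKey::from_strs("Point", "distance");
    assert!(registry.contains_key(&key));

    // Display renders the key as "Point::distance" in diagnostics.
    println!("resolved {key}");
}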

Token Storage

Tokens are stored in parallel arrays:

pub struct TokenList {
    kinds: Vec<TokenKind>,
    spans: Vec<Span>,
}

Better than Vec<Token> because:

  • TokenKind often accessed without span
  • Better memory locality for iteration
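
A sketch of how the parallel arrays are written and read, assuming TokenKind is Copy (the real API likely offers iterators as well):

impl TokenList {
    pub fn push(&mut self, kind: TokenKind, span: Span) {
        // Both vectors grow in lockstep; index i describes token i.
        self.kinds.push(kind);
        self.spans.push(span);
    }

    // The hot path reads only the kind and never touches the spans array.
    pub fn kind(&self, index: usize) -> TokenKind {
        self.kinds[index]
    }

    pub fn span(&self, index: usize) -> Span {
        self.spans[index]
    }
}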

Module Caching

Evaluated modules are cached:

pub struct ModuleCache {
    cache: HashMap<PathBuf, ModuleEvalResult>,
}

Prevents:

  • Re-evaluating the same module
  • Memory bloat from duplicates
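
A get-or-evaluate sketch, assuming ModuleEvalResult is cheap to clone (for example, Arc-backed); the real cache API may differ:

use std::path::Path;

impl ModuleCache {
    pub fn get_or_insert_with(
        &mut self,
        path: &Path,
        eval: impl FnOnce() -> ModuleEvalResult,
    ) -> ModuleEvalResult {
        // Cache hit: the module was already evaluated.
        if let Some(cached) = self.cache.get(path) {
            return cached.clone();
        }
        // Cache miss: evaluate once and remember the result.
        let result = eval();
        self.cache.insert(path.to_path_buf(), result.clone());
        result
    }
}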

Scope Cleanup

Scopes are popped as soon as the bound body finishes evaluating, even when evaluation returns an error:

fn eval_let(&mut self, name: Name, value: ExprId, body: ExprId) -> Result<Value, EvalError> {
    let value = self.eval_expr(value)?;

    self.env.push_scope();
    self.env.bind(name, value);

    let result = self.eval_expr(body);

    self.env.pop_scope();  // Immediate cleanup
    result
}

Type Representation

Types avoid excessive boxing:

// Primitives are inline
Type::Int
Type::Bool

// Compound types box only where needed
Type::List(Box<Type>)  // One allocation
Type::Function { params: Vec<Type>, ret: Box<Type> }
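
Because recursion always goes through a Box or a Vec, a Type value itself stays the same small size no matter how deeply types nest; a sketch:

// A list of lists of ints costs exactly two heap allocations,
// one Box per level of nesting:
fn nested() -> Type {
    Type::List(Box::new(Type::List(Box::new(Type::Int))))
}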

Memory Profiling

To profile memory usage on large programs:

# Run with memory profiler
ORI_PROFILE_MEMORY=1 ori run large_file.ori

# Output
Arena: 1.2 MB (12,000 expressions)
Interner: 0.3 MB (5,000 strings)
Values: 2.1 MB
Total: 3.6 MB

Performance Annotations

The compiler uses Rust attributes to help the optimizer:

#[inline]

Used on small, frequently-called functions:

impl Span {
    #[inline]
    pub const fn new(start: u32, end: u32) -> Self { ... }

    #[inline]
    pub const fn len(&self) -> u32 { ... }
}

#[track_caller]

Used on panicking accessors so panic messages report the caller’s location:

impl ExprArena {
    #[inline]
    #[track_caller]
    pub fn get_expr(&self, id: ExprId) -> &Expr {
        &self.exprs[id.index()]
    }
}

#[cold]

Used on error factory functions to hint to the optimizer that they lie on rarely taken paths:

// eval/errors.rs - all 33 error factories marked #[cold]
#[cold]
pub fn division_by_zero() -> EvalError {
    EvalError::new("division by zero")
}

#[cold]
pub fn undefined_variable(name: &str) -> EvalError {
    EvalError::new(format!("undefined variable: {}", name))
}

Guidelines

Do

  • Use arena allocation for AST nodes
  • Intern all identifiers
  • Use Arc for shared heap values
  • Make small types Copy
  • Clean up scopes immediately
  • Mark error paths as #[cold]
  • Add #[track_caller] to panicking functions
  • Use ensure_sufficient_stack in recursive functions

Don’t

  • Box individual expressions
  • Store String in AST (use Name)
  • Clone large structures unnecessarily
  • Keep references to temporary values
  • Leak memory in error paths
  • Deeply recurse without stack safety