Appendix B: Memory Management
This appendix describes the memory management strategies used in the Ori compiler.
Stack Safety
Deeply nested expressions can cause stack overflow in recursive parsing, type-checking, and evaluation. The compiler uses the stacker crate to dynamically grow the stack when needed:
// src/stack.rs
const RED_ZONE: usize = 100 * 1024; // 100KB minimum remaining
const STACK_PER_RECURSION: usize = 1024 * 1024; // 1MB growth

#[inline]
pub fn ensure_sufficient_stack<R>(f: impl FnOnce() -> R) -> R {
    stacker::maybe_grow(RED_ZONE, STACK_PER_RECURSION, f)
}
Applied to recursive functions:
// Parser
pub fn parse_expr(&mut self) -> Result<ExprId, ParseError> {
    ensure_sufficient_stack(|| self.parse_expr_inner())
}

// Type checker
pub fn infer_expr(checker: &mut TypeChecker<'_>, expr_id: ExprId) -> Type {
    ensure_sufficient_stack(|| infer_expr_inner(checker, expr_id))
}

// Evaluator
pub fn eval(&mut self, expr_id: ExprId) -> EvalResult {
    ensure_sufficient_stack(|| self.eval_inner(expr_id))
}
This prevents stack overflow on deeply nested code like ((((((((((1)))))))))) while adding minimal overhead to normal execution.
Arena Allocation
Expressions use arena allocation:
pub struct ExprArena {
    exprs: Vec<Expr>,
}

impl ExprArena {
    pub fn alloc(&mut self, expr: Expr) -> ExprId {
        let id = ExprId(self.exprs.len() as u32);
        self.exprs.push(expr);
        id
    }
}
Benefits:
- Contiguous memory (cache-friendly)
- No individual deallocations
- Simple lifetime management
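A minimal usage sketch: the Expr variants below are hypothetical, but alloc and the get_expr accessor shown later in this appendix are the surface used here.

// Hypothetical Expr shape, for illustration only.
pub enum Expr {
    Int(i64),
    Add(ExprId, ExprId),
}

fn build_sum(arena: &mut ExprArena) -> ExprId {
    let lhs = arena.alloc(Expr::Int(1));
    let rhs = arena.alloc(Expr::Int(2));
    // Children are stored as 4-byte ids into the arena, not as boxed pointers.
    arena.alloc(Expr::Add(lhs, rhs))
}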
String Interning
All identifiers are interned:
pub struct Interner {
    strings: Vec<String>,
    lookup: HashMap<String, Name>,
}
Memory savings:
- “foo” appears 100 times → stored once
- Name is 4 bytes vs String’s ~24 bytes
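A sketch of the lookup-or-insert path, assuming Name wraps a u32 index (as shown under Copy Types below); the actual method names may differ.

impl Interner {
    // Return the existing Name for this string, or append it and mint a new one.
    pub fn intern(&mut self, s: &str) -> Name {
        if let Some(&name) = self.lookup.get(s) {
            return name;
        }
        let name = Name(self.strings.len() as u32);
        self.strings.push(s.to_string());
        self.lookup.insert(s.to_string(), name);
        name
    }

    // Resolve a Name back to its text for diagnostics.
    pub fn resolve(&self, name: Name) -> &str {
        &self.strings[name.0 as usize]
    }
}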
Arc for Shared Values
Runtime values use Arc for sharing:
pub enum Value {
    String(Arc<String>),
    List(Arc<Vec<Value>>),
    // ...
}
Why Arc:
- Closures capture environment by cloning
- Multiple references to same list
- Safe concurrent access
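Cloning such a value copies a pointer and bumps a reference count; the underlying buffer is shared, never duplicated. A sketch, assuming Value derives Clone:

// A closure capturing a Value clones it; with Arc inside, that clone is
// a pointer copy plus a refcount bump, not a deep copy of the list or string.
fn make_thunk(v: Value) -> impl FnOnce() -> Value {
    let captured = v.clone();
    move || captured
}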
SharedRegistry vs SharedMutableRegistry
The compiler uses two registry patterns:
SharedRegistry (Immutable)
For registries that are built completely before use:
pub struct SharedRegistry<T>(Arc<T>);

impl<T> SharedRegistry<T> {
    pub fn new(registry: T) -> Self {
        SharedRegistry(Arc::new(registry))
    }
}
Use when:
- Registry is fully populated before access
- No modifications needed after construction
- Salsa query compatibility required
SharedMutableRegistry (Interior Mutability)
For registries that need modification after dependent structures are built:
pub struct SharedMutableRegistry<T>(Arc<parking_lot::RwLock<T>>);

impl<T> SharedMutableRegistry<T> {
    pub fn new(registry: T) -> Self {
        SharedMutableRegistry(Arc::new(parking_lot::RwLock::new(registry)))
    }

    pub fn read(&self) -> RwLockReadGuard<'_, T> {
        self.0.read()
    }

    pub fn write(&self) -> RwLockWriteGuard<'_, T> {
        self.0.write()
    }
}
Use when:
- Need to add entries after construction
- Dependent structures (like cached dispatchers) must see updates
- Acceptable trade-off: RwLock overhead vs rebuilding cost
Example: Method Dispatch Caching
The MethodDispatcher is cached in the Evaluator to avoid rebuilding the resolver chain on every method call. However, load_module() registers new methods after the Evaluator is constructed. Using SharedMutableRegistry<UserMethodRegistry>:
// In EvaluatorBuilder::build():
let user_method_registry = SharedMutableRegistry::new(UserMethodRegistry::new());
let method_dispatcher = MethodDispatcher::new(vec![
    Box::new(UserMethodResolver::new(user_method_registry.clone())),
    // ... other resolvers
]);

// In load_module():
self.user_method_registry.write().merge(new_methods);

// In method resolution:
if let Some(method) = self.registry.read().lookup(type_name, method_name) { ... }
This avoids 4 Box allocations per method call while still allowing dynamic method registration.
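For this to work, cloning the registry handle (user_method_registry.clone() above) must duplicate only the Arc, so the Evaluator and the resolver observe the same underlying registry. A sketch of that Clone impl:

impl<T> Clone for SharedMutableRegistry<T> {
    fn clone(&self) -> Self {
        // Clone the Arc, not the registry: both handles share one RwLock<T>.
        SharedMutableRegistry(Arc::clone(&self.0))
    }
}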
Heap Wrapper
Ensures consistent allocation:
pub struct Heap<T>(Arc<T>);

impl<T> Heap<T> {
    pub fn new(value: T) -> Self {
        Self(Arc::new(value))
    }
}
Prevents:
- Accidental bare Arc creation
- Inconsistent allocation patterns
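Reads typically go through Deref, so call sites treat a Heap<T> like a plain &T without touching the Arc directly. A sketch; the real wrapper may expose access differently:

use std::ops::Deref;

impl<T> Deref for Heap<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}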
Copy Types
Small types are Copy:
#[derive(Clone, Copy)]
pub struct ExprId(u32);
#[derive(Clone, Copy)]
pub struct Name(u32);
#[derive(Clone, Copy)]
pub struct Span { start: u32, end: u32 }
Benefits:
- No heap allocation
- Trivial to pass around
- No lifetime complications
MethodKey
Type-safe key for method registry lookups:
#[derive(Clone, Eq, PartialEq, Hash, Debug)]
pub struct MethodKey {
    pub type_name: String,
    pub method_name: String,
}

impl MethodKey {
    pub fn new(type_name: impl Into<String>, method_name: impl Into<String>) -> Self;
    pub fn from_strs(type_name: &str, method_name: &str) -> Self;
}

impl Display for MethodKey {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(f, "{}::{}", self.type_name, self.method_name)
    }
}
Benefits:
- Type-safe method lookups (vs tuple of strings)
- Better error messages (Point::distance instead of ("Point", "distance"))
- Hashable for use in registries
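A hypothetical registry lookup showing both constructors and the Display output (MethodId and the methods map are illustrative, not the compiler's actual types):

use std::collections::HashMap;

struct MethodId(u32); // placeholder payload for the example

fn lookup_example() {
    let mut methods: HashMap<MethodKey, MethodId> = HashMap::new();
    methods.insert(MethodKey::from_strs("Point", "distance"), MethodId(0));

    let key = MethodKey::new("Point", "distance");
    if methods.get(&key).is_none() {
        eprintln!("no method {key}"); // prints "no method Point::distance"
    }
}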
Token Storage
Tokens stored in parallel arrays:
pub struct TokenList {
    kinds: Vec<TokenKind>,
    spans: Vec<Span>,
}
Better than Vec<Token> because:
- TokenKind often accessed without span
- Better memory locality for iteration
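A sketch of the parallel-array API, assuming TokenKind is a small Copy enum; pushes keep both vectors in lock step, and kind-only queries never load spans.

impl TokenList {
    pub fn push(&mut self, kind: TokenKind, span: Span) {
        // Index i refers to the same token in both arrays.
        self.kinds.push(kind);
        self.spans.push(span);
    }

    pub fn kind(&self, i: usize) -> TokenKind {
        self.kinds[i] // hot path: no span is touched
    }

    pub fn span(&self, i: usize) -> Span {
        self.spans[i]
    }
}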
Module Caching
Evaluated modules are cached:
pub struct ModuleCache {
    cache: HashMap<PathBuf, ModuleEvalResult>,
}
Prevents:
- Re-evaluating same module
- Memory bloat from duplicates
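A sketch of the cache-or-evaluate entry point (the method name and exact signature are illustrative):

impl ModuleCache {
    // Evaluate each module at most once; later requests for the same path
    // return the cached result instead of re-running evaluation.
    pub fn get_or_eval(
        &mut self,
        path: PathBuf,
        eval: impl FnOnce() -> ModuleEvalResult,
    ) -> &ModuleEvalResult {
        self.cache.entry(path).or_insert_with(eval)
    }
}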
Scope Cleanup
Scopes are cleaned up immediately:
fn eval_let(&mut self, name: Name, value: ExprId, body: ExprId) -> Result<Value, EvalError> {
    let value = self.eval_expr(value)?;
    self.env.push_scope();
    self.env.bind(name, value);
    let result = self.eval_expr(body);
    self.env.pop_scope(); // Immediate cleanup
    result
}
Type Representation
Types avoid excessive boxing:
// Primitives are inline
Type::Int
Type::Bool
// Compound types box only where needed
Type::List(Box<Type>) // One allocation
Type::Function { params: Vec<Type>, ret: Box<Type> }
Memory Profiling
For large programs:
# Run with memory profiler
ORI_PROFILE_MEMORY=1 ori run large_file.ori
# Output
Arena: 1.2 MB (12,000 expressions)
Interner: 0.3 MB (5,000 strings)
Values: 2.1 MB
Total: 3.6 MB
Performance Annotations
The compiler uses Rust attributes to help the optimizer:
#[inline]
Used on small, frequently-called functions:
impl Span {
    #[inline]
    pub const fn new(start: u32, end: u32) -> Self { ... }

    #[inline]
    pub const fn len(&self) -> u32 { ... }
}
#[track_caller]
Used on panicking accessors for better error messages:
impl ExprArena {
    #[inline]
    #[track_caller]
    pub fn get_expr(&self, id: ExprId) -> &Expr {
        &self.exprs[id.index()]
    }
}
#[cold]
Used on error factory functions to hint they’re unlikely paths:
// eval/errors.rs - all 33 error factories marked #[cold]
#[cold]
pub fn division_by_zero() -> EvalError {
    EvalError::new("division by zero")
}

#[cold]
pub fn undefined_variable(name: &str) -> EvalError {
    EvalError::new(format!("undefined variable: {}", name))
}
Guidelines
Do
- Use arena allocation for AST nodes
- Intern all identifiers
- Use Arc for shared heap values
- Make small types Copy
- Clean up scopes immediately
- Mark error paths as #[cold]
- Add #[track_caller] to panicking functions
- Use ensure_sufficient_stack in recursive functions
Don’t
- Box individual expressions
- Store String in AST (use Name)
- Clone large structures unnecessarily
- Keep references to temporary values
- Leak memory in error paths
- Deeply recurse without stack safety