Overview
This documentation describes the architecture and design decisions of the Ori compiler.
Design Principle: Lean Core, Rich Libraries
The compiler implements only constructs that require special syntax or static analysis. Everything else belongs in the standard library.
| Location | What | Why |
|---|---|---|
| Compiler | run, try, match, recurse, parallel, spawn, timeout, cache, with | Require special syntax, bindings, self(), concurrency primitives, or capability checking |
| Stdlib | map, filter, fold, find, retry, validate | Standard method calls on collections; no special compiler support needed |
This keeps the compiler small (~30K lines), focused, and maintainable. The stdlib can evolve without compiler changes. When considering new features, ask: “Does this need special syntax or static analysis?” If no, it’s a library function.
Overview
The Ori compiler is a Rust-based incremental compiler built on the Salsa framework. It is organized as a multi-crate workspace with clear separation of concerns:
- ori_ir - Core IR types with no dependencies (AST, arena, interning, derives)
- ori_diagnostic - Error reporting system
- ori_lexer - Tokenization
- ori_types - Type system definitions
- ori_parse - Recursive descent parser
- ori_typeck - Type checking and inference
- ori_patterns - Pattern system, Value types, EvalError (single source of truth)
- ori_eval - Core evaluator components (Environment, operators)
- oric - CLI orchestrator, Salsa queries, evaluator, reporting
The compiler features:
- Incremental compilation via Salsa’s automatic caching and dependency tracking
- Flat AST representation using arena allocation for memory efficiency
- String interning for O(1) identifier comparison
- Extensible pattern system with registry-based pattern definitions
- Comprehensive diagnostics with code fixes and multiple output formats
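To make the flat AST and string interning points concrete, here is a minimal sketch of the two techniques. It is not the code in ori_ir: `ExprArena` is named in the pipeline diagram, but `Name`, `ExprId`, `Expr`, `Interner`, and `build_x_plus_one` are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

/// Handle to an interned string (hypothetical name).
/// Comparing two `Name`s is one integer comparison, regardless of string length.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Name(u32);

/// Deduplicating string table (hypothetical; not the ori_ir implementation).
#[derive(Default)]
pub struct Interner {
    lookup: HashMap<String, Name>,
    strings: Vec<String>,
}

impl Interner {
    /// Returns the existing handle for `s`, or stores `s` and returns a new one.
    pub fn intern(&mut self, s: &str) -> Name {
        if let Some(&name) = self.lookup.get(s) {
            return name;
        }
        let name = Name(self.strings.len() as u32);
        self.strings.push(s.to_owned());
        self.lookup.insert(s.to_owned(), name);
        name
    }

    /// Maps a handle back to its text.
    pub fn resolve(&self, name: Name) -> &str {
        &self.strings[name.0 as usize]
    }
}

/// Index into the arena; child expressions are referenced by id, not by `Box`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ExprId(u32);

/// A tiny expression type, assumed for this sketch only.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Int(i64),
    Ident(Name),
    Add(ExprId, ExprId),
}

/// All expressions live in one contiguous vector.
#[derive(Default)]
pub struct ExprArena {
    exprs: Vec<Expr>,
}

impl ExprArena {
    pub fn alloc(&mut self, expr: Expr) -> ExprId {
        let id = ExprId(self.exprs.len() as u32);
        self.exprs.push(expr);
        id
    }

    pub fn get(&self, id: ExprId) -> &Expr {
        &self.exprs[id.0 as usize]
    }
}

/// Builds the expression `x + 1` and returns the id of the `Add` node.
pub fn build_x_plus_one(interner: &mut Interner, arena: &mut ExprArena) -> ExprId {
    let x = interner.intern("x");
    let lhs = arena.alloc(Expr::Ident(x));
    let rhs = arena.alloc(Expr::Int(1));
    arena.alloc(Expr::Add(lhs, rhs))
}
```

Because nodes refer to each other by `ExprId`, the whole tree is a single allocation that is cheap to traverse and to drop, and identifiers compare by `Name` without touching the underlying strings.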
Statistics
| Component | Lines of Code | Purpose |
|---|---|---|
| IR | ~4,500 | AST types, arena, visitor, interning |
| Evaluator | ~5,500 | Tree-walking interpreter |
| Type System | ~4,300 | Type checking, inference, TypeContext |
| Parser | ~3,200 | Recursive descent parsing |
| Patterns | ~3,000 | Pattern system and builtins |
| Diagnostics | ~2,800 | Error reporting, DiagnosticQueue, fixes |
| Lexer | ~700 | DFA-based tokenization |
| Tests | ~1,100 | Test discovery, execution, error matching |
| Total | ~30,000 | |
Compilation Pipeline
flowchart TB
A["SourceFile (Salsa input)"] -->|"tokens() query"| B["TokenList"]
B -->|"parsed() query"| C["ParseResult { Module, ExprArena, errors }"]
C -->|"typed() query"| D["TypedModule { expr_types, errors }"]
D -->|"evaluated() query"| E["ModuleEvalResult { Value, EvalOutput }"]
D -.->|"codegen() query (pending)"| F["LLVM IR → Native Binary"]
style F stroke-dasharray: 5 5
Each step is a Salsa query with automatic caching. If the input doesn’t change, the cached output is returned immediately.
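The caching behaviour described above can be illustrated with a hand-rolled memoized query. This sketch deliberately does not use the Salsa API; it only mimics the idea of revision-based reuse. `Db`, `set_source`, and `lex_runs` are hypothetical names, and the whitespace "lexer" is a placeholder for the real tokens() query.

```rust
use std::collections::HashMap;

/// Output of the hypothetical `tokens` query.
#[derive(Clone, Debug, PartialEq)]
pub struct TokenList(pub Vec<String>);

/// Toy database: inputs carry a revision, and cached outputs remember
/// the revision they were computed from. (Assumption: not Salsa's design.)
#[derive(Default)]
pub struct Db {
    sources: HashMap<String, (u64, String)>,
    token_cache: HashMap<String, (u64, TokenList)>,
    /// Counts how often the lexer actually ran, to make cache hits observable.
    pub lex_runs: usize,
}

impl Db {
    /// Sets the input text; the revision only advances if the text changed.
    pub fn set_source(&mut self, path: &str, text: &str) {
        let entry = self
            .sources
            .entry(path.to_owned())
            .or_insert((0, String::new()));
        if entry.1 != text {
            entry.0 += 1;
            entry.1 = text.to_owned();
        }
    }

    /// Memoized query: recomputes only when the input revision has moved on.
    pub fn tokens(&mut self, path: &str) -> Option<TokenList> {
        let (rev, text) = self.sources.get(path)?;
        if let Some((cached_rev, cached)) = self.token_cache.get(path) {
            if cached_rev == rev {
                return Some(cached.clone());
            }
        }
        self.lex_runs += 1;
        // Placeholder lexer: split on whitespace.
        let tokens = TokenList(text.split_whitespace().map(str::to_owned).collect());
        self.token_cache
            .insert(path.to_owned(), (*rev, tokens.clone()));
        Some(tokens)
    }
}
```

Setting the same text twice leaves the revision untouched, so the second `tokens` call returns the cached `TokenList` without running the lexer. Salsa generalizes this idea by tracking dependencies between queries automatically.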
Documentation Sections
Architecture
- Architecture Overview - High-level compiler structure
- Compilation Pipeline - Query-based pipeline design
- Salsa Integration - Incremental compilation framework
- Data Flow - How data moves through the compiler
Intermediate Representation
- IR Overview - Data structures for compilation
- Flat AST - Arena-based expression storage
- Arena Allocation - Memory management strategy
- String Interning - Identifier deduplication
- Type Representation - Runtime type encoding
Lexer
- Lexer Overview - Tokenization design
- Token Design - Token types and structure
Parser
- Parser Overview - Parsing architecture
- Recursive Descent - Parsing approach
- Error Recovery - Handling syntax errors
- Grammar Modules - Module organization
Type System
- Type System Overview - Type checking architecture
- Type Inference - Hindley-Milner inference
- Unification - Type constraint solving
- Type Environment - Scope-based type tracking
- Type Registry - User-defined type storage
Pattern System
- Pattern System Overview - Pattern architecture
- Pattern Trait - PatternDefinition interface
- Pattern Registry - Pattern lookup system
- Pattern Fusion - Optimization passes
- Adding Patterns - How to add new patterns
Evaluator
- Evaluator Overview - Interpretation architecture
- Tree Walking - Execution strategy
- Environment - Variable scoping
- Value System - Runtime value representation
- Module Loading - Import resolution
Diagnostics
- Diagnostics Overview - Error reporting system
- Problem Types - Error categorization
- Code Fixes - Automatic fix suggestions
- Emitters - Output format handlers
Testing
- Testing Overview - Test system architecture
- Test Discovery - Finding test functions
- Test Runner - Parallel test execution
Appendices
- Salsa Patterns - Common Salsa usage patterns
- Memory Management - Allocation strategies
- Error Codes - Complete error code reference
- Debugging - Debug flags and tracing
- Coding Guidelines - Code style, testing, best practices
Source Paths
The compiler is organized as a multi-crate workspace:
| Crate | Path | Purpose |
|---|---|---|
| ori_ir | compiler/ori_ir/src/ | Core IR types (tokens, spans, AST, arena, interning, derives) |
| ori_diagnostic | compiler/ori_diagnostic/src/ | DiagnosticQueue, error reporting, suggestions, emitters |
| ori_lexer | compiler/ori_lexer/src/ | Tokenization via logos |
| ori_types | compiler/ori_types/src/ | Type, TypeError, TypeContext, InferenceContext |
| ori_parse | compiler/ori_parse/src/ | Recursive descent parser |
| ori_typeck | compiler/ori_typeck/src/ | Type checking, inference, BuiltinMethodRegistry |
| ori_patterns | compiler/ori_patterns/src/ | Pattern definitions, Value types, EvalError, EvalContext |
| ori_eval | compiler/ori_eval/src/ | Environment, OperatorRegistry (core eval components) |
| ori-macros | compiler/ori-macros/src/ | Diagnostic derive macros |
| oric | compiler/oric/src/ | CLI, Salsa queries, eval orchestration, reporting |
Note: oric modules (ir, parser, diagnostic, types) re-export from their source crates instead of duplicating definitions (DRY).
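As a rough sketch of the re-export pattern, the snippet below uses inline modules as stand-ins for the separate crates so that it compiles on its own. `Span` and its fields, and the helper `span_len`, are assumptions made for illustration; in the real workspace the re-export would presumably read `pub use ori_ir::*;` inside oric.

```rust
// Stand-in for the `ori_ir` crate (a separate crate in the real workspace).
mod ori_ir {
    /// Hypothetical span type; the field names are assumed for this sketch.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Span {
        pub start: u32,
        pub end: u32,
    }
}

// Stand-in for the `oric` crate.
mod oric {
    pub mod ir {
        // Re-export instead of redefining: `oric::ir::Span` and
        // `ori_ir::Span` are the same type, so there is one source of truth.
        pub use super::super::ori_ir::*;
    }
}

/// Code written against the `oric::ir` path accepts values built via `ori_ir`.
pub fn span_len(span: oric::ir::Span) -> u32 {
    span.end - span.start
}
```

Callers can keep importing from oric while the definitions live in exactly one crate.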
oric Internal Paths
| Component | Path |
|---|---|
| Library root | compiler/oric/src/lib.rs |
| Salsa database | compiler/oric/src/db.rs |
| Query system | compiler/oric/src/query/ |
| Evaluator | compiler/oric/src/eval/ |
| Problem types | compiler/oric/src/problem/ |
| Diagnostic rendering | compiler/oric/src/reporting/ |
| Tests | compiler/oric/src/test/ |
Architecture Notes
- Patterns: Pattern definitions and Value types are in ori_patterns. oric re-exports from this crate.
- Environment: The Environment type for variable scoping is in ori_eval. oric uses this directly.
- Re-exports: oric modules (ir, types, diagnostic) re-export from their source crates for DRY.