Proposal: AOT Compilation

Status: Approved
Author: Eric (with AI assistance)
Created: 2026-01-31
Approved: 2026-01-31
Affects: Compiler, tooling, CLI


Summary

This proposal formalizes the Ahead-of-Time (AOT) compilation pipeline for Ori, covering object file generation, optimization passes, linking, debug information, and target configuration.


Problem Statement

The LLVM backend (Phase 21A) currently supports only JIT compilation, which is used for testing and development. Production deployment requires:

  1. Native executables: Standalone binaries without runtime compilation
  2. Libraries: Shared/static libraries for FFI and interoperability
  3. Optimized output: Production-grade optimization passes
  4. Debug support: Source-level debugging with DWARF/CodeView
  5. Cross-compilation: Building for targets other than the host
  6. WASM output: WebAssembly modules for browser/Node.js

Current State

Feature               JIT (21A)   AOT (21B)
--------------------  ----------  -------------
LLVM IR generation    Working     Same
In-memory execution   Working     N/A
Object file output    Missing     Required
Linking               N/A         Required
Optimization          Limited     Full pipeline
Debug info            None        Required

Goals

  1. Generate native executables from Ori source
  2. Support multiple target platforms (Linux, macOS, Windows)
  3. Enable optimized release builds
  4. Provide debuggable development builds
  5. Support WebAssembly output
  6. Enable incremental compilation

Terminology

Term            Definition
--------------  ------------------------------------------------------------------
AOT             Ahead-of-Time compilation; generates machine code before execution
Object file     Intermediate compiled unit containing machine code and metadata
Linker          Tool that combines object files into executables or libraries
Target triple   Platform identifier (e.g., x86_64-unknown-linux-gnu)
Data layout     Memory representation specification for the target
DWARF           Debug information format for Unix-like systems
CodeView        Debug information format for Windows

Design

Compilation Pipeline

┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Source    │───▶│    Parse    │───▶│  Type Check │───▶│   LLVM IR   │
│   (.ori)    │    │    (AST)    │    │   (Types)   │    │  Generation │
└─────────────┘    └─────────────┘    └─────────────┘    └──────┬──────┘

                   ┌─────────────┐    ┌─────────────┐    ┌──────▼──────┐
                   │ Executable  │◀───│    Link     │◀───│   Object    │
                   │   / Lib     │    │             │    │    File     │
                   └─────────────┘    └─────────────┘    └─────────────┘

AOT-specific stages, in pipeline order:

  1. Debug info: Embed source locations and type information during IR generation
  2. Optimization: Run LLVM optimization passes on IR
  3. Object generation: Emit .o/.obj files from the optimized IR
  4. Linking: Combine objects with runtime library into final artifact

Target Configuration

Target Triple

The target is specified as a triple: <arch>-<vendor>-<os>[-<env>]

Component   Examples                       Description
----------  -----------------------------  ----------------
arch        x86_64, aarch64, wasm32        CPU architecture
vendor      unknown, apple, pc             Hardware vendor
os          linux, darwin, windows, wasi   Operating system
env         gnu, musl, msvc                ABI/environment

Supported targets (initial):

Target                      Description
--------------------------  ----------------------------
x86_64-unknown-linux-gnu    64-bit Linux (glibc)
x86_64-unknown-linux-musl   64-bit Linux (musl, static)
x86_64-apple-darwin         64-bit macOS (Intel)
aarch64-apple-darwin        64-bit macOS (Apple Silicon)
x86_64-pc-windows-msvc      64-bit Windows (MSVC)
x86_64-pc-windows-gnu       64-bit Windows (MinGW)
wasm32-unknown-unknown      WebAssembly (standalone)
wasm32-wasi                 WebAssembly (WASI)
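
The triple grammar and the supported-target list above can be checked with a small parser. The following is a minimal sketch, not the proposal's implementation: the Triple struct and parse_triple function are hypothetical names, and treating a two-part triple such as wasm32-wasi as "vendor omitted" is an assumption, since the stated grammar has at least three components.

```rust
/// Parsed form of `<arch>-<vendor>-<os>[-<env>]` (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct Triple {
    pub arch: String,
    pub vendor: String,
    pub os: String,
    pub env: Option<String>,
}

/// Initial supported targets, copied from the table in this section.
const SUPPORTED: &[&str] = &[
    "x86_64-unknown-linux-gnu",
    "x86_64-unknown-linux-musl",
    "x86_64-apple-darwin",
    "aarch64-apple-darwin",
    "x86_64-pc-windows-msvc",
    "x86_64-pc-windows-gnu",
    "wasm32-unknown-unknown",
    "wasm32-wasi",
];

/// Validate a triple against the supported list, then split it into parts.
pub fn parse_triple(s: &str) -> Result<Triple, String> {
    if !SUPPORTED.contains(&s) {
        return Err(format!("target '{s}' is not supported"));
    }
    let parts: Vec<&str> = s.split('-').collect();
    let (arch, vendor, os, env) = match parts.as_slice() {
        // Assumption: a two-part triple (wasm32-wasi) omits the vendor.
        [arch, os] => (*arch, "unknown", *os, None),
        [arch, vendor, os] => (*arch, *vendor, *os, None),
        [arch, vendor, os, env] => (*arch, *vendor, *os, Some(*env)),
        _ => return Err(format!("malformed target triple '{s}'")),
    };
    Ok(Triple {
        arch: arch.to_string(),
        vendor: vendor.to_string(),
        os: os.to_string(),
        env: env.map(str::to_string),
    })
}
```

An unsupported triple such as riscv64-unknown-linux-gnu is rejected before any splitting happens, which matches the E1202 diagnostic described under Error Handling.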

Data Layout

The LLVM data layout string specifies:

  • Endianness (e = little, E = big)
  • Pointer size and alignment
  • Type alignments
  • Stack alignment

Example for x86_64-unknown-linux-gnu:

e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128
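
To make the fields of a layout string concrete, the sketch below pulls out three of them: endianness (e or E), native integer widths (the n specification), and natural stack alignment (the S specification, in bits). It is illustrative only; LayoutSummary and summarize_layout are hypothetical names, and in practice the compiler would most likely obtain the layout from LLVM's target machine rather than parse it by hand.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Endian {
    Little,
    Big,
}

/// A few fields extracted from an LLVM data layout string (hypothetical type).
#[derive(Debug, Default)]
pub struct LayoutSummary {
    pub endian: Option<Endian>,
    pub stack_align_bits: Option<u32>,
    pub native_int_widths: Vec<u32>,
}

/// Walk the '-'-separated specifications and record the ones we understand.
pub fn summarize_layout(layout: &str) -> LayoutSummary {
    let mut out = LayoutSummary::default();
    for spec in layout.split('-') {
        if spec == "e" {
            out.endian = Some(Endian::Little);
        } else if spec == "E" {
            out.endian = Some(Endian::Big);
        } else if let Some(bits) = spec.strip_prefix('S') {
            // S<n>: natural stack alignment in bits.
            out.stack_align_bits = bits.parse().ok();
        } else if let Some(widths) = spec.strip_prefix('n') {
            // n<a>:<b>:...: integer widths the CPU handles natively.
            out.native_int_widths = widths.split(':').filter_map(|w| w.parse().ok()).collect();
        }
        // Other specifications (mangling, pointers, type alignments) are ignored here.
    }
    out
}
```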

CPU Features

Optional CPU-specific features can be enabled:

ori build --target=x86_64-unknown-linux-gnu --cpu=native
ori build --target=x86_64-unknown-linux-gnu --features=+avx2,+fma
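
The --features value is a comma-separated list in which each entry is prefixed with + (enable) or - (disable). A minimal sketch of splitting it, assuming entries without a prefix are rejected (the proposal does not say how they are handled; parse_features is a hypothetical helper):

```rust
/// Split a `--features` value into (name, enabled) pairs.
/// Assumption: every entry must start with '+' or '-'.
pub fn parse_features(spec: &str) -> Result<Vec<(String, bool)>, String> {
    spec.split(',')
        .filter(|entry| !entry.is_empty())
        .map(|entry| {
            if let Some(name) = entry.strip_prefix('+') {
                Ok((name.to_string(), true))
            } else if let Some(name) = entry.strip_prefix('-') {
                Ok((name.to_string(), false))
            } else {
                Err(format!("feature '{entry}' must start with '+' or '-'"))
            }
        })
        .collect()
}
```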

Optimization Levels

Level   Flag      Description                  Use Case
------  --------  ---------------------------  ---------------------------
O0      --opt=0   No optimization              Fastest compile, debugging
O1      --opt=1   Basic optimization           Development with some speed
O2      --opt=2   Standard optimization        Production default
O3      --opt=3   Aggressive optimization      Maximum performance
Os      --opt=s   Optimize for size            Embedded, WASM
Oz      --opt=z   Minimize size aggressively   Smallest binary

Default: --opt=0 for ori run and debug builds, --opt=2 for ori build --release

Release Mode Defaults

The --release flag sets multiple defaults:

Flag      Default (debug)   Default (--release)
--------  ----------------  -------------------
--opt     0                 2
--debug   2 (full)          0 (none)
--lto     off               off (opt-in)

LTO is not enabled by --release; use --lto=thin explicitly when desired.
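
The interaction between --release and the individual flags can be sketched as a small resolution step. This assumes an explicitly passed flag always overrides the default chosen by --release, which the word "defaults" implies but the proposal does not state outright. CliFlags and resolve are hypothetical names; the enums mirror those under Key Types.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OptLevel { O0, O1, O2, O3, Os, Oz }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DebugLevel { None, LineTablesOnly, Full }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LtoMode { Off, Thin, Full }

/// Flags as parsed from the command line; `None` means "not given".
#[derive(Debug, Default)]
pub struct CliFlags {
    pub release: bool,
    pub opt: Option<OptLevel>,
    pub debug: Option<DebugLevel>,
    pub lto: Option<LtoMode>,
}

/// Apply the debug/release defaults, letting explicit flags win.
pub fn resolve(flags: &CliFlags) -> (OptLevel, DebugLevel, LtoMode) {
    let (default_opt, default_debug) = if flags.release {
        (OptLevel::O2, DebugLevel::None)
    } else {
        (OptLevel::O0, DebugLevel::Full)
    };
    (
        flags.opt.unwrap_or(default_opt),
        flags.debug.unwrap_or(default_debug),
        // LTO stays off in both modes unless requested explicitly.
        flags.lto.unwrap_or(LtoMode::Off),
    )
}
```

Under this sketch, ori build --release --lto=thin resolves to (O2, None, Thin), and ori build --release --debug=1 keeps O2 while restoring line tables.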

Optimization Passes

The optimization pipeline follows LLVM’s standard pass manager:

O1 passes:

  • Early CSE (common subexpression elimination)
  • Simplify CFG
  • Instruction combining
  • Reassociate
  • Dead code elimination

O2 passes (adds):

  • Loop invariant code motion
  • Global value numbering
  • Aggressive dead code elimination
  • Inline small functions
  • Loop unrolling (limited)

O3 passes (adds):

  • Aggressive inlining
  • Loop vectorization
  • SLP vectorization
  • Full unrolling
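
One way to select these pipelines is through the textual pipeline names that LLVM's new pass manager accepts, for example opt -passes='default<O2>'. The sketch below maps the --opt flag to such a name. Whether Ori drives the pass manager through this string interface is an assumption, the function names are hypothetical, and the accepted level names should be checked against the LLVM version in use.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OptLevel { O0, O1, O2, O3, Os, Oz }

/// Parse the value of `--opt=LEVEL` (0, 1, 2, 3, s, z).
pub fn parse_opt_flag(value: &str) -> Option<OptLevel> {
    match value {
        "0" => Some(OptLevel::O0),
        "1" => Some(OptLevel::O1),
        "2" => Some(OptLevel::O2),
        "3" => Some(OptLevel::O3),
        "s" => Some(OptLevel::Os),
        "z" => Some(OptLevel::Oz),
        _ => None,
    }
}

/// Textual pipeline name in the style used by `opt -passes=...`.
pub fn pipeline_string(level: OptLevel) -> &'static str {
    match level {
        OptLevel::O0 => "default<O0>",
        OptLevel::O1 => "default<O1>",
        OptLevel::O2 => "default<O2>",
        OptLevel::O3 => "default<O3>",
        OptLevel::Os => "default<Os>",
        OptLevel::Oz => "default<Oz>",
    }
}
```

Using the predefined pipelines rather than hand-assembling the pass lists above keeps the pass ordering in step with LLVM's own defaults.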

Debug Information

Debug Levels

Level         Flag        Description
------------  ----------  ------------------------
None          --debug=0   No debug info
Line tables   --debug=1   Source locations only
Full          --debug=2   Variables, types, source

Default: --debug=2 for development, --debug=0 for release

Debug Format

Platform   Format           Notes
---------  ---------------  -------------
Linux      DWARF 4          Default
macOS      DWARF 4 + dSYM   Split debug
Windows    CodeView/PDB     MSVC standard
WASM       DWARF 4          Source maps

Source Map Generation

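As each expression is lowered, the compiler attaches a DILocation (line, column, scope) to the IR builder so that the generated instructions map back to source:

// Attach a DILocation to each expression before emitting its IR.
// `line` and `column` are u32 values taken from the expression's span;
// `scope` is the enclosing DIScope (function or lexical block).
let loc = di_builder.create_debug_location(line, column, scope);
builder.set_current_debug_location(loc);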

Object File Generation

Object Format

Platform   Format   Extension
---------  -------  ---------
Linux      ELF      .o
macOS      Mach-O   .o
Windows    COFF     .obj
WASM       WASM     .wasm

Module-to-Object Mapping

Each Ori module produces one object file:

src/
├── main.ori       → build/obj/main.o
├── utils.ori      → build/obj/utils.o
└── http/
    ├── client.ori → build/obj/http/client.o
    └── server.ori → build/obj/http/server.o
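
The mapping shown in the tree can be expressed as a path transformation: strip the source root, re-root under the object directory, and swap the extension. A minimal sketch with hypothetical helper names; choosing the extension from the arch and os components of the triple is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Object file extension for a target, following the Object Format table.
pub fn object_extension(arch: &str, os: &str) -> &'static str {
    if arch.starts_with("wasm") {
        "wasm"
    } else if os == "windows" {
        "obj"
    } else {
        "o"
    }
}

/// Map a source file under `src_root` to its object file under `obj_root`.
/// Returns `None` if `source` does not live under `src_root`.
pub fn object_path(src_root: &Path, obj_root: &Path, source: &Path, ext: &str) -> Option<PathBuf> {
    let relative = source.strip_prefix(src_root).ok()?;
    Some(obj_root.join(relative).with_extension(ext))
}
```

Keeping the directory structure under build/obj means two modules with the same file name in different directories (for example http/client.ori and db/client.ori) never collide.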

Symbol Naming

Ori symbols are mangled for uniqueness:

Ori                        Mangled
-------------------------  ----------------------
@main                      _ori_main
@foo (x: int) -> int       _ori_foo_i
MyModule.@bar              _ori_MyModule_bar
impl Type.@method          _ori_Type_method
impl Type: Trait.@method   _ori_Trait_Type_method

Demangling: The ori demangle command converts mangled names back.
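
Most rows of the table follow one pattern: the _ori prefix followed by the path segments joined with underscores. The sketch below covers only that pattern. The parameter-type suffix seen in _ori_foo_i is not modeled, because the proposal does not spell out its encoding, and mangle is a hypothetical function name.

```rust
/// Join path segments into a mangled symbol: `_ori_<seg1>_<seg2>_...`.
/// For trait methods the segments are ordered trait, type, method,
/// matching the `_ori_Trait_Type_method` row of the table.
pub fn mangle(path: &[&str]) -> String {
    let mut out = String::from("_ori");
    for segment in path {
        out.push('_');
        out.push_str(segment);
    }
    out
}
```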

Linking

┌─────────────────────────────────────────────────────────────┐
│                      Final Executable                        │
├─────────────────────────────────────────────────────────────┤
│  User Object Files     │  Ori Runtime      │  System Libs   │
│  (main.o, utils.o)     │  (libori_rt.a)    │  (libc, libm)  │
└─────────────────────────────────────────────────────────────┘

Runtime Library

The Ori runtime (libori_rt) consolidates all runtime support functions:

Category             Functions
-------------------  --------------------------------------
Memory               ori_alloc, ori_free, ori_realloc
Reference counting   ori_rc_inc, ori_rc_dec, ori_rc_new
Strings              ori_str_concat, ori_str_from_int, etc.
Collections          ori_list_new, ori_map_new, etc.
Panic                ori_panic, ori_panic_handler
I/O                  ori_print, ori_stdin_read

Note: JIT mode (Phase 21A) uses the same runtime functions but links them dynamically at JIT compile time rather than statically embedding them in the executable. The runtime API is identical; only the linking mechanism differs.

Linking modes:

Mode      Flag             Description
--------  ---------------  -----------------------
Static    --link=static    Embed runtime (default)
Dynamic   --link=dynamic   Link to libori_rt.so

System Linker

The compiler invokes the system linker:

Platform   Linker                 Notes
---------  ---------------------  ----------------
Linux      ld or lld              Via cc driver
macOS      ld64                   Via clang driver
Windows    link.exe or lld-link   Via cl or direct

Linker selection:

ori build --linker=lld      # Use LLVM's LLD
ori build --linker=system   # Use system default

LTO modes:

Mode   Flag         Description
-----  -----------  --------------------
None   --lto=off    No LTO (default)
Thin   --lto=thin   Fast parallel LTO
Full   --lto=full   Maximum optimization
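
For the Unix case, where linking goes through the cc or clang driver, assembling the command line can be sketched as follows. This is an illustration under assumptions, not the proposal's linker driver: cc_link_args is a hypothetical function, runtime_dir stands for wherever libori_rt is installed, and Windows (link.exe) would need a different argument style. The flags used (-o, -L, -l, -fuse-ld=lld) are standard cc/clang driver options.

```rust
#[derive(Debug, Clone, Copy)]
pub enum LinkMode {
    Static,
    Dynamic,
}

/// Build the argument list passed to the `cc` driver.
pub fn cc_link_args(
    objects: &[&str],
    output: &str,
    runtime_dir: &str,
    mode: LinkMode,
    use_lld: bool,
    extra_libs: &[&str],
) -> Vec<String> {
    let mut args = vec!["-o".to_string(), output.to_string()];
    args.extend(objects.iter().map(|o| o.to_string()));
    if use_lld {
        // Ask the driver to use LLD instead of the system default linker.
        args.push("-fuse-ld=lld".to_string());
    }
    match mode {
        // Static: pass the runtime archive directly so it is embedded.
        LinkMode::Static => args.push(format!("{runtime_dir}/libori_rt.a")),
        // Dynamic: resolve libori_rt.so through the library search path.
        LinkMode::Dynamic => {
            args.push(format!("-L{runtime_dir}"));
            args.push("-lori_rt".to_string());
        }
    }
    // Libraries named by FFI `extern ... from "lib"` declarations.
    for lib in extra_libs {
        args.push(format!("-l{lib}"));
    }
    // libm; the cc driver adds libc on its own.
    args.push("-lm".to_string());
    args
}
```

Libraries are placed after the object files because traditional Unix linkers resolve symbols left to right.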

Output Artifacts

Build Outputs

Command               Output                   Description
--------------------  -----------------------  ------------------
ori build             ./build/debug/<name>     Debug executable
ori build --release   ./build/release/<name>   Release executable
ori build --lib       ./build/<name>.a         Static library
ori build --dylib     ./build/<name>.so        Shared library
ori build --wasm      ./build/<name>.wasm      WebAssembly module

Output Control

ori build -o myapp              # Custom output name
ori build --out-dir ./dist      # Custom output directory
ori build --emit=obj            # Stop at object files
ori build --emit=llvm-ir        # Emit LLVM IR
ori build --emit=asm            # Emit assembly

Incremental Compilation

Caching Strategy

build/
├── cache/
│   ├── main.ori.hash           # Source hash
│   ├── main.o                  # Cached object
│   └── deps/
│       └── main.deps           # Dependency list
└── release/
    └── myapp                   # Final binary

Recompilation triggers:

  1. Source file changed (hash mismatch)
  2. Dependency changed (transitive)
  3. Compiler flags changed
  4. Compiler version changed
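
The four triggers can be folded into a single fingerprint per module: if any input changes, the fingerprint changes and the cached object is stale. The sketch below is one possible shape, with hypothetical function names. It uses the standard library's DefaultHasher for brevity; because that hasher's algorithm is not guaranteed to stay the same across Rust releases, an on-disk cache would want a stable hash function instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Combine every recompilation trigger into one value.
/// `dep_fingerprints` holds the fingerprints of direct dependencies; since each
/// of those already covers its own dependencies, changes propagate transitively.
pub fn fingerprint(
    source: &[u8],
    dep_fingerprints: &[u64],
    flags: &[String],
    compiler_version: &str,
) -> u64 {
    let mut hasher = DefaultHasher::new();
    source.hash(&mut hasher);
    dep_fingerprints.hash(&mut hasher);
    flags.hash(&mut hasher);
    compiler_version.hash(&mut hasher);
    hasher.finish()
}

/// A module is rebuilt when there is no cached fingerprint or it differs.
pub fn needs_rebuild(cached: Option<u64>, current: u64) -> bool {
    cached != Some(current)
}
```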

Parallel Compilation

Independent modules compile in parallel:

ori build --jobs=8              # 8 parallel compilations
ori build --jobs=auto           # Use all cores (default)
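
A minimal sketch of how --jobs could drive a worker pool, using only the standard library: workers claim module indices from a shared counter until none remain. The names resolve_jobs and compile_parallel are hypothetical, the compile callback stands in for the real per-module pipeline, and dependency ordering between modules is not modeled.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// `--jobs=N` gives `Some(N)`; `--jobs=auto` gives `None` (use all cores).
pub fn resolve_jobs(requested: Option<usize>) -> usize {
    requested
        .unwrap_or_else(|| thread::available_parallelism().map(|n| n.get()).unwrap_or(1))
        .max(1)
}

/// Compile independent modules on up to `jobs` threads.
/// Results are returned in the same order as `modules`.
pub fn compile_parallel<F>(modules: &[String], jobs: usize, compile: F) -> Vec<Result<String, String>>
where
    F: Fn(&str) -> Result<String, String> + Sync,
{
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<Option<Result<String, String>>>> = Mutex::new(vec![None; modules.len()]);
    thread::scope(|scope| {
        for _ in 0..jobs.min(modules.len()).max(1) {
            scope.spawn(|| loop {
                // Claim the next unprocessed module index.
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= modules.len() {
                    break;
                }
                let outcome = compile(&modules[i]);
                results.lock().unwrap()[i] = Some(outcome);
            });
        }
    });
    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|r| r.expect("every index is claimed exactly once"))
        .collect()
}
```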

WebAssembly Backend

WASM Targets

Target                   Description       Use Case
-----------------------  ----------------  ---------------------
wasm32-unknown-unknown   Standalone WASM   Embedded, plugins
wasm32-wasi              WASI preview 2    CLI tools, servers
wasm32-emscripten        Emscripten        Browser with full API

JavaScript Interop

For browser targets, generate bindings:

ori build --wasm --js-bindings  # Generate .js glue code

Output:

build/
├── myapp.wasm
├── myapp.js         # JavaScript bindings
└── myapp.d.ts       # TypeScript declarations

WASM Optimizations

ori build --wasm --opt=z        # Smallest WASM
ori build --wasm --wasm-opt     # Run wasm-opt post-processor

Error Handling

Linker Errors

error[E1201]: linker failed
  --> linking myapp
   |
   = note: undefined reference to `external_function`
   = note: linker command: ld -o myapp main.o ...
   = help: ensure all external functions are available

Missing Target

error[E1202]: unsupported target
  --> --target=riscv64-unknown-linux-gnu
   |
   = note: target 'riscv64-unknown-linux-gnu' is not supported
   = note: supported targets: x86_64-unknown-linux-gnu, ...
   = help: run `ori targets` to list all supported targets

Object Generation Failed

error[E1203]: failed to generate object file
  --> src/main.ori
   |
   = note: LLVM error: <llvm message>
   = help: this may be a compiler bug; please report
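
The three diagnostics above could be carried by a single error type whose Display output follows the same layout. The sketch below is an assumption about shape only: AotError and its fields are hypothetical names, and only the error codes and header text come from this section.

```rust
use std::fmt;

#[derive(Debug)]
pub enum AotError {
    LinkerFailed { output: String, detail: String },
    UnsupportedTarget { target: String },
    ObjectEmission { source: String, llvm_message: String },
}

impl AotError {
    /// Error code as shown in the diagnostics above.
    pub fn code(&self) -> &'static str {
        match self {
            AotError::LinkerFailed { .. } => "E1201",
            AotError::UnsupportedTarget { .. } => "E1202",
            AotError::ObjectEmission { .. } => "E1203",
        }
    }

    /// One-line summary printed after the code.
    pub fn summary(&self) -> &'static str {
        match self {
            AotError::LinkerFailed { .. } => "linker failed",
            AotError::UnsupportedTarget { .. } => "unsupported target",
            AotError::ObjectEmission { .. } => "failed to generate object file",
        }
    }
}

impl fmt::Display for AotError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "error[{}]: {}", self.code(), self.summary())?;
        match self {
            AotError::LinkerFailed { output, detail } => {
                writeln!(f, "  --> linking {output}")?;
                write!(f, "   = note: {detail}")
            }
            AotError::UnsupportedTarget { target } => {
                writeln!(f, "  --> --target={target}")?;
                write!(f, "   = help: run `ori targets` to list all supported targets")
            }
            AotError::ObjectEmission { source, llvm_message } => {
                writeln!(f, "  --> {source}")?;
                write!(f, "   = note: LLVM error: {llvm_message}")
            }
        }
    }
}
```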

CLI Interface

New Commands

ori build

ori build [OPTIONS] [FILE]

Options:
    --release           Build with optimizations (O2, no debug)
    --target=TARGET     Target triple (default: native)
    --opt=LEVEL         Optimization level: 0, 1, 2, 3, s, z
    --debug=LEVEL       Debug info level: 0, 1, 2
    --lib               Build as static library
    --dylib             Build as shared library
    --wasm              Build for WebAssembly
    -o, --output=FILE   Output file name
    --out-dir=DIR       Output directory
    --emit=TYPE         Emit: obj, llvm-ir, llvm-bc, asm
    --linker=LINKER     Linker: system, lld
    --link=MODE         Link mode: static, dynamic
    --lto=MODE          LTO: off, thin, full
    --jobs=N            Parallel compilation jobs
    --cpu=CPU           Target CPU (e.g., native, skylake)
    --features=FEAT     CPU features (+avx2, -sse4)
    --js-bindings       Generate JavaScript bindings (WASM)
    --wasm-opt          Run wasm-opt post-processor
    -v, --verbose       Verbose output

ori targets

ori targets                     # List all supported targets
ori targets --installed         # List targets with sysroots installed

ori target (Cross-Compilation)

ori target add x86_64-unknown-linux-gnu     # Download/configure sysroot
ori target add aarch64-apple-darwin         # Add Apple Silicon target
ori target remove x86_64-pc-windows-msvc    # Remove sysroot
ori target list                             # List installed targets

Cross-compilation requires the target sysroot (headers, libraries). The ori target add command downloads and configures the sysroot for the specified target.

ori demangle

ori demangle _ori_MyModule_foo  # → MyModule.@foo

Modified Commands

ori run

Adds AOT mode for faster repeated runs:

ori run src/main.ori            # JIT (default, fast startup)
ori run --compile src/main.ori  # AOT (slower startup, faster run)

ori check

No changes; type checking is independent of codegen.


Implementation Architecture

New Crate Structure

compiler/
├── ori_llvm/
│   ├── src/
│   │   ├── aot/                    # New: AOT-specific code
│   │   │   ├── mod.rs
│   │   │   ├── object.rs           # Object file emission
│   │   │   ├── linker.rs           # Linker invocation
│   │   │   ├── target.rs           # Target configuration
│   │   │   ├── debug_info.rs       # DWARF/CodeView emission
│   │   │   └── passes.rs           # Optimization pass manager
│   │   ├── wasm/                   # New: WASM-specific
│   │   │   ├── mod.rs
│   │   │   ├── bindings.rs         # JS binding generation
│   │   │   └── wasi.rs             # WASI support
│   │   └── ... (existing JIT code)
│   └── Cargo.toml
└── oric/
    └── src/
        └── commands/
            ├── build.rs            # New: AOT build command
            └── ... (existing)

Key Types

/// Target configuration for AOT compilation
pub struct TargetConfig {
    pub triple: String,
    pub cpu: Option<String>,
    pub features: Vec<String>,
    pub data_layout: String,
}

/// Compilation options
pub struct CompileOptions {
    pub target: TargetConfig,
    pub opt_level: OptLevel,
    pub debug_level: DebugLevel,
    pub lto: LtoMode,
    pub emit: EmitKind,
}

/// Build output configuration
pub struct BuildConfig {
    pub output_type: OutputType,      // Executable, StaticLib, DynLib, Wasm
    pub output_path: PathBuf,
    pub link_mode: LinkMode,          // Static, Dynamic
    pub incremental: bool,
    pub parallel_jobs: usize,
}

pub enum OptLevel { O0, O1, O2, O3, Os, Oz }
pub enum DebugLevel { None, LineTablesOnly, Full }
pub enum LtoMode { Off, Thin, Full }
pub enum EmitKind { Object, LlvmIr, LlvmBc, Asm, Exe }
pub enum OutputType { Executable, StaticLib, DynLib, Wasm }

Compilation Flow

pub fn compile_aot(
    sources: &[PathBuf],
    options: &CompileOptions,
    build: &BuildConfig,
) -> Result<(), CompileError> {
    // 1. Parse and type-check (existing)
    let modules = parse_and_check(sources)?;

    // 2. Configure LLVM target
    let target = configure_target(&options.target)?;

    // 3. Generate LLVM IR (existing, with debug info)
    let llvm_modules = modules.iter()
        .map(|m| generate_ir(m, &target, options.debug_level))
        .collect::<Result<Vec<_>, _>>()?;

    // 4. Run optimization passes
    for module in &llvm_modules {
        run_passes(module, options.opt_level, options.lto)?;
    }

    // 5. Emit object files
    let objects = llvm_modules.iter()
        .map(|m| emit_object(m, &target, options.emit))
        .collect::<Result<Vec<_>, _>>()?;

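    // 6. Link, unless --emit asked for an intermediate artifact
    //    (obj, llvm-ir, llvm-bc, asm). OutputType has no Object variant,
    //    so the check is made against EmitKind.
    if matches!(options.emit, EmitKind::Exe) {
        link(&objects, &build.output_path, &options.target, build.link_mode)?;
    }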

    Ok(())
}

Migration from JIT

Shared Code

Most LLVM codegen is shared between JIT and AOT:

  • Type lowering
  • Expression compilation
  • Control flow
  • Pattern matching
  • Runtime function declarations

AOT-Specific Code

New code required for AOT:

  • Target machine creation
  • Object file emission
  • Debug info generation
  • Linker driver
  • Incremental caching

Testing Strategy

Test Type     JIT               AOT
------------  ----------------  ---------------
Unit tests    Primary           Verify parity
Spec tests    Both (parallel)   Both (parallel)
Performance   N/A               Benchmarks
Debug         N/A               Debugger tests

Interaction with Other Features

Conditional Compilation

#target() and #cfg() are evaluated at compile time:

#target(os: "linux")
@platform_name () -> str = "Linux"

#target(os: "windows")
@platform_name () -> str = "Windows"

Only the matching variant is compiled into the object file.

FFI

External functions resolve at link time:

extern "c" from "mylib" {
    @_native_call (x: int) -> int as "native_call"
}

Linker command includes -lmylib.

Capabilities

Capability resolution happens at compile time; no runtime impact on AOT.


Performance Considerations

Compile Time

Factor              Impact      Mitigation
------------------  ----------  ---------------------
LLVM optimization   High        Incremental, parallel
Debug info          Medium      Optional levels
LTO                 Very high   Thin LTO, optional
Linking             Medium      LLD, parallel

Expected compile times (100k LOC project):

Mode            Time
--------------  ----
Debug (O0)      ~10s
Release (O2)    ~30s
Release + LTO   ~60s

Runtime Performance

AOT-compiled code should match or exceed JIT performance:

  • Same LLVM optimization passes
  • No JIT compilation overhead at startup
  • Better cache locality (code in executable)

Spec Changes Required

None. AOT compilation is an implementation detail; the language semantics are unchanged.


Roadmap Changes Required

Update phase-21B-aot.md

Replace placeholder content with detailed implementation tasks:

  1. 21B.1: Target Configuration
    • Target triple parsing and validation
    • Data layout configuration
    • CPU feature detection

  2. 21B.2: Object File Emission
    • LLVM TargetMachine creation
    • Object file writing (ELF/Mach-O/COFF)
    • Symbol mangling

  3. 21B.3: Debug Information
    • DIBuilder integration
    • Source location tracking
    • Type debug info
    • DWARF/CodeView emission

  4. 21B.4: Optimization Pipeline
    • Pass manager configuration
    • Optimization levels (O0-O3, Os, Oz)
    • LTO support

  5. 21B.5: Linking
    • Linker driver (cc/clang/link.exe)
    • Runtime library (libori_rt)
    • System library detection

  6. 21B.6: Incremental Compilation
    • Source hashing
    • Dependency tracking
    • Cache management

  7. 21B.7: WebAssembly Backend
    • WASM target configuration
    • JavaScript binding generation
    • WASI support

  8. 21B.8: CLI Integration
    • ori build command
    • ori targets command
    • Flag parsing

Summary Table

Aspect                 Design Decision
---------------------  ---------------------------------
Object format          Platform-native (ELF/Mach-O/COFF)
Default optimization   O0 debug, O2 release
Default debug info     Full debug, none release
Linking                Static runtime by default
Linker                 System default, LLD optional
LTO                    Off by default, thin recommended
Incremental            Hash-based, parallel
WASM                   Standalone and WASI targets
Symbol mangling        _ori_<module>_<function>

  • Phase 21A (LLVM Backend): JIT implementation (prerequisite)
  • FFI Proposal: External function linking
  • Conditional Compilation: Target-specific code
  • WASM FFI Proposal: JavaScript interop (future)

Design Decisions

macOS Debug Symbols (dSYM)

Decision: Separate dSYM files by default on macOS.

Rationale:

  • Standard macOS practice; debuggers expect this
  • Smaller distributed binaries
  • Debug symbols can be archived separately

Cross-Compilation

Decision: Supported via ori target add <target> sysroot management.

The compiler supports cross-compilation when the target sysroot is installed. Use ori target add to download and configure sysroots for non-native targets.

Incremental Compilation Cache

Decision: Project-local cache in build/cache/.

Rationale:

  • Simpler cache invalidation logic
  • No cross-project cache coherency issues
  • Each project manages its own cache lifetime

LLVM Version

Decision: LLVM 21 or later required.

Rationale:

  • Best WASM support with Component Model preview
  • New pass manager (the default optimization pipeline since LLVM 13)
  • Improved debug info generation
  • No legacy compatibility burden
  • Current development uses LLVM 21

LTO Default

Decision: LTO is opt-in; --release does not imply LTO.

Rationale:

  • LTO significantly increases compile time
  • Users who need maximum optimization can explicitly add --lto=thin
  • Default release builds prioritize reasonable compile times