Section 01: Representation IR & Decision Framework
Context: Today, ori_llvm::codegen::type_info::store.rs maps Tag to TypeInfo (e.g., Tag::Int → TypeInfo::Int) in compute_type_info_inner(), and info.rs maps TypeInfo to LLVM types (e.g., TypeInfo::Int → i64) in storage_type(), with companion methods size(), alignment(), and is_trivial(). To support narrowing, we need a centralized decision document that multiple analysis passes can populate and codegen can read.
Reference implementations:
- Lean4 `src/Lean/Compiler/LCNF/Types.lean`: Phase-separated IR where semantic types and machine types are distinct data structures
- Zig `src/InternPool.zig`: Layout information interned alongside types; each type has pre-computed size/alignment
- Roc `crates/compiler/mono/src/layout/intern.rs`: `STLayoutInterner` maps type variables to concrete layouts after monomorphization
Depends on: Nothing — this is the foundation.
01.1 MachineRepr Enum & ReprPlan Data Structure
File(s): compiler/ori_repr/src/lib.rs (NEW crate), compiler/ori_repr/src/repr.rs
File layout (~1,130 production lines across 6 files, all under the 500-line limit):
| File | Contents | Est. Lines |
|---|---|---|
| lib.rs | Module declarations, pub use re-exports | ~30 |
| repr.rs | MachineRepr enum + sub-repr types (StructRepr, EnumRepr, etc.) | ~350 |
| plan.rs | ReprPlan struct + builder + query methods | ~300 |
| query.rs | Ergonomic query interface (int_width, is_trivial, escapes, rc_strategy) | ~150 |
| repr_attrs.rs | ReprAttribute enum + validation | ~100 |
| canonical.rs | canonical(tag) mapping for all Tag variants | ~200 |
| tests.rs | All tests (sibling to lib.rs; tests exempt from 500-line limit) | unlimited |
The MachineRepr enum captures the physical representation chosen for each type. It must be rich enough to express all optimizations in §02-§11 but simple enough that codegen can pattern-match exhaustively.
- Create new crate `ori_repr` with `Cargo.toml` entry
  - Dependencies: `ori_types` (for `Pool`, `Idx`, `Tag`), `ori_ir` (for `Name`, the interned function identifier), `rustc-hash` (workspace dep, for `FxHashMap`/`FxHashSet`)
  - Dependencies (added by later sections): `ori_arc` (for `ArcFunction`, `ArcVarId`; used by §03 range analysis and §08 escape analysis)
  - No dependency on `ori_llvm`: this is backend-independent
  - No dependency on `ori_eval`: this is evaluation-independent
  - Architecture: `ori_types` → `ori_arc` → `ori_repr` → `ori_llvm` (no cycle: `ori_repr` reads from `ori_arc` IR types but `ori_arc` does not depend on `ori_repr`)
  - Verified: `ori_types` has `Pool`, `Idx`, `Tag` in its pub API; `rustc-hash` is a workspace dep used by `ori_types`, `ori_arc`, and `ori_llvm`
  - Add `#![deny(unsafe_code)]` to `ori_repr/src/lib.rs` (pure analysis crate, same as `ori_ir`, `ori_types`, `ori_lexer`)
- Define `MachineRepr` enum:

      /// The physical representation of a type in generated code.
      /// Every Idx in the Pool maps to exactly one MachineRepr.
      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub enum MachineRepr {
          /// Fixed-width integer (narrowed from semantic i64)
          Int { width: IntWidth, signed: bool },
          /// Fixed-width float (narrowed from semantic f64)
          Float { width: FloatWidth },
          /// Boolean (always i1)
          Bool,
          /// Unicode scalar value (always i32, 0..=0x10FFFF)
          Char,
          /// 8-bit unsigned byte (always i8)
          Byte,
          /// Duration in nanoseconds (always i64)
          Duration,
          /// Memory size in bytes (always i64)
          Size,
          /// Comparison ordering (always i8: Less=0, Equal=1, Greater=2)
          Ordering,
          /// Unit (zero-sized in memory, i64(0) as value)
          Unit,
          /// Never (uninhabited)
          Never,
          /// Struct with optimized field layout
          Struct(StructRepr),
          /// Enum with optimized discriminant and payload
          Enum(EnumRepr),
          /// Tuple (treated as anonymous struct)
          Tuple(TupleRepr),
          /// Heap-allocated reference-counted value
          RcPointer(RcRepr),
          /// Fat pointer (ptr + metadata), used for str, [T], {K:V}, Set<T>
          FatPointer(FatRepr),
          /// Function pointer (fn ptr + optional env ptr)
          Closure(ClosureRepr),
          /// Range (always {i64 start, i64 end, i64 step, i64 inclusive})
          Range,
          /// Stack-promoted value (was heap, promoted by escape analysis)
          StackPromoted { inner: Box<MachineRepr>, original_rc: bool },
          /// Opaque pointer (iterator, channel; runtime-managed)
          OpaquePtr,
      }

      // NOTE: Box<MachineRepr> in StackPromoted, FatRepr::Collection,
      // RcRepr::inner, and ClosureRepr::ret causes heap allocation per type.
      // Acceptable: MachineRepr is computed once per type during ReprPlan
      // construction (not per-expression), the plan is immutable after
      // construction, and recursive types require indirection. If profiling
      // shows this matters, consider interning via MachineReprId indices.
      //
      // Add after implementation:
      // const _: () = assert!(std::mem::size_of::<MachineRepr>() <= 48);

      #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
      pub enum IntWidth { I8, I16, I32, I64 }

      #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
      pub enum FloatWidth { F32, F64 }

- Implement `canonical(tag: Tag, pool: &Pool, idx: Idx) -> MachineRepr` for ALL Tag variants (this is the most critical part of §01: it defines what "canonical" means for every Tag variant, ensuring the ReprPlan starts correct before any optimization runs):

  Primitives (0-11):

  | Tag | Canonical MachineRepr | LLVM Type | Notes |
  |---|---|---|---|
  | Int | Int { width: I64, signed: true } | i64 | |
  | Float | Float { width: F64 } | double | |
  | Bool | Bool | i1 | |
  | Str | FatPointer(FatRepr::Str) | {i64, i64, ptr} | len + cap + data |
  | Char | Char | i32 | Unicode scalar |
  | Byte | Byte | i8 | Unsigned |
  | Unit | Unit | i64 | LLVM void workaround |
  | Never | Never | i64 | LLVM void workaround |
  | Error | Panic/unreachable | n/a | Should never reach codegen |
  | Duration | Duration | i64 | Nanoseconds |
  | Size | Size | i64 | Bytes |
  | Ordering | Ordering | i8 | 0/1/2 |

  Simple containers (16-22):

  | Tag | Canonical MachineRepr | LLVM Type | Notes |
  |---|---|---|---|
  | List | FatPointer(FatRepr::Collection) | {i64, i64, ptr} | len + cap + data |
  | Option | Enum(...) | {i8, payload} | Recurse into inner |
  | Set | FatPointer(FatRepr::Collection) | {i64, i64, ptr} | len + cap + data |
  | Channel | OpaquePtr | ptr | Runtime-managed |
  | Range | Range | {i64, i64, i64, i64} | start/end/step/incl |
  | Iterator | OpaquePtr | ptr | Runtime-managed |
  | DoubleEndedIterator | OpaquePtr | ptr | Runtime-managed |

  Two-child containers (32-34):

  | Tag | Canonical MachineRepr | LLVM Type | Notes |
  |---|---|---|---|
  | Map | FatPointer(FatRepr::Collection) | {i64, i64, ptr} | len + cap + data |
  | Result | Enum(...) | {i8, max(ok,err)} | Recurse into ok/err |
  | Borrowed | Reserved (error if reached) | n/a | Future use |

  Complex types (48-51):

  | Tag | Canonical MachineRepr | Notes |
  |---|---|---|
  | Function | Closure(ClosureRepr) | fn ptr + optional env ptr |
  | Tuple | Tuple(TupleRepr) | Recurse into elements |
  | Struct | Struct(StructRepr) | Recurse into fields |
  | Enum | Enum(EnumRepr) | Recurse into variants |

  Named/resolved types (80-82):

  | Tag | Canonical MachineRepr | Notes |
  |---|---|---|
  | Named | pool.resolve_fully(idx) → recurse | Must resolve first; includes newtypes (type UserId = int) and FFI types (CPtr, c_int) |
  | Applied | pool.resolve_fully(idx) → recurse | Must resolve first |
  | Alias | pool.resolve_fully(idx) → recurse | Must resolve first |

  Newtype handling: `type UserId = int` uses `Tag::Named` in the Pool. `resolve_fully()` follows the Named→concrete chain, so `canonical()` transparently handles newtypes by recursing into the underlying type. The TypeRegistry stores `TypeKind::Newtype { underlying }` for semantic purposes (`.inner` access), but `canonical()` only needs the Pool-level resolution. No special case needed.

  FFI types: `CPtr`, `JsValue`, `c_int`, `c_char`, etc. are named types in the FFI prelude, not Pool primitives. They resolve via `Tag::Named` → concrete. `CPtr` resolves to an opaque pointer (`MachineRepr::OpaquePtr`). C numeric types resolve to their corresponding primitives. No special case needed.

  Type variables (96-98), MUST NOT reach canonical:

  | Tag | Behavior | Notes |
  |---|---|---|
  | Var | Follow link chain via pool.resolve_fully() | If unresolved → panic (typeck bug) |
  | BoundVar | Error: should be monomorphized | Typeck bug if reached |
  | RigidVar | Error: should be monomorphized | Typeck bug if reached |

  Scheme/Special (112, 240-255), MUST NOT reach canonical:

  | Tag | Behavior |
  |---|---|
  | Scheme | Error: should be instantiated |
  | Projection | Error: should be resolved |
  | ModuleNs | Error: not a value type |
  | Infer | Error: should be resolved |
  | SelfType | Error: should be resolved |

  Validation: The canonical mapping MUST produce the same LLVM types as the existing `compute_type_info_inner()` → `TypeInfo::storage_type()` pipeline. A dedicated test iterates all types in a test Pool and asserts `canonical(tag).to_llvm_type() == TypeInfo::storage_type()`.

- Define `FatRepr` to distinguish collection/string fat pointers:

      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub enum FatRepr {
          /// String: {i64 len, i64 cap, ptr data}
          Str,
          /// Collection ([T], {K:V}, Set<T>): {i64 len, i64 cap, ptr data}
          Collection { element_repr: Box<MachineRepr> },
      }

- Define `ClosureRepr`:

      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub struct ClosureRepr {
          /// Parameter representations
          pub params: Vec<MachineRepr>,
          /// Return representation
          pub ret: Box<MachineRepr>,
      }
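The scalar rows of the primitives table can be sketched as a single exhaustive match. This is a minimal illustration, not the planned `canonical()`: `Tag` and `MachineRepr` here are cut-down stand-ins that cover only the scalar primitives, the function name `canonical_primitive` is hypothetical, and it returns `Err` for `Tag::Error` instead of panicking so that the behavior is easy to test.

```rust
/// Cut-down stand-in for the real `Tag` (scalar primitives only; assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tag { Int, Float, Bool, Char, Byte, Unit, Never, Error, Duration, Size, Ordering }

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum IntWidth { I8, I16, I32, I64 }

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum FloatWidth { F32, F64 }

/// Cut-down stand-in for the real `MachineRepr` (no composite variants).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum MachineRepr {
    Int { width: IntWidth, signed: bool },
    Float { width: FloatWidth },
    Bool,
    Char,
    Byte,
    Duration,
    Size,
    Ordering,
    Unit,
    Never,
}

/// Canonical (un-narrowed) representation for scalar primitives.
/// Hypothetical helper: reports `Tag::Error` as `Err` rather than panicking.
pub fn canonical_primitive(tag: Tag) -> Result<MachineRepr, String> {
    Ok(match tag {
        Tag::Int => MachineRepr::Int { width: IntWidth::I64, signed: true },
        Tag::Float => MachineRepr::Float { width: FloatWidth::F64 },
        Tag::Bool => MachineRepr::Bool,
        Tag::Char => MachineRepr::Char,
        Tag::Byte => MachineRepr::Byte,
        Tag::Unit => MachineRepr::Unit,
        Tag::Never => MachineRepr::Never,
        Tag::Duration => MachineRepr::Duration,
        Tag::Size => MachineRepr::Size,
        Tag::Ordering => MachineRepr::Ordering,
        // Error types should never reach codegen.
        Tag::Error => return Err("Tag::Error must not reach canonical()".to_string()),
    })
}
```

Because the match has no wildcard arm, adding a new `Tag` variant forces a compile error here until its canonical representation is decided.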
Derive requirement: ALL sub-repr types (StructRepr, EnumRepr, TupleRepr, FieldRepr, EnumTag, VariantRepr, RcRepr, FatRepr, ClosureRepr) MUST derive Debug, Clone, PartialEq, Eq, Hash to match MachineRepr’s derives. Code blocks in this section include them explicitly.
- Define `TupleRepr`:

      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub struct TupleRepr {
          /// Element representations in optimized memory order
          pub elements: Vec<FieldRepr>,
          pub size: u32,
          pub align: u32,
          pub trivial: bool,
      }

- Define `StructRepr`:

      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub struct StructRepr {
          /// Fields in optimized memory order (may differ from declaration order)
          pub fields: Vec<FieldRepr>,
          /// Total size in bytes (including padding)
          pub size: u32,
          /// Alignment requirement
          pub align: u32,
          /// Whether all fields are trivial (no RC needed)
          pub trivial: bool,
      }

      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub struct FieldRepr {
          /// Original field index (declaration order)
          pub original_index: u32,
          /// Offset in bytes from struct start
          pub offset: u32,
          /// Machine representation of this field
          pub repr: MachineRepr,
      }

- Define `EnumRepr`:

      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub struct EnumRepr {
          /// Discriminant representation
          pub tag: EnumTag,
          /// Per-variant payload representations
          pub variants: Vec<VariantRepr>,
          /// Total size including tag and padding
          pub size: u32,
          pub align: u32,
      }

      #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
      pub enum EnumTag {
          /// Explicit tag field at offset 0
          Explicit { width: IntWidth },
          /// Niche: tag stored in invalid bit pattern of a field
          Niche { field_index: u32, niche_value: u64 },
          /// No tag needed (single inhabited variant, e.g. newtype)
          None,
      }

- Define `VariantRepr`:

      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub struct VariantRepr {
          /// Variant name (interned)
          pub name: Name,
          /// Field representations (empty for unit variants)
          pub fields: Vec<MachineRepr>,
          /// Size of this variant's payload (excluding tag)
          pub size: u32,
          /// Alignment of this variant's payload
          pub alignment: u32,
      }

      impl VariantRepr {
          /// Whether this variant is a pointer type (for tagged pointer optimization)
          pub fn is_pointer(&self) -> bool {
              self.fields.len() == 1
                  && matches!(
                      &self.fields[0],
                      MachineRepr::RcPointer(_) | MachineRepr::FatPointer(_) | MachineRepr::OpaquePtr
                  )
          }
      }

- Define `RcRepr`:

      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub struct RcRepr {
          /// Width of the reference count header
          pub rc_width: IntWidth,
          /// Whether RC operations are atomic
          pub atomic: bool,
          /// The inner data representation
          pub inner: Box<MachineRepr>,
          /// Whether this is stack-promotable (escape analysis)
          pub stack_promotable: bool,
      }
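To make the `offset`, `size`, and `align` fields of `StructRepr` and `FieldRepr` concrete, here is a minimal sketch of how a canonical (declaration-order) layout could be computed. The types `FieldLayout`, `LaidOutField`, and `Layout` and the function `layout_in_declaration_order` are hypothetical stand-ins; the sketch assumes every alignment is a nonzero power of two and takes per-field size/alignment as inputs rather than deriving them from `MachineRepr`.

```rust
/// Size and alignment of one field in bytes (hypothetical input type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FieldLayout {
    pub size: u32,
    pub align: u32,
}

/// Placement of one field, mirroring FieldRepr's index/offset pair.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LaidOutField {
    pub original_index: u32,
    pub offset: u32,
}

/// Result of laying out a struct, mirroring StructRepr's size/align.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Layout {
    pub fields: Vec<LaidOutField>,
    pub size: u32,
    pub align: u32,
}

/// Round `offset` up to the next multiple of `align` (align must be >= 1).
fn round_up(offset: u32, align: u32) -> u32 {
    (offset + align - 1) / align * align
}

/// Lay fields out in declaration order, inserting padding where needed.
pub fn layout_in_declaration_order(fields: &[FieldLayout]) -> Layout {
    let mut offset = 0u32;
    let mut max_align = 1u32;
    let mut placed = Vec::with_capacity(fields.len());
    for (i, field) in fields.iter().enumerate() {
        // Pad so the field starts at a multiple of its own alignment.
        offset = round_up(offset, field.align);
        placed.push(LaidOutField { original_index: i as u32, offset });
        offset += field.size;
        max_align = max_align.max(field.align);
    }
    // Tail padding: total size is a multiple of the struct alignment.
    Layout { fields: placed, size: round_up(offset, max_align), align: max_align }
}
```

For fields of (size, align) = (1, 1), (8, 8), (4, 4) this places them at offsets 0, 8, and 16 and reports a 24-byte struct with 8-byte alignment. Seven of those bytes are padding after the first field, which is the kind of waste a reordering pass would target.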
01.2 ReprDecision Tracking
File(s): compiler/ori_repr/src/plan.rs
Each narrowing decision should be recorded with its justification, so that:
- Debug output can explain why a type was narrowed
- Bugs can be traced to the specific analysis that made the decision
- Later passes can query upstream decisions
- Define `ReprDecision`:

      #[derive(Debug, Clone)]
      pub struct ReprDecision {
          /// Which analysis pass made this decision
          pub source: DecisionSource,
          /// The semantic type this applies to
          pub type_idx: Idx,
          /// The chosen machine representation
          pub repr: MachineRepr,
          /// Why this representation was chosen (for tracing)
          pub reason: DecisionReason,
      }

      #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
      pub enum DecisionSource {
          /// §02: Transitive triviality analysis
          Triviality,
          /// §03/§04: Value range → integer narrowing
          IntegerNarrowing,
          /// §03/§05: Precision analysis → float narrowing
          FloatNarrowing,
          /// §06: Struct field reordering
          StructLayout,
          /// §07: Enum niche/discriminant
          EnumRepr,
          /// §08: Escape analysis
          EscapeAnalysis,
          /// §09: ARC header compression
          ArcHeader,
          /// §10: Thread-local ARC
          ThreadLocal,
          /// §11: Collection specialization
          CollectionSpec,
          /// Default: canonical representation (no optimization)
          Canonical,
      }

      #[derive(Debug, Clone)]
      pub enum DecisionReason {
          /// Type is canonically this width (no narrowing applied)
          Canonical,
          /// Value range fits in narrower type
          RangeFits { range: ValueRange, min_width: IntWidth },
          /// All fields are trivial, no RC needed
          TransitivelyTrivial,
          /// Value never escapes function scope
          DoesNotEscape,
          /// Sharing bound is within RC width
          BoundedSharing { max_refs: u32 },
          /// Niche available in field
          NicheAvailable { field: u32, niche: u64 },
          /// Custom reason (for tracing)
          Custom(String),
      }

- Define `ReprPlan`, the central data structure:

      // FxHashMap from `rustc-hash` crate (workspace dep): `use rustc_hash::FxHashMap;`
      // Functions are identified by Name (from ori_ir), not FunctionId (ori_llvm-specific).
      pub struct ReprPlan {
          /// Per-type decisions (indexed by Pool Idx)
          decisions: FxHashMap<Idx, ReprDecision>,
          /// Per-type #repr attributes (only for structs/enums with explicit attrs)
          /// See §01.7 for ReprAttribute enum definition.
          repr_attrs: FxHashMap<Idx, ReprAttribute>,
          /// Per-function escape info (indexed by function Name)
          /// NOTE: EscapeInfo is defined in §08 (escape/mod.rs). This field is
          /// empty until §08 populates it. Initially use `type EscapeInfo = ();`
          /// as a placeholder, replaced when §08 is implemented.
          escape_info: FxHashMap<Name, EscapeInfo>,
          /// Audit trail: all decisions in order
          audit: Vec<ReprDecision>,
      }

- Implement builder pattern for populating ReprPlan:

      impl ReprPlan {
          pub fn new() -> Self { ... }

          /// Record a narrowing decision. Later decisions override earlier ones
          /// for the same type, but the audit trail preserves both.
          pub fn set_repr(&mut self, idx: Idx, decision: ReprDecision) { ... }

          /// Query the representation for a type
          pub fn get_repr(&self, idx: Idx) -> Option<&MachineRepr> { ... }

          /// Get the canonical (un-narrowed) representation for a type.
          /// Same signature as the canonical() mapping in §01.1: the pool and
          /// idx are needed to recurse into composite and named types.
          pub fn canonical(tag: Tag, pool: &Pool, idx: Idx) -> MachineRepr { ... }

          /// Dump the audit trail for debugging
          pub fn dump_audit(&self, pool: &Pool) -> String { ... }
      }
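The "later decisions override, audit trail preserves both" rule for `set_repr` can be sketched with a map plus an append-only vector. This is an illustrative reduction under stated assumptions: `Idx`, `Repr`, `Source`, `Decision`, and `Plan` are simplified stand-ins for the real types, `std::collections::HashMap` replaces `FxHashMap` to keep the example dependency-free, and `dump_audit` takes no `Pool`.

```rust
use std::collections::HashMap;

/// Stand-in for the Pool index type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Idx(pub u32);

/// Stand-in for MachineRepr (three integer widths only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Repr { I64, I32, I8 }

/// Stand-in for DecisionSource.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source { Canonical, IntegerNarrowing }

/// Stand-in for ReprDecision.
#[derive(Debug, Clone)]
pub struct Decision {
    pub source: Source,
    pub type_idx: Idx,
    pub repr: Repr,
}

/// Stand-in for ReprPlan: current decision per type plus full history.
#[derive(Default)]
pub struct Plan {
    decisions: HashMap<Idx, Decision>,
    audit: Vec<Decision>,
}

impl Plan {
    /// Record a decision: the map keeps only the latest one per type,
    /// while the audit vector keeps every decision in arrival order.
    pub fn set_repr(&mut self, idx: Idx, decision: Decision) {
        self.audit.push(decision.clone());
        self.decisions.insert(idx, decision);
    }

    /// Current representation for a type, if any pass has decided one.
    pub fn get_repr(&self, idx: Idx) -> Option<&Repr> {
        self.decisions.get(&idx).map(|d| &d.repr)
    }

    /// Number of decisions ever recorded, including overridden ones.
    pub fn audit_len(&self) -> usize {
        self.audit.len()
    }

    /// One line per recorded decision, oldest first.
    pub fn dump_audit(&self) -> String {
        self.audit
            .iter()
            .map(|d| format!("{:?} -> {:?} ({:?})", d.type_idx, d.repr, d.source))
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

Recording a canonical decision and then a narrowing decision for the same `Idx` leaves `get_repr` returning the narrowed value, while the audit still holds two entries, so debug output can show what the narrowing replaced.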
01.3 Pipeline Integration Point
File(s): compiler/ori_llvm/src/codegen/type_info/mod.rs (TypeLayoutResolver), compiler/ori_llvm/src/codegen/type_info/store.rs (TypeInfoStore — Tag→TypeInfo mapping), compiler/ori_llvm/src/codegen/function_compiler/mod.rs (FunctionCompiler), compiler/ori_llvm/src/evaluator/compile.rs (JIT entry point)
The ReprPlan must be computed AFTER type checking and BEFORE LLVM codegen. The codegen must consume ReprPlan instead of computing representations inline.
- Add `ori_repr` dependency to `ori_llvm/Cargo.toml`

- Create the ReprPlan computation entry point:

      // In ori_repr/src/lib.rs
      pub fn compute_repr_plan(pool: &Pool, functions: &[FunctionSig]) -> ReprPlan {
          let mut plan = ReprPlan::new();

          // Phase 1: Set canonical representations for all types
          populate_canonical(&mut plan, pool);

          // Phase 2: Triviality analysis (§02)
          analyze_triviality(&mut plan, pool);

          // Phase 3: Range analysis (§03) → Integer narrowing (§04)
          //          → Float narrowing (§05)
          // (added in later sections)

          // Phase 4: Struct layout (§06), Enum repr (§07)
          // (added in later sections)

          // Phase 5: Escape analysis (§08) → ARC header (§09)
          //          → Thread-local (§10)
          // (added in later sections)

          // Phase 6: Collection specialization (§11)
          // (added in later sections)

          plan
      }

- Modify `TypeLayoutResolver` in `ori_llvm` to accept `&ReprPlan`:
  - Currently: `TypeLayoutResolver::new(store, scx, interner)` where `store: &TypeInfoStore`, `scx: &SimpleCx`, `interner: Option<&StringInterner>` → reads `TypeInfo` from store (which reads `Tag` from `Pool`)
  - Target: `TypeLayoutResolver::new(store, scx, interner, repr_plan)` → reads `MachineRepr` from plan when available, falling back to `TypeInfo` for unoptimized types
  - Initially, `ReprPlan` returns canonical representations (zero behavioral change)

- Wire `ReprPlan` through the LLVM codegen entry points:
  - JIT path: `OwnedLLVMEvaluator::compile_module_with_tests()` (in `evaluator/compile.rs`) creates `ReprPlan`
  - AOT path: the AOT build pipeline creates `ReprPlan` before constructing `FunctionCompiler`
  - `ReprPlan` is passed to `FunctionCompiler::new()` (there is no `ModuleCompiler`; `FunctionCompiler` is the two-pass declare/define orchestrator)
  - `FunctionCompiler` passes it to `TypeLayoutResolver`
01.4 ReprPlan Query Interface
File(s): compiler/ori_repr/src/query.rs
Provide ergonomic query methods that later sections will use:
Phase boundary: ori_repr must NEVER import from ori_llvm or ori_eval. LLVM-specific convenience methods (e.g., llvm_int_type(plan, idx, ctx)) belong in ori_llvm as an extension trait (impl ReprPlanExt for ReprPlan), not in ori_repr.
- Width, triviality, escape, and RC queries:

      impl ReprPlan {
          /// Get the machine integer width for a type (defaults to I64)
          pub fn int_width(&self, idx: Idx) -> IntWidth { ... }

          // NOTE: LLVM-specific methods like `llvm_int_type(idx, ctx) -> IntType`
          // belong in ori_llvm (e.g., as an extension trait or helper), not in
          // ori_repr, since ori_repr must remain backend-independent.

          /// Is this type trivial (no RC needed)?
          pub fn is_trivial(&self, idx: Idx) -> bool { ... }

          /// Does this value escape its defining function?
          /// ArcVarId comes from ori_arc (dependency added with §08).
          pub fn escapes(&self, func: Name, var: ArcVarId) -> bool { ... }

          /// What RC strategy should be used for this allocation?
          pub fn rc_strategy(&self, idx: Idx) -> RcStrategy { ... }
      }

      #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
      pub enum RcStrategy {
          /// No RC needed (trivial or stack-promoted)
          None,
          /// Atomic RC with given header width
          Atomic { width: IntWidth },
          /// Non-atomic RC (thread-local proven)
          NonAtomic { width: IntWidth },
      }

- Tracing integration:

      // All ReprPlan queries emit tracing events at trace level
      impl ReprPlan {
          /// Returns an owned MachineRepr: the canonical fallback is computed on
          /// demand, so a reference to it could not outlive this call.
          pub fn get_repr_traced(&self, idx: Idx, pool: &Pool) -> MachineRepr {
              let repr = self
                  .get_repr(idx)
                  .cloned()
                  .unwrap_or_else(|| Self::canonical(pool.tag(idx), pool, idx));
              tracing::trace!(
                  type_tag = ?pool.tag(idx),
                  repr = ?repr,
                  "ReprPlan query"
              );
              repr
          }
      }
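As a rough sketch of how `int_width` and `rc_strategy` could derive their answers from a stored representation, the example below works on a reduced `Repr` enum. The enum, the free-function form, and the rule that an absent or non-integer representation yields `I64` are assumptions made for illustration; the real methods take an `Idx` and look the representation up in the plan.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum IntWidth { I8, I16, I32, I64 }

/// Reduced stand-in for MachineRepr (assumption: only what the queries need).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Repr {
    Int { width: IntWidth },
    Bool,
    Rc { rc_width: IntWidth, atomic: bool },
    StackPromoted,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum RcStrategy {
    /// No RC needed (trivial or stack-promoted)
    None,
    /// Atomic RC with given header width
    Atomic { width: IntWidth },
    /// Non-atomic RC (thread-local proven)
    NonAtomic { width: IntWidth },
}

/// Integer width for a looked-up representation; I64 when there is no
/// decision or the type is not an integer.
pub fn int_width(repr: Option<&Repr>) -> IntWidth {
    match repr {
        Some(Repr::Int { width }) => *width,
        _ => IntWidth::I64,
    }
}

/// RC strategy implied by a representation.
pub fn rc_strategy(repr: &Repr) -> RcStrategy {
    match repr {
        Repr::Rc { rc_width, atomic: true } => RcStrategy::Atomic { width: *rc_width },
        Repr::Rc { rc_width, atomic: false } => RcStrategy::NonAtomic { width: *rc_width },
        // Scalars and stack-promoted values carry no reference count.
        Repr::Int { .. } | Repr::Bool | Repr::StackPromoted => RcStrategy::None,
    }
}
```

Keeping these as pure functions over the representation means codegen never needs to know which analysis pass produced a decision, only what the decision was.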
01.5 Generic Type Handling
File(s): compiler/ori_repr/src/plan.rs, compiler/ori_repr/src/lib.rs
ReprPlan operates on monomorphized types only. Generic types (containing Var, BoundVar, RigidVar) cannot be mapped to concrete machine representations.
- Enforce monomorphization precondition:
  - `compute_repr_plan()` must be called AFTER monomorphization (all type variables resolved)
  - `canonical()` must assert/panic on `Tag::Var`, `Tag::BoundVar`, `Tag::RigidVar`, `Tag::Scheme`, `Tag::Infer`
  - For `Tag::Named`/`Tag::Applied`/`Tag::Alias`: always resolve via `pool.resolve_fully()` first; if resolution yields a type variable, it’s a monomorphization bug

- Handle `Option<T>` and `Result<T, E>` generically:
  - After monomorphization, `Option<int>` is a concrete type with `Tag::Option` and inner `Idx` pointing to `Tag::Int`
  - The `canonical()` function recurses: `Option<int>` → `Enum(EnumRepr { variants: [Some(Int{I64}), None] })`
  - This works because Pool interning deduplicates: `Option<int>` at two call sites shares the same `Idx`

- Monomorphization boundary:
  - Currently, Ori does NOT have an explicit monomorphization pass: the type checker infers concrete types, and the Pool stores them
  - The `pool.resolve_fully()` chain handles substitution transparently
  - ReprPlan must call `pool.resolve_fully(idx)` before computing canonical for ANY type to ensure all variables are resolved
  - If `resolve_fully()` still returns a variable → skip this type during plan population (it’s dead code or a typeck bug); `canonical()` itself still panics if it is called on such a type directly
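The resolve-then-check rule can be sketched over a toy pool. Everything here is a stand-in: the real `Pool` and the exact signature of `pool.resolve_fully()` are not shown in this section, so the `Entry` enum, the slice-based pool, and the function `concrete_or_skip` are hypothetical names used only to illustrate following a link chain and skipping types that stay unresolved.

```rust
/// Toy pool entry (hypothetical): a concrete type, a named link, or a
/// unification variable that may or may not be linked.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Entry {
    Int,
    Named(usize),
    Var(Option<usize>),
}

/// Follow Named and linked-Var chains until a non-link entry is reached.
/// The loop is bounded by the pool size so a malformed cycle terminates.
pub fn resolve_fully(pool: &[Entry], mut idx: usize) -> usize {
    for _ in 0..pool.len() {
        match pool[idx] {
            Entry::Named(next) | Entry::Var(Some(next)) => idx = next,
            _ => break,
        }
    }
    idx
}

/// Resolve first; return the concrete index, or None when the chain ends
/// in an unresolved variable (or never leaves a link cycle).
pub fn concrete_or_skip(pool: &[Entry], idx: usize) -> Option<usize> {
    let resolved = resolve_fully(pool, idx);
    match pool[resolved] {
        Entry::Var(_) | Entry::Named(_) => None,
        Entry::Int => Some(resolved),
    }
}
```

With a pool of `[Int, Named(0), Var(Some(1)), Var(None)]`, index 2 resolves through the variable and the named link to index 0, while index 3 is skipped because it never resolves.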
01.6 Salsa Integration Strategy
File(s): compiler/ori_repr/src/lib.rs, compiler/oric/src/commands/codegen_pipeline.rs
The ReprPlan must integrate with the existing Salsa-based compilation model.
- ReprPlan is NOT a Salsa tracked struct; it is computed imperatively:
  - Salsa works best for demand-driven, memoizable queries (parsing, type checking)
  - ReprPlan computation is a forward pass that mutates state across multiple analysis phases (triviality → range → narrowing → layout)
  - Making each phase a Salsa query would create artificial dependencies and complicate the multi-pass mutation pattern
  - Instead: compute ReprPlan once, pass it as `&ReprPlan` to codegen (same model as how `TypeInfoStore` works today)

- Invalidation model:
  - ReprPlan is invalidated when the Pool changes (new/modified types)
  - In the current compilation model, this means: recompute ReprPlan on every compilation
  - Future optimization: if Pool didn’t change (Salsa cache hit on type checking), reuse previous ReprPlan
  - This can be implemented as a Salsa query that takes Pool hash → ReprPlan, memoized by Pool identity

- JIT hot-reload compatibility:
  - JIT recompiles individual functions; the ReprPlan for unchanged functions is stable
  - When a function’s type signature changes, only that function’s entries need recomputation
  - For now: recompute entire ReprPlan per JIT invocation (same as TypeInfoStore today)
  - Future: incremental ReprPlan updates keyed by function-level Merkle hashes

- Thread safety:
  - ReprPlan is immutable after computation, so `&ReprPlan` is `Send + Sync`
  - No interior mutability needed (unlike TypeInfoStore which uses RefCell for lazy population)
  - All analysis passes write to a `&mut ReprPlan` during computation, then freeze it for codegen
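The thread-safety claim can be checked at compile time with a bound-only helper, and exercised at run time with scoped threads. This is a minimal sketch: `PlanSketch` is a hypothetical stand-in holding a plain `HashMap`, and the helper names are illustrative. The real `ReprPlan` would pass the same check as long as none of its fields use `RefCell`, `Rc`, or similar non-`Sync` types.

```rust
use std::collections::HashMap;

/// Stand-in for ReprPlan: plain owned data, no interior mutability.
pub struct PlanSketch {
    decisions: HashMap<u32, &'static str>,
}

impl PlanSketch {
    /// Build the plan up front; it is never mutated afterwards.
    pub fn with_entries(entries: &[(u32, &'static str)]) -> Self {
        Self { decisions: entries.iter().copied().collect() }
    }
}

/// Compiles only if `T` can be shared and sent across threads.
pub fn assert_send_sync<T: Send + Sync>() {}

/// Read the same frozen plan from two threads through a shared reference.
pub fn count_from_two_threads(plan: &PlanSketch) -> usize {
    std::thread::scope(|scope| {
        let first = scope.spawn(|| plan.decisions.len());
        let second = scope.spawn(|| plan.decisions.len());
        first.join().unwrap() + second.join().unwrap()
    })
}
```

Calling `assert_send_sync::<PlanSketch>()` anywhere in the crate turns an accidental `RefCell` or `Rc` field into a build failure instead of a latent data race.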
01.7 #repr Attribute Integration
File(s): compiler/ori_repr/src/repr_attrs.rs
The spec (Clause 26 — FFI) defines layout attributes that override the canonical representation:
- `#repr("c")`: C-compatible layout, no field reordering
- `#repr("packed")`: No padding, alignment = 1
- `#repr("transparent")`: Same layout as single field (newtypes)
- `#repr("aligned", N)`: Minimum N-byte alignment (power of two)
These must be threaded into ReprPlan to prevent optimizations from violating user intent.
- Define `ReprAttribute` enum:

      #[derive(Debug, Clone, PartialEq, Eq, Hash)]
      pub enum ReprAttribute {
          /// Default Ori layout: field reordering and narrowing permitted
          Default,
          /// C-compatible layout: declaration order, platform ABI alignment
          C,
          /// No padding: alignment = 1, may require unaligned loads
          Packed,
          /// Transparent: same layout as the single field
          Transparent,
          /// Minimum alignment (power of two), may combine with C
          Aligned(u32),
          /// C + Aligned combined (#repr("c") + #repr("aligned", N))
          CAligned(u32),
      }

- Store `ReprAttribute` per struct/enum in ReprPlan (already included in the `ReprPlan` struct definition in §01.2):

      /// Per-type #repr attributes (only for structs/enums with explicit attrs)
      repr_attrs: FxHashMap<Idx, ReprAttribute>,

- Gate optimization passes on `ReprAttribute`:
  - `ReprAttribute::C` → §06 field reordering DISABLED, §04 field narrowing DISABLED
  - `ReprAttribute::Packed` → §06 padding DISABLED, alignment = 1
  - `ReprAttribute::Transparent` → struct is erased to its single field’s MachineRepr
  - `ReprAttribute::Aligned(N)` → struct alignment ≥ N (overrides computed alignment)
  - `ReprAttribute::Default` → all optimizations permitted

- Parse `#repr` from the IR and populate during `populate_canonical()`:
  - The parser already stores `#repr` attributes on struct declarations
  - During canonical population, read the attribute and store in `repr_attrs`
  - Validate: `#repr("transparent")` requires exactly one non-ZST field
  - Validate: `#repr("aligned", N)` requires N is a power of two
  - Validate: `#repr("packed")` cannot combine with `#repr("aligned", N)` or `#repr("c")`
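The three validation rules can be sketched as one function that folds the parsed attributes into a single `ReprAttribute`. The input type `RawRepr`, the function name `validate`, the `non_zst_fields` parameter, and the string error type are hypothetical. How `transparent` combines with other attributes is not specified above, so the sketch returns `Transparent` whenever it is present and does not check other combinations.

```rust
/// One parsed #repr attribute as it might arrive from the IR (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RawRepr { C, Packed, Transparent, Aligned(u32) }

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ReprAttribute { Default, C, Packed, Transparent, Aligned(u32), CAligned(u32) }

/// Validate a type's #repr attributes and combine them into one value.
/// `non_zst_fields` is the number of fields that are not zero-sized.
pub fn validate(attrs: &[RawRepr], non_zst_fields: usize) -> Result<ReprAttribute, String> {
    let has_c = attrs.contains(&RawRepr::C);
    let has_packed = attrs.contains(&RawRepr::Packed);
    let has_transparent = attrs.contains(&RawRepr::Transparent);
    let aligned = attrs.iter().find_map(|a| match a {
        RawRepr::Aligned(n) => Some(*n),
        _ => None,
    });

    // Rule: aligned(N) requires a power of two (this also rejects 0).
    if let Some(n) = aligned {
        if !n.is_power_of_two() {
            return Err(format!("aligned({n}) is not a power of two"));
        }
    }
    // Rule: packed cannot combine with aligned or c.
    if has_packed && (has_c || aligned.is_some()) {
        return Err("packed cannot combine with c or aligned".to_string());
    }
    // Rule: transparent requires exactly one non-ZST field.
    if has_transparent {
        if non_zst_fields != 1 {
            return Err("transparent requires exactly one non-ZST field".to_string());
        }
        return Ok(ReprAttribute::Transparent);
    }

    Ok(match (has_c, has_packed, aligned) {
        (_, true, _) => ReprAttribute::Packed,
        (true, _, Some(n)) => ReprAttribute::CAligned(n),
        (true, _, None) => ReprAttribute::C,
        (false, _, Some(n)) => ReprAttribute::Aligned(n),
        (false, false, None) => ReprAttribute::Default,
    })
}
```

Running validation once during canonical population means later passes can match on a single `ReprAttribute` and never see an invalid combination.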
01.8 Migration Strategy: TypeInfoStore → ReprPlan
File(s): compiler/ori_llvm/src/codegen/type_info/store.rs, compiler/ori_llvm/src/codegen/type_info/info.rs
The existing TypeInfoStore and TypeInfo enum must coexist with ReprPlan during migration. The goal is gradual adoption, not a big-bang replacement.
- Phase A: Parallel operation (§01 scope):
  - `TypeLayoutResolver` accepts optional `&ReprPlan`
  - When `ReprPlan` is `Some`, consult it first; if no decision exists for a type, fall back to `TypeInfoStore`
  - When `ReprPlan` is `None` (e.g., in tests that don’t create one), use `TypeInfoStore` exclusively
  - This ensures zero behavioral change: ReprPlan returns canonical representations, which match TypeInfoStore exactly

- Phase B: Triviality unification (§02 scope):
  - `TypeInfoStore::is_trivial()` delegates to `ReprPlan::is_trivial()` when available
  - `TypeInfoStore::classify_trivial()` becomes dead code and is removed
  - `triviality_cache` and `classifying_trivial` fields removed from TypeInfoStore

- Phase C: Full migration (§06/§07 scope):
  - `TypeLayoutResolver::storage_type()` reads from `ReprPlan` for ALL types
  - `TypeInfoStore::compute_type_info_inner()` is no longer called from production code
  - `TypeInfo` enum is retained only as a compatibility adapter for tests that don’t use ReprPlan
  - Eventually, `TypeInfo` becomes `#[cfg(test)]` only

- Validation at each phase:
  - Phase A: `assert_eq!(repr_plan.canonical(tag).to_llvm_type(), type_info.storage_type())` for all types
  - Phase B: same assertion + `assert_eq!(repr_plan.is_trivial(idx), type_info_store.is_trivial(idx))`
  - Phase C: remove TypeInfoStore from production; tests use ReprPlan directly
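The Phase A lookup order (plan first, store as fallback, store only when no plan is supplied) can be sketched as follows. All types are hypothetical stand-ins: `LlvmTy` replaces real LLVM types, `PlanSketch` and `StoreSketch` replace `ReprPlan` and `TypeInfoStore`, and `Resolver` omits the `scx` and `interner` parameters of the real `TypeLayoutResolver::new()`.

```rust
use std::collections::HashMap;

/// Stand-in for an LLVM storage type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlvmTy { I64, I32, I1 }

/// Stand-in for ReprPlan: decisions keyed by type index.
pub struct PlanSketch {
    pub decisions: HashMap<u32, LlvmTy>,
}

/// Stand-in for TypeInfoStore: the existing source of storage types.
pub struct StoreSketch {
    pub types: HashMap<u32, LlvmTy>,
}

/// Stand-in for TypeLayoutResolver with an optional plan.
pub struct Resolver<'a> {
    store: &'a StoreSketch,
    plan: Option<&'a PlanSketch>,
}

impl<'a> Resolver<'a> {
    pub fn new(store: &'a StoreSketch, plan: Option<&'a PlanSketch>) -> Self {
        Self { store, plan }
    }

    /// Consult the plan first; fall back to the store when there is no
    /// plan or the plan holds no decision for this type.
    pub fn storage_type(&self, idx: u32) -> Option<LlvmTy> {
        self.plan
            .and_then(|plan| plan.decisions.get(&idx).copied())
            .or_else(|| self.store.types.get(&idx).copied())
    }
}
```

In Phase A the plan and the store would agree on every type; a test that deliberately gives them different answers for one index is a simple way to confirm which source the resolver actually consulted.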
01.9 Canonical Representation Tests
File(s): compiler/ori_repr/src/tests.rs (sibling to lib.rs — #[cfg(test)] mod tests; declaration in lib.rs, no inline test modules)
Canonical representations are the foundation — if they’re wrong, every optimization built on them is wrong.
- Primitive roundtrip test: For each of the 12 primitive Tags (Int, Float, Bool, Str, Char, Byte, Unit, Never, Duration, Size, Ordering, Error), verify `canonical()` produces the expected MachineRepr variant (for Error, the expected outcome is the panic/unreachable path from the primitives table).

- Composite type tests:
  - `Option<int>` → `Enum` with 2 variants, inner is `Int { I64, true }`
  - `Result<int, str>` → `Enum` with 2 variants
  - `(int, bool)` → `Tuple` with 2 elements
  - `[int]` → `FatPointer(Collection { Int { I64, true } })`
  - `{str: int}` → `FatPointer(Collection { ... })`
  - `Set<int>` → `FatPointer(Collection { Int { I64, true } })`

- Named type resolution test: Create a `Named` type pointing to a `Struct`, verify `canonical()` resolves through to the struct’s repr.

- Storage type equivalence test: For a Pool containing a representative sample of all constructible types, verify that `canonical(tag).to_llvm_type(ctx)` produces the same LLVM type as the existing `TypeInfo::storage_type()`. This is the gold standard: the new system must match the old system exactly before any optimizations run.

- Error on unresolved types test: Verify that `canonical()` on `Tag::Var`, `Tag::BoundVar`, `Tag::RigidVar`, `Tag::Scheme`, `Tag::Infer`, `Tag::SelfType` panics or returns an error.

- FatPointer layout test: Verify `FatRepr::Str` and `FatRepr::Collection` both produce `{i64, i64, ptr}` in LLVM, matching the existing collection layout.
01.10 Completion Checklist
- `ori_repr` crate compiles with `cargo check -p ori_repr`
- `#![deny(unsafe_code)]` in `ori_repr/src/lib.rs` (pure analysis crate; no unsafe needed)
- `//!` module doc on every `.rs` file in `ori_repr/src/` (required by hygiene rules)
- `///` doc on all `pub` types and functions (required by hygiene rules)
- No production source file exceeds 500 lines (tests.rs exempt)
- Tests in sibling `tests.rs` with `#[cfg(test)] mod tests;` in `lib.rs`; no inline test modules
- `MachineRepr` enum has variants for ALL type kinds: Int, Float, Bool, Char, Byte, Duration, Size, Ordering, Unit, Never, Struct, Enum, Tuple, RcPointer, FatPointer, Closure, Range, StackPromoted, OpaquePtr
- `ReprPlan` populates canonical representations for all reachable `Tag` variants:
  - Primitives (12): Int, Float, Bool, Str, Char, Byte, Unit, Never, Error, Duration, Size, Ordering
  - Simple containers (7): List, Option, Set, Channel, Range, Iterator, DoubleEndedIterator
  - Two-child (3): Map, Result, Borrowed (reserved)
  - Complex (4): Function, Tuple, Struct, Enum
  - Named (3): Named, Applied, Alias (resolve-through)
  - Variables (3): Var, BoundVar, RigidVar (must be resolved or error)
  - Scheme/Special (5): Scheme, Projection, ModuleNs, Infer, SelfType (error if reached)
- `#repr` attributes (c, packed, transparent, aligned) are parsed and stored in ReprPlan
- Generic types handled correctly: all type variables resolved before canonical computation
- Salsa integration: ReprPlan computed imperatively, passed as `&ReprPlan` to codegen
- Migration Phase A complete: TypeLayoutResolver accepts optional ReprPlan, falls back to TypeInfoStore
- `TypeLayoutResolver` in `ori_llvm` reads from `ReprPlan` instead of hardcoded `Tag → LLVM` map
- Storage type equivalence test passes: canonical representations match existing TypeInfo for all types
- `./test-all.sh` green: zero behavioral changes (canonical reprs match existing hardcoded ones)
- `./clippy-all.sh` green
- Tracing output shows `ReprPlan query` events at `ORI_LOG=ori_repr=trace`
- No regressions in `./llvm-test.sh` or `cargo st`
Exit Criteria: ori_repr crate exists, ReprPlan is threaded through the entire LLVM codegen pipeline, all existing tests pass with identical behavior, and ORI_LOG=ori_repr=trace ori build tests/benchmarks/bench_small.ori shows ReprPlan query events for every type in the program.