Type Registry
What Is a Type Registry?
A type pool stores structural types — [int], (str) -> bool, (int, float). But a real programming language also has named types: type Point = { x: int, y: int }, type Option<T> = Some(T) | None. These user-defined types need a registry that maps names to definitions, tracks fields and variants, and provides lookup for the type checker during inference.
Beyond types, most languages have traits (or interfaces, typeclasses, protocols) — named collections of methods that types can implement. And methods — both inherent (impl Point { @distance ... }) and trait-based (impl Point: Printable { @to_str ... }) — need a resolution system that finds the right method for a given receiver type.
The type registry is the data structure that organizes all of this: type definitions, trait definitions, trait implementations, and method resolution. It is populated during Pass 0 of type checking and consulted throughout the remaining passes.
Nominal vs. Structural Typing
A fundamental decision in type system design is whether type identity is determined by name or by structure.
Structural typing (TypeScript, Go interfaces, OCaml structural objects): two types are compatible if they have the same shape. A function expecting { x: int, y: int } accepts any value with fields x: int and y: int, regardless of what the type is called. This is flexible but makes it hard to distinguish semantically different types with the same shape — is { cents: int } a price or a distance?
Nominal typing (Java, Rust, Ori): two types are compatible only if they have the same name. type Price = { cents: int } and type Distance = { cents: int } are different types even though they have identical fields. This is more restrictive but provides stronger guarantees about intent.
Ori uses nominal typing for all user-defined types. Even newtypes — type UserId = int — create a distinct type that is not interchangeable with the underlying representation. The .inner field provides access to the underlying value, and explicit construction (UserId(42)) is required. This trades convenience for safety: you cannot accidentally pass a Temperature where a Count is expected, even though both are int underneath.
Traits as Bounded Polymorphism
Traits serve two purposes in Ori’s type system:
-
Ad-hoc polymorphism — different types implement the same interface differently.
Printablemeans “has ato_strmethod,” but integers and lists implement it differently. -
Bounded generics — constraining type parameters:
@sort<T: Comparable>(list: [T]) -> [T]meansTmust support comparison. Without the bound, the function body couldn’t call comparison operators onT.
This is the typeclass pattern from Haskell (Wadler and Blott, 1989), adapted through Rust’s trait system. The key design choice is explicit implementation — a type doesn’t satisfy a trait just because it has the right methods. It must explicitly impl Type: Trait, making the relationship visible in the source code.
Three Registries
Ori uses three cooperating registries, each with a focused responsibility:
flowchart TB
typereg["TypeRegistry
struct/enum/newtype/alias
definitions + variant lookup"]
traitreg["TraitRegistry
trait definitions + impls
supertrait chains + object safety"]
methodreg["MethodRegistry
unified method lookup
(built-in + inherent + trait)"]
pool["Pool
type storage"]
methodreg --> traitreg
typereg --> pool
traitreg --> pool
classDef frontend fill:#1e3a5f,stroke:#60a5fa,color:#dbeafe
class typereg,traitreg,methodreg,pool frontend
TypeRegistry — User-Defined Types
pub struct TypeRegistry {
types_by_name: BTreeMap<Name, TypeEntry>, // Deterministic iteration
types_by_idx: FxHashMap<Idx, TypeEntry>, // Fast lookup by type Idx
variants_by_name: FxHashMap<Name, (Idx, usize)>, // Variant → (enum Idx, variant index)
}
BTreeMap for types_by_name ensures deterministic iteration order — important for reproducible error messages and stable test output. FxHashMap for types_by_idx provides fast lookup by pool index, the more common access pattern during inference.
TraitRegistry — Traits and Implementations
pub struct TraitRegistry {
traits: Vec<TraitEntry>,
traits_by_name: BTreeMap<Name, usize>,
traits_by_idx: FxHashMap<Idx, usize>,
impls: Vec<ImplEntry>,
impls_by_type: FxHashMap<Idx, Vec<usize>>, // Self type → impl indices
impls_by_trait: FxHashMap<Idx, Vec<usize>>, // Trait → impl indices
}
The dual indexing (impls_by_type and impls_by_trait) enables efficient lookup from both directions: “what traits does Point implement?” and “what types implement Printable?”
MethodRegistry — Unified Method Dispatch
pub struct MethodRegistry;
Currently a thin wrapper that delegates to TraitRegistry for trait-based method lookup. Built-in method resolution is handled separately by resolve_builtin_method() in the inference engine. The registry exists as an architectural placeholder for future unification of all method lookup paths into a single dispatch mechanism.
Type Definitions
TypeEntry
pub struct TypeEntry {
pub name: Name,
pub idx: Idx,
pub kind: TypeKind,
pub span: Span,
pub type_params: Vec<Name>,
pub visibility: Visibility,
}
Each entry carries the type’s name, its pool index, its kind (struct, enum, newtype, or alias), its source location for error reporting, generic type parameters, and visibility.
TypeKind
pub enum TypeKind {
Struct(StructDef),
Enum { variants: Vec<VariantDef> },
Newtype { underlying: Idx },
Alias { target: Idx },
}
Struct — a product type with named fields. StructDef contains Vec<FieldDef> where each field has a name, type Idx, span, and visibility. Fields can be individually public or private.
Enum — a sum type with named variants. Each VariantDef has a name and VariantFields:
pub enum VariantFields {
Unit, // None, True — no data
Tuple(Vec<Idx>), // Some(int) — positional fields
Record(Vec<FieldDef>), // Point(x: int, y: int) — named fields
}
Newtype — a distinct type wrapping an existing type. type UserId = int creates a Newtype with underlying = Idx::INT. The newtype has nominal identity (it’s not int), zero runtime cost (same representation), and provides .inner for unwrapping.
Alias — a transparent type synonym. Unlike newtypes, aliases are interchangeable with their target — type Ints = [int] means Ints and [int] are the same type. Aliases exist for readability, not for type safety.
Visibility
pub enum Visibility {
Public, // Visible everywhere
Private, // Visible only within the defining module
}
The default is Public. Types, fields, methods, and trait implementations each carry their own Visibility. A public struct can have private fields — the struct is constructible from other modules, but private fields must go through constructor functions.
Registration
Types are registered during Pass 0 of module checking. The caller creates the type in the pool first (obtaining an Idx), then registers the entry under both name and index:
impl TypeRegistry {
pub fn register_struct(&mut self, name: Name, idx: Idx, type_params: Vec<Name>,
fields: Vec<FieldDef>, span: Span, visibility: Visibility);
pub fn register_enum(&mut self, name: Name, idx: Idx, type_params: Vec<Name>,
variants: Vec<VariantDef>, span: Span, visibility: Visibility);
pub fn register_newtype(&mut self, name: Name, idx: Idx, type_params: Vec<Name>,
underlying: Idx, span: Span, visibility: Visibility);
pub fn register_alias(&mut self, name: Name, idx: Idx, type_params: Vec<Name>,
target: Idx, span: Span, visibility: Visibility);
}
Enum registration also populates variants_by_name — a reverse index from variant name to (enum Idx, variant position). This enables O(1) constructor resolution: when the type checker sees Some(42), it looks up Some in variants_by_name to find that it belongs to Option at variant index 0.
Lookup
impl TypeRegistry {
pub fn get_by_name(&self, name: Name) -> Option<&TypeEntry>;
pub fn get_by_idx(&self, idx: Idx) -> Option<&TypeEntry>;
pub fn contains(&self, name: Name) -> bool;
pub fn lookup_variant(&self, variant_name: Name) -> Option<(Idx, usize)>;
}
The variant lookup is critical for pattern matching. When the type checker processes match value { Some(n) -> ..., None -> ... }, it looks up each variant name to determine which enum it belongs to and what its discriminant index is (needed by LLVM codegen for switch instructions).
Trait Definitions
TraitEntry
pub struct TraitEntry {
pub name: Name,
pub idx: Idx,
pub type_params: Vec<Name>,
pub methods: FxHashMap<Name, TraitMethodDef>,
pub assoc_types: FxHashMap<Name, TraitAssocTypeDef>,
pub super_traits: Vec<Idx>,
pub object_safety_violations: Vec<ObjectSafetyViolation>,
pub span: Span,
}
super_traits lists the trait’s supertrait Idx values. Comparable has Eq as a supertrait — implementing Comparable requires also implementing Eq. The TraitRegistry provides all_super_traits(), which performs a transitive BFS walk to collect the full supertrait closure, used to verify that an impl satisfies all inherited obligations.
assoc_types tracks associated type declarations. A trait like Iterator has type Item — the type of elements the iterator produces. Implementations must specify the concrete type: impl Range: Iterator { type Item = int }.
Object Safety
Not every trait can be used as a trait object (dynamic dispatch via Trait as a type). The object_safety_violations field records why:
pub enum ObjectSafetyViolation {
SelfReturn { method: Name, span: Span }, // Returns Self
SelfParam { method: Name, span: Span }, // Self as non-receiver param
GenericMethod { method: Name, span: Span }, // Method has type params
}
A trait is object-safe if it has no violations. Examples:
- Object-safe:
Printable(@to_str(self) -> str),Debug,Hashable - Not object-safe:
Clone(@clone(self) -> Self— returnsSelf),Eq(@equals(self, other: Self)—Selfin non-receiver position),Iterator(associated typeItem)
Object safety is checked at trait definition time and cached in the entry, so later code can check object safety without re-analyzing the trait’s methods.
Default Type Parameters
Traits may have type parameters with defaults:
trait Add<Rhs = Self> {
type Output = Self
@add (self, rhs: Rhs) -> Self.Output
}
Default types are stored as ParsedType (not Idx) because Self must be resolved at impl registration time, not at trait definition time. When impl Add for int doesn’t specify Rhs, the default Self resolves to int.
Default Associated Types
When an impl omits an associated type that has a default, the default is used with Self resolved to the implementing type:
trait Add<Rhs = Self> {
type Output = Self // default: Output is Self
@add (self, rhs: Rhs) -> Self.Output
}
impl Add for int {
// Output defaults to int (Self = int)
@add (self, rhs: int) -> int = { ... }
}
Trait Implementations
ImplEntry
pub struct ImplEntry {
pub trait_idx: Option<Idx>, // None for inherent impls
pub self_type: Idx,
pub type_params: Vec<Name>,
pub methods: FxHashMap<Name, ImplMethodDef>,
pub assoc_types: FxHashMap<Name, Idx>,
pub where_clause: Vec<WhereConstraint>,
pub span: Span,
}
trait_idx: None marks inherent impls (impl Point { ... }) — methods defined directly on a type without going through a trait. Inherent methods take priority over trait methods in resolution.
Impl Lookup
impl TraitRegistry {
pub fn impls_for_type(&self, idx: Idx) -> &[usize];
pub fn impls_of_trait(&self, idx: Idx) -> &[usize];
pub fn get_trait_impl(&self, trait_name: Name, self_type: Idx) -> Option<&ImplEntry>;
}
The dual indexing supports both lookup directions efficiently:
- “Does Point implement Printable?” —
impls_for_type(Point_idx), then filter forPrintable - “What types implement Printable?” —
impls_of_trait(Printable_idx)
Method Resolution Order
When a method is called on a value, the type checker searches three layers in priority order:
- Built-in methods — Compiler-defined methods on primitive and container types. Dispatched on type tag + method name via
resolve_builtin_method(). Theori_registry::BUILTIN_TYPESarray serves as the manifest — each entry is aTypeDefcontaining the type name, its tag, and the full list of methods with their signatures and metadata:
pub static BUILTIN_TYPES: &[TypeDef] = &[
TypeDef { name: "Iterator", tag: TypeTag::Iterator, methods: &[
MethodDef { name: "all", ... },
MethodDef { name: "any", ... },
// ...
]},
// ... all built-in types with their methods
];
This registry is the single source of truth for all built-in methods across the type checker, evaluator, and LLVM backend. Sync tests in each consumer verify coverage against it.
-
Inherent methods —
impl Type { ... }blocks. These are the type’s “own” methods, not associated with any trait. -
Trait methods —
impl Type: Trait { ... }blocks. When multiple traits provide a method with the same name, the caller must use qualified syntax to disambiguate:Trait.method(value).
This ordering follows the principle that more specific bindings take priority. A type’s own methods shadow trait methods, and built-in methods (which are the primary interface for primitives) take highest priority.
Built-in Method Resolution
Built-in method resolution uses ori_registry::BUILTIN_TYPES as the single source of truth. The type checker’s resolve_builtin_method() function looks up (TypeTag, method_name) pairs in the registry to determine parameter types, return types, and generic instantiation. Per-type resolver functions (e.g., resolve_str_method()) were eliminated during the Type Strategy Registry migration — all built-in method metadata now lives in ori_registry.
Registration Order
The registration order during Pass 0 is load-bearing — dependencies must be resolved before dependents:
flowchart TB
builtin["1. Built-in types
Ordering, etc."] --> user
user["2. User-defined types
struct/enum/newtype/alias"] --> traits
traits["3. Trait definitions
methods + associated types"] --> impls
impls["4. Implementations
impl blocks linking types to traits"] --> derived
derived["5. Derived implementations
from #derive attributes"] --> config
config["6. Config variables
let $VAR constants"]
classDef frontend fill:#1e3a5f,stroke:#60a5fa,color:#dbeafe
class builtin,user,traits,impls,derived,config frontend
This ordering ensures:
- Trait definitions can reference user-defined types (traits come after types)
- Trait implementations can reference trait definitions (impls come after traits)
- Derived implementations can reference both user types and trait definitions (derived comes after impls)
- Config variables can reference any registered type (config comes last)
Error Suggestions
The registries support “did you mean?” suggestions via Levenshtein distance:
impl TypeRegistry {
pub fn suggest_similar(&self, name: Name) -> Vec<Name>;
}
When a user writes Optin<int>, the registry searches types_by_name for names within edit distance 2, finding Option as a suggestion. This produces actionable error messages: “unknown type Optin; did you mean Option?”
Prior Art
Haskell Typeclasses
Haskell’s typeclass system (Wadler and Blott, “How to make ad-hoc polymorphism less ad hoc”, 1989) established the pattern of explicit trait-like declarations with instance implementations. GHC’s typeclass resolution, including superclass constraints and default methods, directly influenced Ori’s trait system. The key difference is that Ori’s traits support Self return types (like Rust) while Haskell’s typeclasses traditionally use the type variable approach.
Rust Traits
Rust’s trait system is the most direct influence on Ori’s design. Rust’s resolution order (inherent > trait), object safety rules, associated types, and default method implementations all appear in Ori with minor variations. The main simplification is that Ori doesn’t have lifetime parameters or borrow checking, which significantly reduces the complexity of trait resolution.
Swift Protocols
Swift’s protocol system uses a similar nominal approach with explicit conformance declarations. Swift’s protocol witness tables (the runtime mechanism for dynamic dispatch) influenced the design discussion for Ori’s trait object representation, though Ori currently uses a simpler vtable approach.
Go Interfaces (Contrast)
Go’s interface system uses structural conformance — any type that has the right methods automatically satisfies the interface, without explicit declaration. This is more convenient but makes it harder to reason about intent. Ori chose explicit conformance (like Rust) for clarity: if a type implements Comparable, it’s because the author intentionally decided it has a meaningful total ordering, not because it happens to have a method named compare.
Design Tradeoffs
BTreeMap vs. FxHashMap for type names. BTreeMap provides deterministic iteration (alphabetical by name), which makes error messages and test output reproducible. FxHashMap would be faster for lookup-only workloads but would give non-deterministic iteration. Since type registration happens once and lookup happens many times, the BTreeMap overhead for registration is negligible, and lookup is still O(log n) — fast enough for the ~50–200 types in a typical module.
Variant-level indexing. The variants_by_name map enables O(1) constructor resolution but requires that variant names are unique across all enums in a module. This is a deliberate constraint — Ori does not allow two enums to have variants with the same name (unlike Rust, where each enum has its own namespace). The benefit is simpler resolution; the cost is that generic variant names like Error can only belong to one enum.
Separate registries vs. unified storage. Three registries (TypeRegistry, TraitRegistry, MethodRegistry) add complexity compared to a single unified registry. The separation keeps each registry focused — type lookup doesn’t need to filter out traits, and trait lookup doesn’t need to handle type alias resolution. This aligns with the single-responsibility principle but means cross-registry queries (e.g., “find all traits implemented by this type”) require coordination between registries.
Object safety at definition time. Checking object safety when the trait is defined (rather than when it’s used as a trait object) means violations are reported early, but it also means the check runs even for traits that are never used as trait objects. The early check is a small cost (one analysis per trait definition) for a significant UX benefit (errors at the definition site, not at distant use sites).
Explicit vs. implicit trait implementation. Requiring explicit impl Type: Trait is more verbose than Go’s implicit conformance but prevents accidental satisfaction. A type with a to_str method doesn’t automatically become Printable — the author must explicitly declare the intent. This catches bugs where a method happens to have the right name but the wrong semantics.