Annex E (informative) — System considerations

This annex describes implementation-level considerations for different target platforms and optimization levels.

Numeric Types

Integers

The int type is a signed integer with the following semantic range:

Property        Value
--------------  -----------------------------------
Canonical size  64 bits
Minimum         -9,223,372,036,854,775,808 (-2⁶³)
Maximum         9,223,372,036,854,775,807 (2⁶³ - 1)
Overflow        Panics (see Error Codes)

The canonical size defines the semantic range. The compiler may use a narrower machine representation (see § Representation Optimization).

There is no separate unsigned integer type. Bitwise operations treat the value as unsigned bits.
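
The range, overflow, and bitwise rules above can be modelled in Rust, whose i64 has the same 64-bit two's complement range. This is an illustrative sketch rather than Ori code: the helper names ori_add and bit_pattern are hypothetical, and the panic message is a placeholder.

```rust
/// Hypothetical model of `int` addition: 64-bit signed, panics on overflow.
fn ori_add(a: i64, b: i64) -> i64 {
    a.checked_add(b).expect("integer overflow")
}

/// The 64-bit pattern that bitwise operations act on, viewed as unsigned bits.
fn bit_pattern(x: i64) -> u64 {
    x as u64
}

fn main() {
    // The semantic range matches the table above.
    assert_eq!(i64::MAX, 9_223_372_036_854_775_807);
    assert_eq!(i64::MIN as i128, -(1i128 << 63));

    // In-range arithmetic behaves normally.
    assert_eq!(ori_add(i64::MAX - 1, 1), i64::MAX);

    // -1 is all ones at the bit level, so masking yields 0xFF.
    assert_eq!(bit_pattern(-1), u64::MAX);
    assert_eq!(-1_i64 & 0xFF, 0xFF);

    // Adding past the maximum panics instead of wrapping.
    let overflowed = std::panic::catch_unwind(|| ori_add(i64::MAX, 1));
    assert!(overflowed.is_err());
}
```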

Floats

The float type is an IEEE 754 double-precision floating-point number:

Property        Value
--------------  ---------------------------------
Canonical size  64 bits
Precision       ~15-17 significant decimal digits
Range           ±1.7976931348623157 × 10³⁰⁸

The canonical size defines the semantic precision. The compiler may use a narrower machine representation when it can prove no precision loss (see § Representation Optimization).

Special values inf, -inf, and nan are supported.
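
One way to read the "no precision loss" condition is as a round-trip test: a value may be stored in a narrower format only if widening it back gives the identical value. The Rust sketch below models this with f64 and f32. The helper fits_in_f32 is hypothetical, and it conservatively rejects nan because nan never compares equal to itself.

```rust
/// Hypothetical check: `x` survives a round trip through f32 unchanged.
/// NaN is rejected because NaN != NaN.
fn fits_in_f32(x: f64) -> bool {
    (x as f32) as f64 == x
}

fn main() {
    // The largest finite value matches the range in the table above.
    assert_eq!(f64::MAX, 1.7976931348623157e308);

    // Exactly representable in both formats.
    assert!(fits_in_f32(0.5));
    assert!(fits_in_f32(f64::INFINITY));
    assert!(fits_in_f32(f64::NEG_INFINITY));

    // 0.1 has a different nearest value in f32 than in f64.
    assert!(!fits_in_f32(0.1));
    // 2^24 + 1 needs more than 24 bits of significand.
    assert!(!fits_in_f32(16_777_217.0));
    // Too large for f32: the narrowing cast saturates to infinity.
    assert!(!fits_in_f32(1e300));

    // nan is a supported special value but is never equal to itself.
    assert!(f64::NAN != f64::NAN);
    assert!(!fits_in_f32(f64::NAN));
}
```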

Strings

Encoding

All strings are UTF-8 encoded. There is no separate ASCII or byte-string type.

let greeting = "Hello, 世界";  // UTF-8
let emoji = "🎉";              // UTF-8

Indexing

String indexing returns a single Unicode codepoint as a str:

let s = "héllo";
s[0];  // "h"
s[1]  // "é" (single codepoint)

The index refers to codepoint position, not byte position. Out-of-bounds indexing panics.
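
The difference between codepoint position and byte position can be illustrated in Rust, where strings are also UTF-8. This is a sketch of the semantics only: codepoint_at is a hypothetical helper, and its linear scan says nothing about how an implementation performs indexing.

```rust
/// Hypothetical model of `s[i]`: the i-th codepoint returned as a string,
/// panicking when the index is out of bounds.
fn codepoint_at(s: &str, index: usize) -> String {
    s.chars()
        .nth(index)
        .expect("string index out of bounds")
        .to_string()
}

fn main() {
    // "héllo" with a precomposed é (U+00E9), which takes 2 bytes in UTF-8.
    let s = "h\u{e9}llo";
    assert_eq!(s.len(), 6); // bytes
    assert_eq!(s.chars().count(), 5); // codepoints

    assert_eq!(codepoint_at(s, 0), "h");
    assert_eq!(codepoint_at(s, 1), "\u{e9}");
    // Codepoint index 2 is the first "l", although it starts at byte 3.
    assert_eq!(codepoint_at(s, 2), "l");

    // Index 5 is one past the last codepoint, so the lookup panics.
    let out_of_bounds = std::panic::catch_unwind(|| codepoint_at("h\u{e9}llo", 5));
    assert!(out_of_bounds.is_err());
}
```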

Grapheme Clusters

Some visual characters consist of multiple codepoints:

let astronaut = "🧑‍🚀";  // 3 codepoints: person + ZWJ + rocket
astronaut.chars().count();  // 3 (len(astronaut) would return 11, the byte count)
astronaut[0]          // "🧑"

For grapheme-aware operations, use standard library functions.

Length

len(str) returns the number of bytes, not codepoints. Use .chars().count() for codepoint count.

len("hello")  // 5 (5 bytes)
len("世界")    // 6 (each character is 3 UTF-8 bytes)
len("🧑‍🚀")    // 11 (multi-byte emoji ZWJ sequence: 4+3+4)
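
The byte counts above can be checked with a short Rust sketch, since Rust's str::len also counts UTF-8 bytes. The helper names byte_len and codepoint_count are hypothetical stand-ins for len(str) and .chars().count().

```rust
/// Models `len(str)`: the number of UTF-8 bytes.
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Models `.chars().count()`: the number of codepoints.
fn codepoint_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    assert_eq!(byte_len("hello"), 5);
    assert_eq!(codepoint_count("hello"), 5);

    // 世界: two codepoints, three bytes each.
    let world = "\u{4E16}\u{754C}";
    assert_eq!(byte_len(world), 6);
    assert_eq!(codepoint_count(world), 2);

    // Astronaut: person (U+1F9D1) + ZWJ (U+200D) + rocket (U+1F680).
    let astronaut = "\u{1F9D1}\u{200D}\u{1F680}";
    let per_codepoint: Vec<usize> = astronaut.chars().map(char::len_utf8).collect();
    assert_eq!(per_codepoint, vec![4, 3, 4]);
    assert_eq!(byte_len(astronaut), 11);
    assert_eq!(codepoint_count(astronaut), 3);
}
```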

Collections

Limits

Collections have no fixed size limits. Maximum size is bounded by available memory.

Collection  Limit
----------  ------
List        Memory
Map         Memory
String      Memory

Capacity

Implementations may pre-allocate capacity for performance. This is not observable behavior.

Recursion

Tail Call Optimization

Tail calls are guaranteed to be optimized. A tail call does not consume stack space:

@countdown (n: int) -> void =
    if n <= 0 then void else countdown(n: n - 1);  // tail call

countdown(n: 1000000)  // does not overflow stack

A call is in tail position if it is the last operation before the function returns.

Non-Tail Recursion

Non-tail recursive calls consume stack space. Deep recursion may cause stack overflow:

@sum_to (n: int) -> int =
    if n <= 0 then 0 else n + sum_to(n: n - 1);  // not tail call

sum_to(n: 1000000)  // may overflow stack

For deep recursion, use the recurse pattern with memo: true or convert to tail recursion.
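
Converting to tail recursion usually means threading the partial result through an accumulator parameter, so that the recursive call is the last operation. The Rust sketch below shows the idea for sum_to. The names sum_to_acc and sum_to_loop are hypothetical. Rust does not guarantee tail-call elimination, so the loop is included to show the constant-stack form that a guaranteed tail-call optimization makes the accumulator version equivalent to.

```rust
/// Accumulator-passing form: the recursive call is in tail position
/// because nothing remains to be done after it returns.
fn sum_to_acc(n: i64, acc: i64) -> i64 {
    if n <= 0 {
        acc
    } else {
        sum_to_acc(n - 1, acc + n)
    }
}

/// The same computation as a loop, using constant stack space.
fn sum_to_loop(mut n: i64) -> i64 {
    let mut acc = 0;
    while n > 0 {
        acc += n;
        n -= 1;
    }
    acc
}

fn main() {
    // Both forms agree on small inputs.
    assert_eq!(sum_to_acc(100, 0), 5050);
    assert_eq!(sum_to_loop(100), 5050);

    // The loop handles the deep case without growing the stack.
    assert_eq!(sum_to_loop(1_000_000), 500_000_500_000);
}
```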

Platform Support

Target Platforms

Conforming implementations should support:

  • Linux (x86-64, ARM64)
  • macOS (x86-64, ARM64)
  • Windows (x86-64)
  • WebAssembly (WASM)

Endianness

Byte order is implementation-defined. Programs should not depend on endianness unless using platform-specific byte manipulation.
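
A program avoids depending on the platform's byte order by naming the byte order explicitly whenever it converts between integers and bytes. The Rust sketch below illustrates the principle with little-endian encoding. The helpers encode_le and decode_le are hypothetical and do not correspond to any documented Ori library function.

```rust
/// Encodes a 64-bit integer with an explicit little-endian byte order.
fn encode_le(x: i64) -> [u8; 8] {
    x.to_le_bytes()
}

/// Decodes bytes produced by `encode_le`, on any platform.
fn decode_le(bytes: [u8; 8]) -> i64 {
    i64::from_le_bytes(bytes)
}

fn main() {
    // The least significant byte comes first, whatever the host byte order.
    assert_eq!(encode_le(1), [1, 0, 0, 0, 0, 0, 0, 0]);
    assert_eq!(encode_le(0x0102), [2, 1, 0, 0, 0, 0, 0, 0]);

    // The round trip is lossless.
    assert_eq!(decode_le(encode_le(-42)), -42);

    // Big-endian is the mirror image.
    assert_eq!(1_i64.to_be_bytes(), [0, 0, 0, 0, 0, 0, 0, 1]);
}
```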

Path Separators

File paths use the platform-native separator. The standard library provides cross-platform path operations.

Implementation Limits

Implementations may impose limits on the following aspects. Each limit must be at least the minimum shown:

Aspect               Minimum Required
-------------------  ----------------
Identifier length    1024 characters
Nesting depth        256 levels
Function parameters  255
Generic parameters   64

Exceeding an implementation's limit is a compile-time error.
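
As a rough illustration, a compiler front end might keep its limits in one structure and check each construct against it. The Rust sketch below uses the minimums from the table. The Limits struct, the MINIMUM_LIMITS constant, and check_limit are hypothetical names, and the error text is a placeholder rather than a specified diagnostic.

```rust
/// Limits an implementation enforces; these are the required minimums.
struct Limits {
    identifier_length: usize,
    nesting_depth: usize,
    function_parameters: usize,
    generic_parameters: usize,
}

const MINIMUM_LIMITS: Limits = Limits {
    identifier_length: 1024,
    nesting_depth: 256,
    function_parameters: 255,
    generic_parameters: 64,
};

/// Hypothetical check: reports an error when `actual` exceeds `limit`.
fn check_limit(aspect: &str, actual: usize, limit: usize) -> Result<(), String> {
    if actual > limit {
        Err(format!("{} is {}, which exceeds the limit of {}", aspect, actual, limit))
    } else {
        Ok(())
    }
}

fn main() {
    let limits = &MINIMUM_LIMITS;

    // Identifier length is counted in characters, not bytes.
    let name = "x".repeat(1024);
    assert!(check_limit("identifier length", name.chars().count(), limits.identifier_length).is_ok());

    // Values at the limit are accepted; one past the limit is rejected.
    assert!(check_limit("nesting depth", 256, limits.nesting_depth).is_ok());
    assert!(check_limit("nesting depth", 257, limits.nesting_depth).is_err());
    assert!(check_limit("function parameters", 255, limits.function_parameters).is_ok());
    assert!(check_limit("generic parameters", 65, limits.generic_parameters).is_err());
}
```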

Representation Optimization

The compiler may optimize the machine representation of any type, provided the optimization preserves semantic equivalence. An optimization is semantically equivalent if no conforming program can distinguish the optimized representation from the canonical one through any language-level operation.

Canonical Representations

Type      Canonical                       Semantic Range
--------  ------------------------------  ------------------------------------
int       64-bit signed two’s complement  [-2⁶³, 2⁶³ - 1]
float     64-bit IEEE 754 binary64        ±1.8 × 10³⁰⁸, ~15-17 digits
bool      1-bit                           true or false
byte      8-bit unsigned                  [0, 255]
char      32-bit Unicode scalar           U+0000–U+10FFFF excluding surrogates
Ordering  Tri-state                       Less, Equal, Greater

Permitted Optimizations

Permitted optimizations include but are not limited to:

  • Narrowing primitive machine types (bool → i1, byte → i8, char → i32, Ordering → i8)
  • Enum discriminant narrowing (i8 for ≤256 variants)
  • All-unit enum payload elimination
  • Sum type shared payload slots (Result<T, E> uses max(sizeof(T), sizeof(E)))
  • ARC operation elision for transitively trivial types
  • Newtype representation erasure
  • Struct field reordering for alignment
  • Integer narrowing based on value range analysis
  • Float narrowing when precision loss is provably zero
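
To make integer narrowing concrete, the Rust sketch below stores a value that range analysis has (by assumption) proven to lie in [0, 255] in a single byte, and widens it back to 64 bits before any arithmetic. NarrowInt and its methods are hypothetical. The point is only that results and overflow behavior follow the 64-bit semantic type, not the 8-bit storage.

```rust
/// Hypothetical narrowed storage for an `int` proven to lie in [0, 255].
struct NarrowInt(u8);

impl NarrowInt {
    /// Stores a value; the range proof is modelled as a checked conversion.
    fn store(value: i64) -> NarrowInt {
        NarrowInt(u8::try_from(value).expect("range analysis must prove 0..=255"))
    }

    /// Widens back to the canonical 64-bit value.
    fn load(&self) -> i64 {
        i64::from(self.0)
    }
}

/// Arithmetic is done at the semantic width, so 8-bit wrapping never shows.
fn add(a: &NarrowInt, b: &NarrowInt) -> i64 {
    a.load().checked_add(b.load()).expect("integer overflow")
}

fn main() {
    let a = NarrowInt::store(200);
    let b = NarrowInt::store(100);

    // The storage really is one byte.
    assert_eq!(std::mem::size_of::<NarrowInt>(), 1);

    // Stored and retrieved values are identical.
    assert_eq!(a.load(), 200);

    // 200 + 100 would wrap to 44 in 8 bits; the semantic result is 300.
    assert_eq!(add(&a, &b), 300);
}
```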

Guarantees

  1. The semantic range of every type is always preserved
  2. Overflow behavior is determined by the semantic type, not the machine representation
  3. Values stored and retrieved through any language operation are identical
  4. debug() and print() display semantic values
  5. x == y and hash(x) == hash(y) relationships are representation-independent
  6. Type classification for reference counting is determined by type containment, not representation size (see Memory Model § Type Classification)
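
Guarantee 5 can be pictured as equality and hashing that are defined over the semantic value, never over the stored bits. In the Rust sketch below, the same integer may be held in a narrow or a wide form, yet both compare equal and hash identically. IntRepr, semantic, and hash_of are hypothetical names used for illustration only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One semantic `int`, held in either of two machine representations.
enum IntRepr {
    Narrow(u8),
    Wide(i64),
}

impl IntRepr {
    /// The canonical 64-bit value, independent of representation.
    fn semantic(&self) -> i64 {
        match self {
            IntRepr::Narrow(v) => i64::from(*v),
            IntRepr::Wide(v) => *v,
        }
    }
}

impl PartialEq for IntRepr {
    fn eq(&self, other: &Self) -> bool {
        self.semantic() == other.semantic()
    }
}

impl Eq for IntRepr {}

impl Hash for IntRepr {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the semantic value so both representations agree.
        self.semantic().hash(state);
    }
}

fn hash_of(value: &IntRepr) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let narrow = IntRepr::Narrow(7);
    let wide = IntRepr::Wide(7);

    assert!(narrow == wide);
    assert_eq!(hash_of(&narrow), hash_of(&wide));
    assert!(IntRepr::Narrow(7) != IntRepr::Wide(8));
}
```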

Non-Guarantees

  1. The exact machine representation of any type is unspecified
  2. Memory layout may differ between compiler versions and target platforms
  3. Struct field order in memory may differ from declaration order

NOTE For the full specification including optimization tiers, cross-cutting invariants, and interaction with #repr attributes, see Representation Optimization Proposal.

ARC Runtime

This section specifies the runtime support for reference-counted heap objects in AOT-compiled programs.

NOTE The ARC runtime ABI is not stable. Heap object layout and runtime function signatures may change between compiler versions. This section applies to the AOT compilation target only; the interpreter and JIT may use different representations.

Heap Object Layout

A reference-counted heap object has the following layout:

+───────────────────+───────────────────────────+
| strong_count: i64 | data bytes ...            |
+───────────────────+───────────────────────────+
^                   ^
base (data_ptr - 8) data_ptr

The data_ptr returned by allocation points to the data area, not to the header. The strong count is stored at data_ptr - 8. Minimum alignment is 8 bytes.

The data pointer may be passed to foreign functions without adjustment.
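
The pointer arithmetic implied by this layout can be sketched in Rust. The example builds a block by hand, with an 8-byte count followed by an 8-byte payload, and reads both parts back through a data pointer. HEADER_SIZE, strong_count, and demo are hypothetical names, and the hand-built block stands in for a real allocation.

```rust
/// Size of the strong-count header that precedes the data area.
const HEADER_SIZE: usize = 8;

/// Reads the strong count stored immediately before `data_ptr`.
///
/// Safety: `data_ptr` must point 8 bytes past the start of a live block
/// laid out as described above.
unsafe fn strong_count(data_ptr: *const u8) -> i64 {
    unsafe { *(data_ptr.sub(HEADER_SIZE) as *const i64) }
}

/// Builds a block by hand (count = 1, payload = 42) and reads both back.
fn demo() -> (i64, i64) {
    // An i64 array provides suitably aligned storage for the header.
    let mut block: [i64; 2] = [1, 42];
    let base = block.as_mut_ptr() as *mut u8;
    unsafe {
        // The data pointer skips the header; callers only ever see this.
        let data_ptr = base.add(HEADER_SIZE);
        let count = strong_count(data_ptr);
        let payload = *(data_ptr as *const i64);
        (count, payload)
    }
}

fn main() {
    assert_eq!(demo(), (1, 42));
}
```

Because the data pointer already points past the header, code that knows nothing about the count, such as a foreign function, can read the payload directly.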

Runtime Functions

All runtime functions use the C calling convention (extern "C").

Function      Signature                                       Description
------------  ----------------------------------------------  --------------------------------------------------------------------------
ori_rc_alloc  (size: usize, align: usize) -> *mut u8          Allocate size + 8 bytes, initialize strong count to 1, return data pointer
ori_rc_inc    (data_ptr: *mut u8)                             Increment the strong count
ori_rc_dec    (data_ptr: *mut u8, drop_fn: fn(*mut u8))       Decrement the strong count; if zero, call drop_fn
ori_rc_free   (data_ptr: *mut u8, size: usize, align: usize)  Deallocate from data_ptr - 8 with total size size + 8
ori_rc_count  (data_ptr: *const u8) -> i64                    Return the current strong count (diagnostic use only)

Drop Functions

Each reference type has a compiler-generated drop function with signature extern "C" fn(*mut u8). The drop function:

  1. Decrements reference counts of any reference-typed child fields (calling ori_rc_dec for each)
  2. Calls ori_rc_free(data_ptr, size, align) to release the allocation

If the type implements the Drop trait, Drop.drop is called before step 1.
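
The interaction between the runtime functions and drop functions can be sketched as a small Rust model. It is not the actual runtime. It assumes a non-atomic count, handles alignments up to 8 only, and does not model the Drop trait hook. The FREED counter, the DropFn alias, block_layout, drop_leaf, drop_node, and demo are hypothetical additions for the demonstration.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::mem::size_of;
use std::sync::atomic::{AtomicUsize, Ordering};

const HEADER: usize = 8; // bytes reserved for strong_count
const PTR_SIZE: usize = size_of::<*mut u8>();
static FREED: AtomicUsize = AtomicUsize::new(0); // demo-only bookkeeping

type DropFn = unsafe extern "C" fn(*mut u8);

// Sketch limitation: only alignments up to 8 are handled.
fn block_layout(size: usize, align: usize) -> Layout {
    assert!(align <= 8, "sketch supports align <= 8 only");
    Layout::from_size_align(size + HEADER, 8).expect("invalid layout")
}

unsafe extern "C" fn ori_rc_alloc(size: usize, align: usize) -> *mut u8 {
    unsafe {
        let base = alloc(block_layout(size, align));
        assert!(!base.is_null(), "allocation failed");
        (base as *mut i64).write(1); // strong count starts at 1
        base.add(HEADER)
    }
}

unsafe extern "C" fn ori_rc_inc(data_ptr: *mut u8) {
    unsafe { *(data_ptr.sub(HEADER) as *mut i64) += 1 }
}

unsafe extern "C" fn ori_rc_dec(data_ptr: *mut u8, drop_fn: DropFn) {
    unsafe {
        let count = data_ptr.sub(HEADER) as *mut i64;
        *count -= 1;
        if *count == 0 {
            drop_fn(data_ptr);
        }
    }
}

unsafe extern "C" fn ori_rc_free(data_ptr: *mut u8, size: usize, align: usize) {
    unsafe { dealloc(data_ptr.sub(HEADER), block_layout(size, align)) }
    FREED.fetch_add(1, Ordering::SeqCst);
}

unsafe extern "C" fn ori_rc_count(data_ptr: *const u8) -> i64 {
    unsafe { *(data_ptr.sub(HEADER) as *const i64) }
}

/// Leaf holding one i64: no reference-typed fields, so only step 2 applies.
unsafe extern "C" fn drop_leaf(data_ptr: *mut u8) {
    unsafe { ori_rc_free(data_ptr, 8, 8) }
}

/// Node whose single field is a reference to a leaf.
unsafe extern "C" fn drop_node(data_ptr: *mut u8) {
    unsafe {
        let child = *(data_ptr as *const *mut u8);
        ori_rc_dec(child, drop_leaf); // step 1: release child references
        ori_rc_free(data_ptr, PTR_SIZE, 8); // step 2: release the allocation
    }
}

/// Returns (leaf count while shared, leaf count after node drop, blocks freed).
fn demo() -> (i64, i64, usize) {
    let freed_before = FREED.load(Ordering::SeqCst);
    unsafe {
        let leaf = ori_rc_alloc(8, 8);
        (leaf as *mut i64).write(42);
        let node = ori_rc_alloc(PTR_SIZE, 8);
        (node as *mut *mut u8).write(leaf); // node takes the initial reference
        ori_rc_inc(leaf); // a second owner keeps the leaf alive
        let shared = ori_rc_count(leaf);
        ori_rc_dec(node, drop_node); // node reaches zero and is dropped
        let after_node = ori_rc_count(leaf);
        ori_rc_dec(leaf, drop_leaf); // last reference: leaf is dropped
        (shared, after_node, FREED.load(Ordering::SeqCst) - freed_before)
    }
}

fn main() {
    assert_eq!(demo(), (2, 1, 2));
}
```

Dropping the node releases only its own reference to the leaf. The leaf is freed later, when its last remaining owner calls ori_rc_dec.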

Built-in Type Representations

Type          Representation
------------  ---------------------------------------------------
str           { len: i64, data: *const u8 }
[T]           { len: i64, cap: i64, data: *mut u8 }
Option<T>     { tag: i8, value: T } (tag 0 = None, 1 = Some)
Result<T, E>  { tag: i8, value: max(T, E) } (tag 0 = Ok, 1 = Err)
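
For readers who think in terms of C-compatible structs, these representations could be mirrored in Rust roughly as follows. This is a sketch under assumptions. It takes a 64-bit target, reads max(T, E) as a shared payload slot modelled by a union, and assumes C layout rules for padding. The type names (OriStr, OriList, OriOption, ResultPayload, OriResult) and the ok_value helper are hypothetical, and the ABI is explicitly unstable.

```rust
use std::mem::size_of;

#[repr(C)]
struct OriStr {
    len: i64,
    data: *const u8,
}

#[repr(C)]
struct OriList {
    len: i64,
    cap: i64,
    data: *mut u8,
}

#[repr(C)]
struct OriOption<T> {
    tag: i8, // 0 = None, 1 = Some
    value: T,
}

/// Shared payload slot: as large as the larger of T and E.
#[repr(C)]
union ResultPayload<T: Copy, E: Copy> {
    ok: T,
    err: E,
}

#[repr(C)]
struct OriResult<T: Copy, E: Copy> {
    tag: i8, // 0 = Ok, 1 = Err
    value: ResultPayload<T, E>,
}

/// Reads the Ok payload only when the tag says it is present.
fn ok_value(r: &OriResult<i64, u8>) -> Option<i64> {
    if r.tag == 0 {
        Some(unsafe { r.value.ok })
    } else {
        None
    }
}

fn main() {
    // Sizes below assume a 64-bit target with 8-byte pointers.
    assert_eq!(size_of::<OriStr>(), 16);
    assert_eq!(size_of::<OriList>(), 24);
    // 1-byte tag, 7 bytes of padding, 8-byte value.
    assert_eq!(size_of::<OriOption<i64>>(), 16);
    // The payload slot is max(sizeof(i64), sizeof(u8)) = 8 bytes.
    assert_eq!(size_of::<ResultPayload<i64, u8>>(), 8);
    assert_eq!(size_of::<OriResult<i64, u8>>(), 16);

    let text = "hello";
    let s = OriStr { len: text.len() as i64, data: text.as_ptr() };
    assert_eq!(s.len, 5);
    assert!(!s.data.is_null());

    let some = OriOption { tag: 1, value: 7_i64 };
    assert_eq!((some.tag, some.value), (1, 7));

    let ok = OriResult::<i64, u8> { tag: 0, value: ResultPayload { ok: 9 } };
    let err = OriResult::<i64, u8> { tag: 1, value: ResultPayload { err: 3 } };
    assert_eq!(ok_value(&ok), Some(9));
    assert_eq!(ok_value(&err), None);
}
```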