Annex E (informative) — System considerations

This annex describes implementation-level requirements and platform considerations for different target platforms and optimization levels.

Numeric Types

Integers

The int type is a signed integer with the following semantic range:

Property	Value
Canonical size	64 bits
Minimum	-9,223,372,036,854,775,808 (-2⁶³)
Maximum	9,223,372,036,854,775,807 (2⁶³ - 1)
Overflow	Panics (see Error Codes)

The canonical size defines the semantic range. The compiler may use a narrower machine representation (see § Representation Optimization).

There is no separate unsigned integer type. Bitwise operations treat the value as unsigned bits.
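The overflow and bitwise rules above can be modelled in Rust, which is used here only as an illustration language. This is a minimal sketch: the helper names `add_int`, `bits_of`, and `and_int` are hypothetical, and reading "unsigned bits" as "the raw 64-bit two's complement pattern" is an assumption.

```rust
/// Hypothetical helper: add two `int` values and panic on overflow
/// instead of wrapping, mirroring the "Overflow: Panics" row.
fn add_int(a: i64, b: i64) -> i64 {
    a.checked_add(b).expect("integer overflow")
}

/// Hypothetical helper: view an `int` as its raw 64-bit pattern.
/// Assumption: this is what "unsigned bits" means for bitwise operations.
fn bits_of(x: i64) -> u64 {
    x as u64
}

/// Bitwise AND computed on the raw bit patterns, then reinterpreted
/// as a signed `int`.
fn and_int(a: i64, b: i64) -> i64 {
    (bits_of(a) & bits_of(b)) as i64
}
```

Under this model, `add_int(i64::MAX, 1)` panics rather than wrapping to the minimum value, and `and_int(-1, 0xFF)` yields 255 because -1 is the all-ones pattern.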

Floats

The float type is an IEEE 754 double-precision floating-point number:

Property	Value
Canonical size	64 bits
Precision	~15-17 significant decimal digits
Range	±1.7976931348623157 × 10³⁰⁸

The canonical size defines the semantic precision. The compiler may use a narrower machine representation when it can prove no precision loss (see § Representation Optimization).

Special values inf, -inf, and nan are supported.
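The "no precision loss" condition for float narrowing can be illustrated with a round-trip test. The Rust sketch below is an assumption about how such a check might look for a single value; the function name `fits_in_f32` is hypothetical, and a real compiler would have to prove the property for every value a variable can hold, not test one value at run time.

```rust
/// Hypothetical check: can `x` be stored as a 32-bit float and widened
/// back to 64 bits without changing its value?
fn fits_in_f32(x: f64) -> bool {
    if x.is_nan() {
        // NaN never compares equal to itself, so the round-trip test
        // below cannot be used. This sketch conservatively refuses.
        return false;
    }
    // Narrow, widen again, and compare with the original.
    ((x as f32) as f64) == x
}
```

For example, 0.5 and `inf` survive the round trip, while 0.1 and 16777217.0 (2²⁴ + 1) do not.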

Strings

Encoding

All strings are UTF-8 encoded. There is no separate ASCII or byte-string type.

let greeting = "Hello, 世界";  // UTF-8
let emoji = "🎉";              // UTF-8

Indexing

String indexing returns a single Unicode codepoint as a str:

let s = "héllo";
s[0];  // "h"
s[1]  // "é" (single codepoint)

The index refers to codepoint position, not byte position. Out-of-bounds indexing panics.
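Codepoint indexing over UTF-8 storage can be sketched in Rust as follows. The function name `index_codepoint` is hypothetical. This version walks the string from the start, so it is linear in the index; the annex does not state the cost of indexing, and an implementation may do better.

```rust
/// Hypothetical model of `s[i]`: return the i-th codepoint as a string,
/// panicking when the index is out of bounds.
fn index_codepoint(s: &str, i: usize) -> String {
    // `chars()` decodes UTF-8 and yields one Unicode scalar value at a time,
    // so `nth(i)` counts codepoints, not bytes.
    match s.chars().nth(i) {
        Some(c) => c.to_string(),
        None => panic!("index {} out of bounds", i),
    }
}
```

With the string from the example above, index 1 yields "é" even though that codepoint starts at byte offset 1 and occupies two bytes.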

Grapheme Clusters

Some visual characters consist of multiple codepoints:

let astronaut = "🧑‍🚀";  // 3 codepoints: person + ZWJ + rocket
astronaut.chars().count();  // 3 codepoints (len(astronaut) would return 11 bytes)
astronaut[0]          // "🧑"

For grapheme-aware operations, use standard library functions.

Length

len(str) returns the number of bytes, not codepoints. Use .chars().count() for codepoint count.

len("hello")  // 5 (5 bytes)
len("世界")    // 6 (each character is 3 UTF-8 bytes)
len("🧑‍🚀")    // 11 (multi-byte emoji ZWJ sequence: 4+3+4)
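The byte counts above can be checked against any UTF-8 implementation. The Rust sketch below reproduces them; `byte_len` and `codepoint_count` are hypothetical names standing in for `len(str)` and `.chars().count()`.

```rust
/// Byte length of the UTF-8 encoding, which is what the text above
/// says `len(str)` returns.
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Number of codepoints, the counterpart of `.chars().count()`.
fn codepoint_count(s: &str) -> usize {
    s.chars().count()
}

/// The astronaut emoji written as explicit codepoints:
/// person (U+1F9D1) + zero-width joiner (U+200D) + rocket (U+1F680).
const ASTRONAUT: &str = "\u{1F9D1}\u{200D}\u{1F680}";
```

Here `byte_len(ASTRONAUT)` is 11 (4 + 3 + 4) while `codepoint_count(ASTRONAUT)` is 3, and neither equals the single grapheme cluster a reader sees.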

Collections

Limits

Collections have no fixed size limits. Maximum size is bounded by available memory.

Collection	Limit
List	Memory
Map	Memory
String	Memory

Capacity

Implementations may pre-allocate capacity for performance. This is not observable behavior.

Recursion

Tail Call Optimization

Tail calls are guaranteed to be optimized. A tail call does not consume stack space:

@countdown (n: int) -> void =
    if n <= 0 then void else countdown(n: n - 1);  // tail call

countdown(n: 1000000)  // does not overflow stack

A call is in tail position if it is the last operation before the function returns.

Non-Tail Recursion

Non-tail recursive calls consume stack space. Deep recursion may cause stack overflow:

@sum_to (n: int) -> int =
    if n <= 0 then 0 else n + sum_to(n: n - 1);  // not tail call

sum_to(n: 1000000)  // may overflow stack

For deep recursion, use the recurse pattern with memo: true or convert to tail recursion.
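Converting to tail recursion usually means threading an accumulator through the call. The Rust sketch below shows the accumulator form of `sum_to` and the loop that a guaranteed tail call is equivalent to. Rust itself does not guarantee tail call elimination, so the recursive form is only meant to show the shape of the transformation; the names `sum_to_acc` and `sum_to_loop` are hypothetical.

```rust
/// Accumulator form of `sum_to`: the recursive call is the last operation,
/// so it is in tail position.
/// NOTE: Rust does not guarantee tail call elimination, so call this with
/// small `n` only; it illustrates the shape, not the stack behavior.
fn sum_to_acc(n: i64, acc: i64) -> i64 {
    if n <= 0 {
        acc
    } else {
        sum_to_acc(n - 1, acc + n)
    }
}

/// The loop a guaranteed tail call is equivalent to: each "call" just
/// updates the parameters in place, so stack use is constant.
fn sum_to_loop(mut n: i64) -> i64 {
    let mut acc = 0;
    while n > 0 {
        acc += n;
        n -= 1;
    }
    acc
}
```

`sum_to_loop(1_000_000)` completes without growing the stack, which is the behavior the tail call guarantee gives the accumulator form in a conforming implementation.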

Platform Support

Target Platforms

Conforming implementations should support:

Linux (x86-64, ARM64)
macOS (x86-64, ARM64)
Windows (x86-64)
WebAssembly (WASM)

Endianness

Byte order is implementation-defined. Programs should not depend on endianness unless using platform-specific byte manipulation.

Path Separators

File paths use the platform-native separator. The standard library provides cross-platform path operations.

Implementation Limits

Implementations may impose limits on the following aspects. Each limit must be at least the minimum shown:

Aspect	Minimum Required
Identifier length	1024 characters
Nesting depth	256 levels
Function parameters	255
Generic parameters	64

Exceeding an implementation's limit is a compile-time error.

Representation Optimization

The compiler may optimize the machine representation of any type, provided the optimization preserves semantic equivalence. An optimization is semantically equivalent if no conforming program can distinguish the optimized representation from the canonical one through any language-level operation.

Canonical Representations

Type	Canonical	Semantic Range
`int`	64-bit signed two’s complement	[-2⁶³, 2⁶³ - 1]
`float`	64-bit IEEE 754 binary64	±1.8 × 10³⁰⁸, ~15-17 digits
`bool`	1-bit	`true` or `false`
`byte`	8-bit unsigned	[0, 255]
`char`	32-bit Unicode scalar	U+0000–U+10FFFF excluding surrogates
`Ordering`	Tri-state	`Less`, `Equal`, `Greater`

Permitted Optimizations

Permitted optimizations include but are not limited to:

Narrowing primitive machine types (bool → i1, byte → i8, char → i32, Ordering → i8)
Enum discriminant narrowing (i8 for ≤256 variants)
All-unit enum payload elimination
Sum type shared payload slots (Result<T, E> uses max(sizeof(T), sizeof(E)))
ARC operation elision for transitively trivial types
Newtype representation erasure
Struct field reordering for alignment
Integer narrowing based on value range analysis
Float narrowing when precision loss is provably zero
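A few of the size decisions in this list reduce to simple arithmetic. The Rust sketch below illustrates three of them under stated assumptions: all function names are hypothetical, the width ladder beyond 8 bits for discriminants is an assumption (the list only states the i8 case), and nothing here reflects how the actual compiler performs range analysis.

```rust
/// Hypothetical range analysis result: the narrowest signed machine width
/// (in bits) that can hold every value in `lo..=hi`.
fn narrowest_signed_bits(lo: i64, hi: i64) -> u32 {
    assert!(lo <= hi, "empty range");
    if lo >= i8::MIN as i64 && hi <= i8::MAX as i64 {
        8
    } else if lo >= i16::MIN as i64 && hi <= i16::MAX as i64 {
        16
    } else if lo >= i32::MIN as i64 && hi <= i32::MAX as i64 {
        32
    } else {
        64
    }
}

/// Discriminant width for an enum with `variants` variants.
/// The 8-bit case follows the list above; wider cases are assumed.
fn discriminant_bits(variants: u64) -> u32 {
    if variants <= 1u64 << 8 {
        8
    } else if variants <= 1u64 << 16 {
        16
    } else if variants <= 1u64 << 32 {
        32
    } else {
        64
    }
}

/// Shared payload slot size for a two-variant sum type:
/// max(sizeof(T), sizeof(E)).
fn shared_payload_size<T, E>() -> usize {
    std::mem::size_of::<T>().max(std::mem::size_of::<E>())
}
```

Whatever width is chosen, overflow must still be detected against the 64-bit semantic range, as the guarantees that follow state.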

Guarantees

The semantic range of every type is always preserved
Overflow behavior is determined by the semantic type, not the machine representation
Values stored and retrieved through any language operation are identical
debug() and print() display semantic values
x == y and hash(x) == hash(y) relationships are representation-independent
Type classification for reference counting is determined by type containment, not representation size (see Memory Model § Type Classification)

Non-Guarantees

The exact machine representation of any type is unspecified
Memory layout may differ between compiler versions and target platforms
Struct field order in memory may differ from declaration order

NOTE For the full specification including optimization tiers, cross-cutting invariants, and interaction with #repr attributes, see Representation Optimization Proposal.

ARC Runtime

This section specifies the runtime support for reference-counted heap objects in AOT-compiled programs.

NOTE The ARC runtime ABI is not stable. Heap object layout and runtime function signatures may change between compiler versions. This section applies to the AOT compilation target only; the interpreter and JIT may use different representations.

Heap Object Layout

A reference-counted heap object has the following layout:

+───────────────────+──────────────────────────+
| strong_count: i64 | data bytes ...           |
+───────────────────+──────────────────────────+
^                    ^
base (data_ptr - 8)  data_ptr

The data_ptr returned by allocation points to the data area, not to the header. The strong count is stored at data_ptr - 8. Minimum alignment is 8 bytes.

The data pointer may be passed to foreign functions without adjustment.
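The pointer arithmetic implied by this layout can be sketched in Rust with the standard allocator. This is an illustration only: the function names are hypothetical, the count is non-atomic, and the sketch fixes the alignment at 8 bytes.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Size of the `strong_count` header that precedes the data area.
const HEADER: usize = 8;

/// Allocate `size` data bytes behind an 8-byte header and return the
/// data pointer (not the base pointer).
unsafe fn alloc_with_header(size: usize) -> *mut u8 {
    let layout = Layout::from_size_align(size + HEADER, 8).expect("bad layout");
    unsafe {
        let base = alloc(layout);
        assert!(!base.is_null(), "allocation failed");
        (base as *mut i64).write(1); // strong_count starts at 1
        base.add(HEADER) // data_ptr = base + 8
    }
}

/// Read the strong count stored at `data_ptr - 8`.
unsafe fn strong_count(data_ptr: *const u8) -> i64 {
    unsafe { (data_ptr.sub(HEADER) as *const i64).read() }
}

/// Release the whole allocation, starting from the base pointer.
unsafe fn free_with_header(data_ptr: *mut u8, size: usize) {
    let layout = Layout::from_size_align(size + HEADER, 8).expect("bad layout");
    unsafe { dealloc(data_ptr.sub(HEADER), layout) }
}
```

Because the header sits before the data pointer, code that receives only `data_ptr` sees an ordinary buffer, which is why it can be handed to foreign functions unchanged.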

Runtime Functions

All runtime functions use the C calling convention (extern "C").

Function	Signature	Description
`ori_rc_alloc`	`(size: usize, align: usize) -> *mut u8`	Allocate `size + 8` bytes, initialize strong count to 1, return data pointer
`ori_rc_inc`	`(data_ptr: *mut u8)`	Increment the strong count
`ori_rc_dec`	`(data_ptr: *mut u8, drop_fn: fn(*mut u8))`	Decrement the strong count; if zero, call `drop_fn`
`ori_rc_free`	`(data_ptr: *mut u8, size: usize, align: usize)`	Deallocate from `data_ptr - 8` with total size `size + 8`
`ori_rc_count`	`(data_ptr: *const u8) -> i64`	Return the current strong count (diagnostic use only)

Drop Functions

Each reference type has a compiler-generated drop function with signature extern "C" fn(*mut u8). The drop function:

1. Decrements reference counts of any reference-typed child fields (calling ori_rc_dec for each)
2. Calls ori_rc_free(data_ptr, size, align) to release the allocation

If the type implements the Drop trait, Drop.drop is called before step 1.
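The interplay between the runtime functions and a drop function can be modelled in Rust. The sketch below is a single-threaded model, not the actual runtime: the function bodies are assumptions written from the descriptions in the table, the counts are non-atomic, alignments above 8 are rejected, and `Node`, `drop_node`, `new_node`, and the `DROPS` counter are hypothetical names introduced for illustration.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::sync::atomic::{AtomicUsize, Ordering};

type DropFn = extern "C" fn(*mut u8);
const HEADER: usize = 8;

fn layout_for(size: usize, align: usize) -> Layout {
    // Sketch limitation: an 8-byte header only keeps data aligned up to 8.
    assert!(align.is_power_of_two() && align <= 8);
    Layout::from_size_align(size + HEADER, 8).expect("bad layout")
}

extern "C" fn ori_rc_alloc(size: usize, align: usize) -> *mut u8 {
    unsafe {
        let base = alloc(layout_for(size, align));
        assert!(!base.is_null(), "allocation failed");
        (base as *mut i64).write(1); // strong count starts at 1
        base.add(HEADER)
    }
}

extern "C" fn ori_rc_inc(data_ptr: *mut u8) {
    unsafe { *(data_ptr.sub(HEADER) as *mut i64) += 1; }
}

extern "C" fn ori_rc_dec(data_ptr: *mut u8, drop_fn: DropFn) {
    unsafe {
        let count = data_ptr.sub(HEADER) as *mut i64;
        *count -= 1;
        if *count == 0 {
            drop_fn(data_ptr); // the drop function frees the allocation
        }
    }
}

extern "C" fn ori_rc_free(data_ptr: *mut u8, size: usize, align: usize) {
    unsafe { dealloc(data_ptr.sub(HEADER), layout_for(size, align)); }
}

extern "C" fn ori_rc_count(data_ptr: *const u8) -> i64 {
    unsafe { (data_ptr.sub(HEADER) as *const i64).read() }
}

/// Hypothetical reference type with one reference-typed child field.
#[repr(C)]
struct Node {
    child: *mut u8, // data pointer of another Node, or null
    value: i64,
}

/// Instrumentation for this sketch only: counts drop function calls.
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Model of a compiler-generated drop function for `Node`.
extern "C" fn drop_node(data_ptr: *mut u8) {
    unsafe {
        let node = data_ptr as *mut Node;
        // Step 1: release reference-typed child fields.
        if !(*node).child.is_null() {
            ori_rc_dec((*node).child, drop_node);
        }
        DROPS.fetch_add(1, Ordering::SeqCst);
        // Step 2: release this object's own allocation.
        ori_rc_free(data_ptr, std::mem::size_of::<Node>(), 8);
    }
}

/// Allocate a `Node`; the new object takes over the caller's reference
/// to `child`.
fn new_node(child: *mut u8, value: i64) -> *mut u8 {
    let p = ori_rc_alloc(std::mem::size_of::<Node>(), 8);
    unsafe { (p as *mut Node).write(Node { child, value }); }
    p
}
```

In this model, dropping the last reference to a parent node decrements its child; the child is freed only if that was also its last reference.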

Built-in Type Representations

Type	Representation
`str`	`{ len: i64, data: *const u8 }`
`[T]`	`{ len: i64, cap: i64, data: *mut u8 }`
`Option<T>`	`{ tag: i8, value: T }` (tag 0 = `None`, 1 = `Some`)
`Result<T, E>`	`{ tag: i8, value: max(T, E) }` (tag 0 = `Ok`, 1 = `Err`)
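For readers more familiar with C-style layouts, two rows of the table can be written as Rust `#[repr(C)]` structs. This is a sketch under assumptions: the struct and function names are hypothetical, `len` is assumed to count bytes, and field order and padding in the real compiler output may differ, since the ABI is explicitly not stable.

```rust
/// `str` as described in the table: a length and a pointer to UTF-8 bytes.
/// Assumption: `len` is a byte count.
#[repr(C)]
#[derive(Clone, Copy)]
struct OriStr {
    len: i64,
    data: *const u8,
}

/// `Option<int>` as described in the table: tag 0 = None, 1 = Some.
/// With `#[repr(C)]`, the tag is followed by padding up to the
/// alignment of `value`.
#[repr(C)]
#[derive(Clone, Copy)]
struct OriOptionInt {
    tag: i8,
    value: i64,
}

fn some_int(v: i64) -> OriOptionInt {
    OriOptionInt { tag: 1, value: v }
}

fn none_int() -> OriOptionInt {
    // The payload is ignored when the tag is 0; zero is an arbitrary filler.
    OriOptionInt { tag: 0, value: 0 }
}

/// Decode the tagged representation into a Rust `Option`.
fn decode(o: OriOptionInt) -> Option<i64> {
    if o.tag == 1 { Some(o.value) } else { None }
}

/// Build an `OriStr` view over a static string without copying.
fn view(s: &'static str) -> OriStr {
    OriStr { len: s.len() as i64, data: s.as_ptr() }
}
```

A consumer must always check `tag` before reading `value`, because the payload bytes carry no meaning for the `None` case.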