Section 02: List COW Operations
Context: Currently, ori_list_push_new() and all other list mutation functions in ori_rt unconditionally allocate a new buffer, copy all elements, perform the mutation, and return the new list. This makes every push O(n). For a loop that builds a list of N elements, the total cost is O(N²). With COW, the same loop is O(N) — identical to a mutable Vec<T> in Rust.
Reference implementations:
- Swift
Array.swift:_makeMutableAndUnique()→_buffer.beginCOWMutation()→ branch. Inline fast path, outlined slow path. - Roc
list.zig:listAppend(list, elem, update_mode)—update_modeis eitherInPlaceorImmutable, determined by the compiler’s alias analysis. - Lean 4: Reset/Reuse pattern — if RC==1, reuse memory; if shared, allocate fresh.
Depends on: Section 01 (uniqueness check API, capacity management, growth strategy).
02.1 COW Push (Append)
File(s): compiler/ori_rt/src/lib.rs
Push is the highest-impact single operation. Most list construction patterns are repeated appends.
-
Replace
ori_list_push_newwith COW-awareori_list_push: (2026-02-27, implemented asori_list_push_cowwith consuming semantics — sret ABI matching current codegen pattern. Uses RC-allocated data buffers. Fast path: unique+capacity=in-place O(1), unique+growth=realloc. Slow path: shared/empty=allocate+copy+dec-old. Codegen wiring deferred to §02.7)/// Appends `elem` to `list`. If the list's data buffer is uniquely owned /// and has sufficient capacity, writes in place (O(1)). Otherwise, /// allocates a new buffer with grown capacity, copies old elements, /// writes the new element, and decrements the old buffer's RC. /// /// Returns the (possibly new) OriList by value. /// /// Element RC: The element's RC is NOT incremented — the caller is /// responsible for ensuring the element is owned (Perceus handles this). #[unsafe(no_mangle)] pub extern "C" fn ori_list_push( list: OriList, elem: *const u8, elem_size: usize, elem_align: usize, ) -> OriList { let new_len = list.len + 1; if !list.data.is_null() && ori_rc_is_unique(list.data as *const u8) { // FAST PATH: unique owner if list.cap >= new_len { // Has capacity — write in place unsafe { let dst = list.data.add((list.len as usize) * elem_size); std::ptr::copy_nonoverlapping(elem, dst, elem_size); } return OriList { len: new_len, cap: list.cap, data: list.data }; } else { // Needs growth — realloc (may extend in place) let new_cap = next_capacity(list.cap as usize, new_len as usize); let new_data = ori_rc_realloc( list.data, (list.cap as usize) * elem_size, new_cap * elem_size, elem_align, ); unsafe { let dst = new_data.add((list.len as usize) * elem_size); std::ptr::copy_nonoverlapping(elem, dst, elem_size); } return OriList { len: new_len, cap: new_cap as i64, data: new_data }; } } else { // SLOW PATH: shared or empty — allocate new let new_cap = next_capacity(0, new_len as usize); let new_data = ori_rc_alloc(new_cap * elem_size, elem_align); if !list.data.is_null() { // Copy old elements unsafe { std::ptr::copy_nonoverlapping( list.data as *const u8, new_data, (list.len as usize) * elem_size, ); } // Decrement old buffer (we no longer reference it) // NOTE: Only dec the buffer RC, NOT the elements — they're shared ori_rc_dec_no_children(list.data as *mut u8, ...); } unsafe { let dst = new_data.add((list.len as usize) * elem_size); std::ptr::copy_nonoverlapping(elem, dst, elem_size); } return OriList { len: new_len, cap: new_cap as i64, data: new_data }; } } -
Slow path element RC handling: When copying elements from a shared list to a new buffer, each element that is itself RC’d must be incremented (the new buffer now also references them). The codegen (§02.7) must emit element-wise
ori_rc_inccalls afterori_list_pushon the slow path. (2026-02-26, design documented; runtime uses byte-copy; codegen-side RC inc deferred to §02.7)Design decision — who increments element RCs:
(a) Runtime function increments elements (recommended):
- Pro: Single call from codegen, simpler emission
- Pro: Runtime knows element layout via
elem_size/elem_align - Con: Runtime needs element RC info (a function pointer or strategy enum)
(b) Codegen increments elements:
- Pro: Codegen already knows element types
- Con: More complex emission, larger generated code
- Con: Duplicated logic across operations
Recommended: Option (a) — pass a
drop_fn-style callback for element RC operations. The runtime calls it for each copied element on the slow path. This matches the existing drop function pattern inarc_emitter/drop_gen.rs. -
Thread safety note: The uniqueness check uses
Relaxedordering. This is safe because: (2026-02-26, documented inori_rc_is_uniquedoc comment — §01.1)- If RC==1, no other thread holds a reference (by definition)
- If another thread is racing with
ori_rc_inc, the value is being shared, andRelaxedmay see either 1 or 2 — if it sees 1, the inc hasn’t happened yet, so we’re still the sole owner; if it sees 2, we correctly take the slow path Release/Acquireis only needed for the dec path (to ensure visibility of writes before deallocation)
-
Unit tests (Rust): (2026-02-27, 5 tests in
ori_rt/src/tests.rs: cow_push_to_empty_sentinel, cow_push_unique_with_capacity, cow_push_unique_needs_growth, cow_push_shared_list_copies, cow_push_1000_sequential_amortized. All pass with zero leaks verified via RC_LIVE_COUNT.)- Push to empty list (sentinel) → allocates, len=1, cap=MIN_CAPACITY
- Push to unique list with capacity → in-place, same data pointer
- Push to unique list without capacity → realloc, new data pointer, doubled cap
- Push to shared list (RC=2) → new allocation, old untouched
- 1000 sequential pushes → amortized O(1), ~10 reallocations
-
AOT integration test (
compiler/ori_llvm/tests/aot/): (2026-02-27,test_coll_list_pushand push_empty/push_multi tests in collections_ext.rs)@test tests { let list = [1, 2, 3] let list = list.push(4) assert_eq(list, [1, 2, 3, 4]) assert_eq(list.length(), 4) }
02.2 COW Pop
File(s): compiler/ori_rt/src/lib.rs
Pop removes and returns the last element. With COW, if unique, we simply decrement len (the element is still in the buffer but inaccessible). If shared, we copy all-but-last.
-
Replace
ori_list_pop_newwith COW-awareori_list_pop: (2026-02-27, implemented asori_list_pop_cowwith consuming semantics. Fast path: unique=decrement len O(1), capacity retained. Slow path: shared=allocate+copy len-1+dec old. Empty list returns sentinel.)/// Removes the last element from the list. Returns the shortened list. /// If unique: decrements len (O(1), element stays in buffer but is logically removed). /// If shared: allocates new buffer with len-1 elements, copies, decs old. /// /// The popped element must be extracted by the caller BEFORE calling pop /// (via list.last() or index access). Pop only shortens the list. #[unsafe(no_mangle)] pub extern "C" fn ori_list_pop( list: OriList, elem_size: usize, elem_align: usize, ) -> OriList { assert!(list.len > 0, "pop on empty list"); let new_len = list.len - 1; if ori_rc_is_unique(list.data as *const u8) { // FAST PATH: just shrink len // NOTE: We do NOT free the popped element here — the caller // extracted it and owns it. We just shrink the logical size. OriList { len: new_len, cap: list.cap, data: list.data } } else { // SLOW PATH: allocate new, copy len-1 elements // ... (similar to push slow path but copies fewer elements) } } -
Capacity reclamation: When a unique list’s
lendrops belowcap / 4, consider shrinking. But this is a tradeoff — shrinking prevents memory waste but causes reallocation. Decision: Do NOT auto-shrink. Let capacity grow but never auto-shrink. Users can explicitly calllist.compact()(future method) to reclaim. This matches Rust’sVecbehavior. (2026-02-27, implemented — fast path retains capacity, verified by cow_pop_to_empty_retains_buffer test) -
Unit tests: (2026-02-27, 4 tests in
ori_rt/src/tests.rs: cow_pop_unique_decrements_len, cow_pop_shared_copies, cow_pop_to_empty_retains_buffer, cow_pop_empty_list_returns_empty. All pass, zero leaks.)- Pop from unique list → same data pointer, len decremented
- Pop from shared list → new allocation, old untouched
- Pop to empty → len=0, data pointer still valid (capacity retained)
02.3 COW Set (Index Assignment)
File(s): compiler/ori_rt/src/lib.rs
Set replaces the element at a given index. With COW, if unique, we overwrite in place. If shared, we copy the whole list, then overwrite.
-
Replace
ori_list_set_newwith COW-awareori_list_set: (2026-02-27, implemented asori_list_set_cowwith consuming semantics. Fast path: unique=overwrite in-place O(1). Slow path: shared=copy entire list+overwrite in copy O(n). Out-of-bounds returns input unchanged. Codegen wiring deferred to §02.7)/// Replaces the element at `index` with `elem`. /// If unique: overwrites in place (O(1)). /// If shared: copies list, overwrites in copy (O(n)). /// /// Element RC: The OLD element at `index` must be decremented by the caller /// (codegen emits this). The new element is moved in (no inc needed). #[unsafe(no_mangle)] pub extern "C" fn ori_list_set( list: OriList, index: i64, elem: *const u8, elem_size: usize, elem_align: usize, ) -> OriList { assert!(index >= 0 && index < list.len, "index out of bounds"); if ori_rc_is_unique(list.data as *const u8) { // FAST PATH: overwrite in place unsafe { let dst = list.data.add((index as usize) * elem_size); std::ptr::copy_nonoverlapping(elem, dst, elem_size); } list // Return same list (data pointer unchanged) } else { // SLOW PATH: clone, then overwrite let new_data = ori_rc_alloc((list.cap as usize) * elem_size, elem_align); unsafe { std::ptr::copy_nonoverlapping( list.data as *const u8, new_data, (list.len as usize) * elem_size, ); let dst = new_data.add((index as usize) * elem_size); std::ptr::copy_nonoverlapping(elem, dst, elem_size); } // inc all elements (they're now referenced by new buffer too) // dec old buffer OriList { len: list.len, cap: list.cap, data: new_data } } } -
Unit tests: (2026-02-27, 3 tests in
ori_rt/src/tests.rs: cow_set_unique_overwrites_in_place, cow_set_shared_copies, cow_set_at_index_zero. All pass, zero leaks.)- Set on unique list → same data pointer, element replaced
- Set on shared list → new allocation, old list unchanged
- Set at index 0, middle, and last position
02.4 COW Insert & Remove
File(s): compiler/ori_rt/src/lib.rs
Insert shifts elements right; remove shifts elements left. With COW, if unique, we shift in place using memmove. If shared, we copy with the shift built into the copy.
-
Add
ori_list_insert: (2026-02-27, implemented asori_list_insert_cowwith consuming semantics. Fast path: unique+capacity=memmove right+write O(n). unique+growth=realloc+memmove+write. Slow path: shared=alloc+copy [0..idx]+elem+copy [idx..len]+dec old. Index 0..=len valid, OOB returns unchanged.)/// Inserts `elem` at `index`, shifting subsequent elements right. /// If unique and has capacity: memmove right + write (O(n) for shift, but /// no allocation — the n elements are already in cache). /// If shared: allocate new, copy [0..index], write elem, copy [index..len]. #[unsafe(no_mangle)] pub extern "C" fn ori_list_insert( list: OriList, index: i64, elem: *const u8, elem_size: usize, elem_align: usize, ) -> OriList { ... } -
Add
ori_list_remove: (2026-02-27, implemented asori_list_remove_cowwith consuming semantics. Fast path: unique=memmove left O(n), unique+empty=ori_rc_free. Slow path: shared=alloc+copy [0..idx]+[idx+1..len]+dec old. Caller extracts element before call.)/// Removes element at `index`, shifting subsequent elements left. /// If unique: memmove left (O(n) for shift, no allocation). /// If shared: allocate new, copy [0..index] + [index+1..len]. /// The removed element must be extracted by caller before this call. #[unsafe(no_mangle)] pub extern "C" fn ori_list_remove( list: OriList, index: i64, elem_size: usize, elem_align: usize, ) -> OriList { ... } -
Unit tests: (2026-02-27, 11 tests: 6 insert (beginning/middle/end unique, growth, shared, empty) + 5 remove (beginning/middle/end unique, last-element-frees, shared). All pass, zero leaks.)
- Insert at beginning, middle, end (unique and shared)
- Remove from beginning, middle, end (unique and shared)
- Insert that triggers growth (unique: realloc + shift, shared: fresh alloc)
- Insert/remove on single-element list
02.5 COW Concat (List Concatenation)
File(s): compiler/ori_rt/src/lib.rs
Concat (list1 + list2) appends all elements of list2 to list1. With COW, if list1 is unique and has capacity for list2, we extend in place.
-
Replace
ori_list_concat_newwith COW-awareori_list_concat: (2026-02-27, implemented asori_list_concat_cowwith consuming semantics for list1, borrowing list2. Fast path: list1 unique+capacity=memcpy list2 O(m). unique+growth=realloc+memcpy. Slow path: shared=alloc+copy both+dec old list1.)/// Concatenates list2 onto list1. /// If list1 is unique: /// - Has capacity: memcpy list2 elements into list1's buffer (O(m)) /// - No capacity: realloc list1 to fit, then memcpy (O(n+m)) /// If list1 is shared: /// - Allocate new buffer for n+m elements, copy both (O(n+m)) /// /// In all cases, elements of list2 need RC inc (they're now in list1 too). /// list2's buffer RC is NOT decremented — the caller manages list2's lifetime. #[unsafe(no_mangle)] pub extern "C" fn ori_list_concat( list1: OriList, list2: OriList, elem_size: usize, elem_align: usize, ) -> OriList { ... } -
Optimization: consume list2 when unique: If
list2is also uniquely owned and has no remaining references, we can move its elements without incrementing their RCs. This requires passing a flag or checking list2’s uniqueness too. (2026-02-28, already implemented:ori_list_concat_cowchecksdata2_unique = ori_rc_is_unique(data2),copy_list2_elementsskipsinc_fnwhenlist2_unique=true,dec_consumed_list2frees buffer directly. 11 unit tests pass includingcow_concat_both_unique_no_incverifying zero inc_fn calls.)Decision: Check both lists’ uniqueness. Four cases:
- Both unique: move list2’s elements (no element RC changes), realloc list1 if needed
- list1 unique, list2 shared: copy list2’s elements (inc each), extend list1
- list1 shared, list2 unique: allocate new, copy list1 (inc each), move list2
- Both shared: allocate new, copy both (inc all elements)
-
Unit tests: (2026-02-27, 5 tests: cow_concat_unique_with_capacity, cow_concat_unique_needs_growth, cow_concat_shared_copies, cow_concat_empty_lists, cow_concat_empty_list1. All pass, zero leaks.)
- Concat where list1 has capacity → no realloc
- Concat where list1 needs growth → realloc
- Concat with shared list1 → new allocation
- Concat empty lists (sentinels)
- Concat large lists (1000+ elements)
- Concat list with itself (
list + list)
02.6 COW Reverse & Sort
File(s): compiler/ori_rt/src/lib.rs
Reverse and sort are in-place algorithms when the list is unique.
-
Add COW
ori_list_reverse: (2026-02-27, implemented asori_list_reverse_cowwith consuming semantics. Fast path: unique=swap loop in-place O(n). Slow path: shared=alloc+copy in reverse order+dec old. Single/empty elements returned unchanged.)/// Reverses the list. /// If unique: reverse in place using swap loop (O(n), no allocation). /// If shared: allocate new, copy in reverse order (O(n)). #[unsafe(no_mangle)] pub extern "C" fn ori_list_reverse( list: OriList, elem_size: usize, elem_align: usize, ) -> OriList { ... } -
Add COW
ori_list_sort: (2026-02-27, implemented asori_list_sort_cowwith consuming semantics. Uses index-sort + cycle-following permutation. Fast path: unique=sort indices+permute in-place O(n log n). Slow path: shared=sort indices+copy in sorted order+dec old. compare_fn has C ABI.)/// Sorts the list using the provided comparison function. /// If unique: sort in place (O(n log n), no allocation). /// If shared: copy, then sort the copy (O(n) copy + O(n log n) sort). /// /// The comparison function has signature: (a: *const u8, b: *const u8) -> i32 /// returning negative (a < b), zero (a == b), positive (a > b). #[unsafe(no_mangle)] pub extern "C" fn ori_list_sort( list: OriList, elem_size: usize, elem_align: usize, compare_fn: extern "C" fn(*const u8, *const u8) -> i32, ) -> OriList { ... } -
Sort algorithm choice: Uses Rust’s
Vec::sort_unstable_by(pattern-defeating quicksort) on an index array, then applies the permutation. Unstable sort by default — no allocation beyond the index and visited arrays. (2026-02-27) -
Stable sort: Add
sort_stablemethod using Rust’sVec::sort_by(TimSort). Same COW pattern as unstable sort, but preserves equal-element order. (2026-02-28, extracted sharedlist_sort_cow_implhelper withstable: boolflag. Wired through all 8 pipeline locations: TYPECK, resolve_list_method, MethodNames, evaluator dispatch, EVAL_BUILTIN_METHODS, ARC CONSUMING_RECEIVER_METHOD_NAMES, LLVM dispatch+emitter, runtime_decl+AOT_ONLY list. Tests: 3 runtime unit tests (stability, all-equal, shared COW), 2 AOT integration, 6 Ori spec tests.) -
Unit tests: (2026-02-27, 12 tests: 5 reverse (unique even/odd, shared, single, empty) + 7 sort (unique, shared, already-sorted, reverse-sorted, duplicates, single, empty). All pass, zero leaks.)
- Reverse unique list → same data pointer, elements reversed
- Reverse shared list → new allocation, original unchanged
- Sort unique list → same data pointer, elements sorted
- Sort shared list → new allocation
- Sort already-sorted list (best case)
- Sort reverse-sorted list (worst case for naive quicksort)
- Sort with duplicates
- Sort empty and single-element lists
02.7 LLVM Codegen Updates
File(s): compiler/ori_llvm/src/codegen/arc_emitter/builtins/collections.rs, compiler/ori_llvm/src/codegen/runtime_decl/mod.rs
All existing list method emitters must be updated to call the new COW runtime functions instead of the old _new variants.
-
Update
emit_list_pushto callori_list_push_cowinstead ofori_list_push_new: (2026-02-27, passes elem_size + elem_align, extracts cap field for COW fast path)- Pass
elem_sizeandelem_alignas additional arguments - Handle element RC on the slow path (emit inc loop for RC-typed elements)
- Pass
-
Update
emit_list_pop→ read-onlylast(): (2026-02-27) pop() in Ori’s value semantics acts as a read-only accessor returningOption<T>(same as last()). LLVM codegen routes pop toemit_list_last()withborrow: true. Evaluator wired: pop dispatches as alias for last indispatch_list_method. COWori_list_pop_cowpreserved with#[expect(dead_code)]for future mutation syntax. Spec tests, AOT tests, Rosetta stack test updated. -
Update
emit_list_set→ori_list_set_cow: (2026-02-27, added TYPECK entry, wired dispatch, evaluator implementation with bounds checking)- Emit
ori_rc_decfor the OLD element before the set call - The new element is moved in (no inc)
- Emit
-
Update
emit_list_concat→ori_list_concat_cow: (2026-02-27, passes cap1, uses elem_align; also updated+operator in operators.rs) -
Wire emitters for
insert,remove,sort: (2026-02-27, all wired: insert/remove/sort dispatch in mod.rs, TYPECK entries added, evaluator sort uses compare_values with error propagation, LLVM sort generates per-type compare thunks via get_or_create_compare_thunk — supports int/float/bool/char/byte/str) -
Element RC in slow path: (2026-02-27) Added
inc_fn: Option<extern "C" fn(*mut u8)>callback to all 8 COW functions inori_rt. Runtime callsinc_fnon each byte-copied element during slow path. Codegen generates per-typeelem_inc_fn(mirrorselem_dec_fnpattern) viaget_or_generate_elem_inc_fn()inArcIrEmitter. Concat special-cases: inc list2 elements on ALL paths (always byte-copied), inc list1 elements only on slow path. Set/insert skip the overwritten/inserted index. Scalar types pass null (no-op). -
Update runtime_decl/tests.rs to verify new function signatures: (2026-02-27, all COW functions were already declared in RT_FUNCTIONS from Section 01;
declared_functions_covered_by_jit_or_aot_onlytest passes)
02.8 Completion Checklist
-
ori_list_push— unique list with capacity: same pointer, O(1) (2026-02-27, verified:cow_push_unique_with_capacityasserts same data pointer) -
ori_list_push— unique list without capacity: realloc, doubled cap (2026-02-27, verified:cow_push_unique_needs_growthasserts cap >= 8) -
ori_list_push— shared list: new allocation, old untouched (2026-02-27, verified:cow_push_shared_list_copiesasserts different pointer + old data intact) -
ori_list_push— empty (sentinel): allocates MIN_CAPACITY (2026-02-27, verified:cow_push_to_empty_sentinelasserts cap >= 4) -
ori_list_pop— unique: same pointer, len decremented (2026-02-27, verified:cow_pop_unique_decrements_lenasserts same pointer + len=2) -
ori_list_pop— shared: new allocation (2026-02-27, verified:cow_pop_shared_copiesasserts different pointer) -
ori_list_set— unique: in-place overwrite (2026-02-27, verified:cow_set_unique_overwrites_in_placeasserts same pointer) -
ori_list_set— shared: copy + overwrite (2026-02-27, verified:cow_set_shared_copiesasserts different pointer) -
ori_list_insert— unique with capacity: memmove + write (2026-02-27, verified:cow_insert_unique_at_beginning/middle/endassert same pointer) -
ori_list_insert— shared: new allocation with element inserted (2026-02-27, verified:cow_insert_shared_copiesasserts different pointer) -
ori_list_remove— unique: memmove left (2026-02-27, verified:cow_remove_unique_at_beginning/middle/endassert same pointer) -
ori_list_remove— shared: new allocation without element (2026-02-27, verified:cow_remove_shared_copiesasserts different pointer) -
ori_list_concat— unique with capacity: memcpy append (2026-02-27, verified:cow_concat_unique_with_capacityasserts same pointer) -
ori_list_concat— shared: new allocation with both lists (2026-02-27, verified:cow_concat_shared_list1_unique_list2asserts different pointer) -
ori_list_reverse— unique: in-place reverse (2026-02-27, verified:cow_reverse_unique_in_placeasserts same pointer + reversed elements) -
ori_list_sort— unique: in-place sort (2026-02-27, verified:cow_sort_unique_in_placeasserts same pointer + sorted elements) - All LLVM codegen emitters updated to use COW functions (2026-02-27)
- Element RC handled correctly on slow path (inc on copy, dec on replace) (2026-02-27)
- 1000-push benchmark: O(N) total, not O(N²) (2026-02-28,
test_coll_list_push_loop_1000AOT test passes — 1000 pushes with correct first/last values) - Valgrind clean: no leaks, no use-after-free, no double-free (2026-02-28, all 7 COW paths clean under Valgrind — push, insert, remove, sort, reverse, concat, chained push+concat)
-
./test-all.shgreen (2026-02-28, 10429 passed / 0 failed — includes ARC consuming fixes) -
./clippy-all.shgreen (2026-02-27) - AOT integration tests for all operations (2026-02-28, 18 new tests in
collections_ext.rs: set, insert, remove, sort, push_loop_1000, 5 COW sharing tests, chained push_concat/concat_reverse)
Exit Criteria: A program that pushes 10,000 elements to a list completes in O(N) time (measurable via benchmark). Sharing a list and then mutating the copy triggers exactly one O(N) copy, after which subsequent mutations are O(1). Valgrind reports zero errors on all list COW test programs. ./test-all.sh and ./llvm-test.sh both pass.