0%

Section 06: Runtime Identity Fixes

Status: Not Started Goal: Fix runtime string concat and list concat so that identity operations (concatenating with an empty value) return the original value instead of allocating a copy. This is a prerequisite for Section 08 (algebra law adoption) and Section 09 (algebraic normalization) — the compiler cannot safely eliminate x + "" if the runtime’s ori_str_concat has different observable behavior (different allocation count) than just returning x.

Critical distinction (from Codex consensus round 3):

  • Strings borrow their inputsori_str_concat takes *const OriStr (borrowed). To return the original, we need rc_inc on the retained string’s heap data so it survives the caller’s rc_dec.
  • Lists consume their inputsori_list_concat_cow takes ownership. To return the original, we just transfer ownership (no rc_inc, no copy).

Success Criteria:

  • ORI_TRACE_RC=1 shows zero allocations for "hello" + "" and "" + "hello" (heap strings)
  • ORI_TRACE_RC=1 shows zero allocations for [] + [1, 2, 3] and [1, 2, 3] + []
  • SSO strings ("hi" + "") return SSO value directly (no RC operations needed)
  • Slice-backed strings (s.split("x")[0] + "") correctly retain via ori_str_rc_inc (slice-aware)
  • Slice-backed lists (list[1..3] + []) correctly transfer with slice cap preservation
  • ORI_CHECK_LEAKS=1 reports zero leaks for all identity operation test programs
  • Existing ori_rt tests updated to reflect new behavior (any tests that assert “owned copy” for empty operand)

Context: Currently, ori_str_concat (ops.rs:194-199) handles empty operands with return OriStr::from_bytes(a_bytes) which creates a new SSO or heap string — it does not return the original. This means x + "" and x produce different heap allocations with different reference counts. For the algebraic normalization pass (Section 09) to safely rewrite x + "" → x, the runtime must guarantee that x + "" and x are observationally equivalent (same backing storage, same RC topology).

Depends on: Nothing (can start immediately).


06.1 Fix ori_str_concat Identity

File(s): compiler/ori_rt/src/string/ops.rs (lines 193-199)

Replace the OriStr::from_bytes() copy path with an alias-preserving return for empty operands.

  • Write failing tests first in compiler/ori_rt/src/string/ops/tests.rs:

    • test_concat_empty_rhs_heap_preserves_pointer: heap string x + "" → result data pointer == original data pointer
    • test_concat_empty_lhs_heap_preserves_pointer: "" + x → result data pointer == x’s data pointer
    • test_concat_empty_rhs_sso_returns_sso: SSO string "hi" + "" → result is SSO (no heap alloc)
    • test_concat_empty_rhs_slice_preserves_slice: slice string + "" → result is slice-backed
  • Replace lines 194-199 in ops.rs:

    // OLD: return OriStr::from_bytes(a_bytes);
    // NEW:
    if b_len == 0 {
        // Return `a` as-is. Since ori_str_concat borrows inputs,
        // we must rc_inc the retained string's heap data so it
        // survives the caller's rc_dec of the original `a`.
        if a_ref.is_sso() {
            // SSO: return by value, no RC needed
            return *a_ref;
        } else {
            // Heap or slice: rc_inc the data pointer (slice-aware via cap)
            let heap = unsafe { &a_ref.heap };
            if !heap.data.is_null() {
                ori_str_rc_inc(heap.data, heap.cap); // 2 params: data_ptr + cap
            }
            return *a_ref;
        }
    }
    if a_len == 0 {
        // Same logic for empty `a`, retain `b`
        if b_ref.is_sso() {
            return *b_ref;
        } else {
            let heap = unsafe { &b_ref.heap };
            if !heap.data.is_null() {
                ori_str_rc_inc(heap.data, heap.cap); // 2 params: data_ptr + cap
            }
            return *b_ref;
        }
    }
  • Verify: ori_str_rc_inc(data_ptr, cap) signature — takes 2 params (confirmed: rc/mod.rs:343 takes *mut u8 + i64). The cap parameter enables slice-aware RC: negative cap values (SLICE_FLAG set) are handled correctly to find the original buffer’s header.

  • Verify: timeout 150 cargo t -p ori_rt passes

  • Update any existing tests that assert from_bytes behavior for empty operands

  • /tpr-review passed — independent review found no critical or major issues (or all findings triaged)

  • /impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.

  • Subsection close-out (06.1) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-06.1 retrospectivebuild/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 06.1: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.

  • /sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.

  • Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.


06.2 Fix ori_list_concat_cow Identity

File(s): compiler/ori_rt/src/list/cow_sort/mod.rs (lines 128-161)

The [] + x path (lines 143-161) already handles the unique case correctly (takeover, O(1)). Fix the shared/slice case to also be a true identity — transfer ownership rather than copying.

  • Write failing tests first in compiler/ori_rt/src/list/cow_sort/tests.rs (or compiler/ori_rt/src/tests.rs):

    • test_concat_empty_lhs_shared_transfers: [] + shared_list → result is the shared list (rc_inc, not copy)
    • test_concat_empty_lhs_slice_transfers: [] + slice_list → result preserves slice encoding
    • test_concat_empty_rhs_unique_noop: unique_list + [] → result is the list unchanged (already works)
  • Fix the shared/slice path for list1 empty (cow_sort/mod.rs ~line 143-161):

    // For [] + x: transfer ownership of x regardless of sharing status.
    // Since list concat CONSUMES both inputs, we just write x's {len, cap, data}
    // to out_ptr and do NOT dec x (the caller already transferred ownership).
    // We DO dec the empty list1 if it has storage.
    if len1 == 0 {
        // Write list2 fields to output
        write_list_fields(out_ptr, data2, len2, cap2);
        // Dec empty list1 if it owns storage (cap != 0 and not null)
        if cap1 != 0 && !data1.is_null() {
            ori_buffer_rc_dec(data1, std::ptr::null(), 0);
        }
        return;
    }
  • The x + [] path (lines 136-141) already returns list1 unchanged — verify it still works

  • Verify: timeout 150 cargo t -p ori_rt passes

  • Verify: ORI_CHECK_LEAKS=1 reports zero leaks for identity operations

  • /tpr-review passed — independent review found no critical or major issues (or all findings triaged)

  • /impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.

  • Subsection close-out (06.2) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-06.2 retrospectivebuild/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 06.2: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.

  • /sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.

  • Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.


06.3 Identity Operation Test Matrix

File(s): tests/spec/expressions/identity_operations.ori (NEW), compiler/ori_llvm/tests/aot/identity_ops.rs (NEW)

Comprehensive test matrix verifying identity operations produce correct results and correct RC behavior.

  • Create Ori spec test tests/spec/expressions/identity_operations.ori (NOTE: MUST include use std.testing { assert_eq } — not in prelude):
    • "hello" + "" == "hello" (str right identity)
    • "" + "hello" == "hello" (str left identity)
    • [1, 2, 3] + [] == [1, 2, 3] (list right identity)
    • [] + [1, 2, 3] == [1, 2, 3] (list left identity)
    • "" + "" == "" (both empty)
    • [] + [] == [] (both empty)
  • Create AOT test compiler/ori_llvm/tests/aot/identity_ops.rs:
    • Test RC balance: ORI_TRACE_RC=1 shows balanced inc/dec for identity ops
    • Test with different element types: [str], [Option<int>], [[int]] (nested)
    • Test with SSO vs heap strings
    • Run in both debug and release (cargo t -p ori_llvm + cargo t -p ori_llvm --release)
  • Create Valgrind test tests/valgrind/identity_ops.ori:
    • Exercise all identity patterns with Valgrind to catch use-after-free or leaks

Matrix dimensions:

  • Types: str (SSO), str (heap), str (slice), [int], [str], [Option<int>], [[int]] (nested list)

  • Patterns: x + empty, empty + x, empty + empty, x + empty where x is shared, x + empty where x is unique

  • Semantic pin: ORI_TRACE_RC=1 output for "hello" + "" shows zero allocations (only passes with the fix)

  • Negative pin: NOT testing float identity (excluded by design)

  • /tpr-review passed — independent review found no critical or major issues (or all findings triaged)

  • /impl-hygiene-review passed — hygiene review clean. MUST run AFTER /tpr-review is clean.

  • Subsection close-out (06.3) — MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-06.3 retrospectivebuild/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 06.3: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.

  • /sync-claude section-close doc sync — verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.

  • Repo hygiene check — run diagnostics/repo-hygiene.sh --check and clean any detected temp files.


06.R Third Party Review Findings

  • None.

06.N Completion Checklist

  • ori_str_concat returns alias for empty operands (heap: rc_inc, SSO: value copy, slice: slice-aware rc_inc)
  • ori_list_concat_cow transfers ownership for [] + x (no copy for shared/slice)
  • Identity test matrix covers all type × pattern combinations
  • ORI_TRACE_RC=1 confirms zero extra allocations
  • ORI_CHECK_LEAKS=1 reports zero leaks
  • Valgrind test passes
  • timeout 150 ./test-all.sh green (debug AND release — runtime identity affects AOT binaries, release must be tested)
  • Plan annotation cleanup
  • Plan sync — update plan metadata
    • Section 08 depends_on verified (runtime identity must land before stdlib law adoption)
  • /tpr-review passed
  • /impl-hygiene-review passed
  • /improve-tooling retrospective completed — MANDATORY at section close, after both reviews are clean. Reflect on the section’s debugging journey (which diagnostics/ scripts you ran, which command sequences you repeated, where you added ad-hoc dbg!/tracing calls, where output was hard to interpret) and identify any tool/log/diagnostic improvement that would have made this section materially easier OR that would help the next section touching this area. Implement every accepted improvement NOW (zero deferral) and commit each via SEPARATE /commit-push. The retrospective is mandatory even when nothing felt painful — that is exactly when blind spots accumulate. See .claude/skills/improve-tooling/SKILL.md “Retrospective Mode” for the full protocol.

Exit Criteria: ORI_TRACE_RC=1 ./binary shows zero allocations for all identity operations. Valgrind clean. All existing tests pass (updated where behavior changed). The runtime now implements algebraic identity correctly.