Journey #17 Complex

I am a captured fat pointer

Closure capturing a string (fat pointer) with method dispatch and ARC management

Score: 9.8 | Status: PASS | Expected: 10 | Overflow: PASS

What you'll learn

  • See how a fat pointer (str) is captured into a heap-allocated closure environment
  • Understand the complete lifecycle: alloc env, store captures, call, drop env
  • Observe SSO-aware RC dec sequences for captured strings
  • Compare closure calling convention with direct function calls

Score Breakdown

Tags: strings, arc, closures, capture, higher order

Journey 17: “I am a captured fat pointer”

Source

// Journey 17: "I am a captured fat pointer"
// Slug: fat-closure-capture
// Difficulty: complex
// Features: strings, arc, closures, capture, higher_order
// Expected: check_capture() = 10
// NOTE: This journey exposes a compiler bug -- closure capturing str
//       triggers unresolved type variable at codegen (Idx leak)

@check_capture () -> int = {
    let prefix = "hello";
    let f = s -> prefix.length() + s.length();
    f("world")
}

@main () -> int = check_capture();

Execution Results

Backend | Exit Code | Expected | Stdout | Stderr | Status
Eval | 10 | 10 | (none) | (none) | PASS
AOT | 10 | 10 | (none) | (none) | PASS

Compiler Pipeline

1. Lexer

The lexer (tokenizer) breaks raw source text into a stream of tokens — the smallest meaningful units like keywords, identifiers, operators, and literals.

Tokens: 62 | Keywords: 4 | Identifiers: 12 | Errors: 0

Token stream
Fn(@) Ident(check_capture) LParen RParen Arrow Ident(int) Eq
LBrace Let Ident(prefix) Eq Str("hello") Semi
Let Ident(f) Eq Ident(s) Arrow Ident(prefix) Dot Ident(length)
LParen RParen Plus Ident(s) Dot Ident(length) LParen RParen Semi
Ident(f) LParen Str("world") RParen RBrace
Fn(@) Ident(main) LParen RParen Arrow Ident(int) Eq
Ident(check_capture) LParen RParen Semi
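
As an illustration of what this stage does, here is a minimal tokenizer sketch in Rust. It is not the Ori lexer: the `Token` enum and the `tokenize` function are hypothetical names chosen to resemble the stream above, `@` is emitted as a plain `At` token, and only the characters used in this journey are handled.

```rust
// Hypothetical token type for illustration; not taken from the Ori compiler.
#[derive(Debug, PartialEq, Clone)]
pub enum Token {
    At,
    Let,
    Ident(String),
    Str(String),
    LParen,
    RParen,
    LBrace,
    RBrace,
    Arrow,
    Eq,
    Dot,
    Plus,
    Semi,
}

/// Split source text into tokens, skipping whitespace and `//` comments.
/// Returns `Err` with the offending character on unknown input.
pub fn tokenize(src: &str) -> Result<Vec<Token>, char> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '/' && chars.get(i + 1) == Some(&'/') {
            // Line comment: skip to the end of the line.
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
            }
        } else if c == '-' && chars.get(i + 1) == Some(&'>') {
            tokens.push(Token::Arrow);
            i += 2;
        } else if c == '"' {
            // String literal; escape sequences are not handled in this sketch.
            let start = i + 1;
            i = start;
            while i < chars.len() && chars[i] != '"' {
                i += 1;
            }
            if i == chars.len() {
                return Err('"'); // unterminated literal
            }
            tokens.push(Token::Str(chars[start..i].iter().collect()));
            i += 1; // step over the closing quote
        } else if c.is_alphabetic() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let word: String = chars[start..i].iter().collect();
            tokens.push(if word == "let" { Token::Let } else { Token::Ident(word) });
        } else {
            tokens.push(match c {
                '@' => Token::At,
                '(' => Token::LParen,
                ')' => Token::RParen,
                '{' => Token::LBrace,
                '}' => Token::RBrace,
                '=' => Token::Eq,
                '.' => Token::Dot,
                '+' => Token::Plus,
                ';' => Token::Semi,
                other => return Err(other),
            });
            i += 1;
        }
    }
    Ok(tokens)
}
```

Feeding it the lambda line `let f = s -> prefix.length() + s.length();` yields 17 tokens, the same number shown for that line in the stream above.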

2. Parser

The parser transforms the flat token stream into a hierarchical Abstract Syntax Tree (AST) — a tree structure that represents the grammatical structure of the program.

Nodes: 14 | Max depth: 4 | Functions: 2 | Errors: 0

AST (simplified)
Module
├─ FnDecl @check_capture
│  ├─ Params: ()
│  ├─ Return: int
│  └─ Body: Block
│       ├─ Let prefix = Str("hello")
│       ├─ Let f = Lambda(s)
│       │       └─ BinOp(+)
│       │            ├─ MethodCall(prefix, length, [])
│       │            └─ MethodCall(s, length, [])
│       └─ Call(f, [Str("world")])
└─ FnDecl @main
   ├─ Return: int
   └─ Body: Call(@check_capture, [])

3. Type Checker

The type checker verifies that all expressions have compatible types using Hindley-Milner type inference. It resolves type variables, checks constraints, and ensures type safety without requiring explicit type annotations everywhere.

Constraints: 12 | Types inferred: 6 | Unifications: 10 | Errors: 0

Inferred types
@check_capture () -> int = {
    let prefix: str = "hello";
    //                 ^ str (literal)
    let f: (str) -> int = s -> prefix.length() + s.length();
    //     ^ inferred: (str) -> int
    //       s: str (inferred from closure body)
    //       prefix.length(): int (str method)
    //       s.length(): int (str method)
    //       + : (int, int) -> int
    f("world")
    // ^ int (return type of f)
}

@main () -> int = check_capture()
//                ^ int (return type of @check_capture)
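
The core of Hindley-Milner inference is unification of type terms. The sketch below is a generic textbook-style unifier in Rust, not the Ori type checker: `Ty`, `Subst`, `resolve`, and `unify` are hypothetical names, and function types are limited to one parameter to keep it short. How Ori derives `s: str` from `s.length()` is not shown in the source, so the sketch only models the unification step.

```rust
use std::collections::HashMap;

// Hypothetical type representation for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    Int,
    Str,
    Var(u32),             // unresolved type variable
    Fn(Box<Ty>, Box<Ty>), // single-parameter function type
}

/// Substitution from type variables to types.
pub type Subst = HashMap<u32, Ty>;

/// Replace every bound variable in `ty` with its binding, recursively.
pub fn resolve(ty: &Ty, subst: &Subst) -> Ty {
    match ty {
        Ty::Var(v) => match subst.get(v) {
            Some(bound) => resolve(bound, subst),
            None => ty.clone(),
        },
        Ty::Fn(a, r) => Ty::Fn(Box::new(resolve(a, subst)), Box::new(resolve(r, subst))),
        _ => ty.clone(),
    }
}

/// True if variable `v` occurs inside `ty` (guards against infinite types).
fn occurs(v: u32, ty: &Ty, subst: &Subst) -> bool {
    match resolve(ty, subst) {
        Ty::Var(w) => v == w,
        Ty::Fn(a, r) => occurs(v, &a, subst) || occurs(v, &r, subst),
        _ => false,
    }
}

/// Make `a` and `b` equal by extending `subst`, or report a mismatch.
pub fn unify(a: &Ty, b: &Ty, subst: &mut Subst) -> Result<(), String> {
    let (a, b) = (resolve(a, subst), resolve(b, subst));
    match (&a, &b) {
        (Ty::Int, Ty::Int) | (Ty::Str, Ty::Str) => Ok(()),
        (Ty::Var(x), Ty::Var(y)) if x == y => Ok(()),
        (Ty::Var(v), other) | (other, Ty::Var(v)) => {
            if occurs(*v, other, subst) {
                return Err(format!("infinite type: ?{} in {:?}", v, other));
            }
            subst.insert(*v, other.clone());
            Ok(())
        }
        (Ty::Fn(a1, r1), Ty::Fn(a2, r2)) => {
            unify(a1, a2, subst)?;
            unify(r1, r2, subst)
        }
        _ => Err(format!("type mismatch: {:?} vs {:?}", a, b)),
    }
}
```

Modelling `f` as `Fn(?0, ?1)`, unifying `?1` with `Int` (the result of `+`) and then `f` with `Fn(Str, ?2)` (the call `f("world")`) resolves `f` to `Fn(Str, Int)`, which corresponds to the `(str) -> int` annotation above.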

4. Canonicalization

The canonicalizer transforms the typed AST into a simplified canonical form. It desugars syntactic sugar, lowers complex expressions, and prepares the IR for backend consumption.

Transforms: 15 | Desugared: 0 | Errors: 0

Key transformations
- Lambda lowered to canonical closure form with capture list [prefix]
- Method calls (prefix.length(), s.length()) resolved to str.length
- Function bodies lowered to canonical expression form
- Call arguments normalized to positional order

5. ARC Pipeline

The ARC (Automatic Reference Counting) pipeline analyzes value lifetimes and inserts reference counting operations. It performs borrow inference to minimize RC overhead — parameters that are only read can be borrowed rather than owned.

RC ops inserted: 5 | Elided: 0 | Net ops: 5

ARC annotations
@check_capture:
  +1 ori_rc_alloc (closure env)
  +2 ori_str_from_raw (creates two strings: "hello", "world")
  -1 ori_str_rc_dec on "world" str (runtime SSO-aware check)
  -1 ori_rc_dec on closure env (via drop_fn dispatch)
  Net: 3 inc, 2 dec -- ownership of "hello" transferred into closure env

@__lambda_check_capture_0:
  No RC ops -- borrows both captured str and parameter str

@partial_0_drop:
  -1 ori_rc_dec on captured str (SSO-aware via select)
  +1 ori_rc_free on env allocation
  Net: drop function, consumes ownership

@partial_1:
  No RC ops -- forwarding shim
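
The ownership pattern in these annotations (transfer into the env, borrow inside the lambda, release on env drop) can be observed with an analogy in Rust's standard `Rc`. This is only an analogy under stated assumptions: `Rc` is not the Ori runtime, and `capture_lifecycle` is a hypothetical function written for this illustration.

```rust
use std::rc::{Rc, Weak};

/// Returns the strong counts observed after capture, after the call, and
/// after dropping the closure, followed by the closure's result.
pub fn capture_lifecycle() -> (usize, usize, usize, usize) {
    let prefix: Rc<String> = Rc::new(String::from("hello"));
    // A weak handle lets us watch the strong count without owning the string.
    let observer: Weak<String> = Rc::downgrade(&prefix);

    // `move` transfers ownership of `prefix` into the closure: no increment.
    let f = move |s: &str| prefix.len() + s.len();
    let after_capture = observer.strong_count();

    // The call only borrows both strings, so the count is unchanged.
    let result = f("world");
    let after_call = observer.strong_count();

    // Dropping the closure releases the captured string.
    drop(f);
    let after_drop = observer.strong_count();

    (after_capture, after_call, after_drop, result)
}
```

The count stays at 1 through capture and call, then falls to 0 when the closure is dropped, which parallels the env teardown in `@partial_0_drop` releasing the captured string.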

Backend: Interpreter

The interpreter (eval path) executes the canonical IR directly, without compilation. It serves as the reference implementation for correctness testing.

Result: 10 | Status: PASS

Evaluation trace
@main()
  └─ @check_capture()
       ├─ let prefix = "hello"
       ├─ let f = <closure capturing prefix>
       └─ f("world")
            └─ prefix.length() + s.length()
                 ├─ "hello".length() = 5
                 ├─ "world".length() = 5
                 └─ 5 + 5 = 10
-> 10

Backend: LLVM Codegen

The LLVM backend compiles the canonical IR to LLVM IR, which is then compiled to native machine code via LLVM’s optimization and code generation pipeline. This path produces ahead-of-time compiled binaries.

ARC Pipeline

RC ops inserted: 5 | Elided: 0 | Net ops: 5

ARC annotations
@check_capture: +3 rc_inc (alloc env, 2x str_from_raw), +2 rc_dec (str_rc_dec, closure env) -- ownership transfer
@__lambda_check_capture_0: +0 rc_inc, +0 rc_dec (borrows only)
@partial_0_drop: +0 rc_inc, +1 rc_dec + 1 rc_free (teardown)
@partial_1: +0 rc_inc, +0 rc_dec (forwarding shim)
@_ori_drop$3: +0 rc_inc, +0 rc_dec, +1 rc_free (str data drop)

Generated LLVM IR

; ModuleID = '17-fat-closure-capture'
source_filename = "17-fat-closure-capture"

@ovf.msg = private unnamed_addr constant [29 x i8] c"integer overflow on addition\00", align 1
@str = private unnamed_addr constant [6 x i8] c"hello\00", align 1
@str.1 = private unnamed_addr constant [6 x i8] c"world\00", align 1

; Function Attrs: nounwind uwtable
; --- @check_capture ---
define fastcc noundef i64 @_ori_check_capture() #0 {
bb0:
  %sret.tmp1 = alloca { i64, i64, ptr }, align 8
  %sret.tmp = alloca { i64, i64, ptr }, align 8
  call void @ori_str_from_raw(ptr %sret.tmp, ptr @str, i64 5)
  %sret.load = load { i64, i64, ptr }, ptr %sret.tmp, align 8
  %env.data = call ptr @ori_rc_alloc(i64 32, i64 8)
  %env.drop_fn = getelementptr inbounds nuw { ptr, { i64, i64, ptr } }, ptr %env.data, i32 0, i32 0
  store ptr @_ori_partial_0_drop, ptr %env.drop_fn, align 8
  %env.cap.0 = getelementptr inbounds nuw { ptr, { i64, i64, ptr } }, ptr %env.data, i32 0, i32 1
  store { i64, i64, ptr } %sret.load, ptr %env.cap.0, align 8
  %partial_apply.1 = insertvalue { ptr, ptr } { ptr @_ori_partial_1, ptr undef }, ptr %env.data, 1
  call void @ori_str_from_raw(ptr %sret.tmp1, ptr @str.1, i64 5)
  %sret.load2 = load { i64, i64, ptr }, ptr %sret.tmp1, align 8
  %closure.fn_ptr = extractvalue { ptr, ptr } %partial_apply.1, 0
  %closure.env_ptr = extractvalue { ptr, ptr } %partial_apply.1, 1
  %icall.arg.tmp = alloca { i64, i64, ptr }, align 8
  store { i64, i64, ptr } %sret.load2, ptr %icall.arg.tmp, align 8
  %icall = call i64 %closure.fn_ptr(ptr %closure.env_ptr, ptr %icall.arg.tmp)
  %rc_dec.fat_data = extractvalue { i64, i64, ptr } %sret.load2, 2
  %rc_dec.fat_cap = extractvalue { i64, i64, ptr } %sret.load2, 1
  call void @ori_str_rc_dec(ptr %rc_dec.fat_data, i64 %rc_dec.fat_cap, ptr @"_ori_drop$3")
  %rc_dec.env = extractvalue { ptr, ptr } %partial_apply.1, 1
  %rc_dec.null.p2i = ptrtoint ptr %rc_dec.env to i64
  %rc_dec.null = icmp eq i64 %rc_dec.null.p2i, 0
  br i1 %rc_dec.null, label %rc_dec.skip, label %rc_dec.do

rc_dec.do:                                        ; preds = %bb0
  %rc_dec.drop_fn = load ptr, ptr %rc_dec.env, align 8
  call void @ori_rc_dec(ptr %rc_dec.env, ptr %rc_dec.drop_fn)  ; RC--
  br label %rc_dec.skip

rc_dec.skip:                                      ; preds = %rc_dec.do, %bb0
  ret i64 %icall
}

; Function Attrs: nounwind uwtable
; --- @main ---
define noundef i64 @_ori_main() #0 {
bb0:
  %call = call fastcc i64 @_ori_check_capture()
  ret i64 %call
}

; Function Attrs: nounwind uwtable
; --- @__lambda_check_capture_0 ---
define fastcc noundef i64 @_ori___lambda_check_capture_0(ptr noundef nonnull dereferenceable(24) %0, ptr noundef nonnull dereferenceable(24) %1) #0 {
bb0:
  %param.load = load { i64, i64, ptr }, ptr %1, align 8
  %str.len = call i64 @ori_str_len(ptr %0)
  %str.len1 = call i64 @ori_str_len(ptr %1)
  %add = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %str.len, i64 %str.len1)
  %add.val = extractvalue { i64, i1 } %add, 0
  %add.ovf = extractvalue { i64, i1 } %add, 1
  br i1 %add.ovf, label %add.ovf_panic, label %add.ok

add.ok:                                           ; preds = %bb0
  ret i64 %add.val

add.ovf_panic:                                    ; preds = %bb0
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable
}

; Function Attrs: cold nounwind uwtable
; --- @partial_0_drop ---
define void @_ori_partial_0_drop(ptr noundef %0) #4 {
entry:
  %cap.0.ptr = getelementptr inbounds nuw { ptr, { i64, i64, ptr } }, ptr %0, i32 0, i32 1
  %cap.0 = load { i64, i64, ptr }, ptr %cap.0.ptr, align 8
  %rc.data_ptr = extractvalue { i64, i64, ptr } %cap.0, 2
  %rc_str.p2i = ptrtoint ptr %rc.data_ptr to i64
  %rc_str.sso_flag = and i64 %rc_str.p2i, -9223372036854775808
  %rc_str.is_sso = icmp ne i64 %rc_str.sso_flag, 0
  %rc_str.is_null = icmp eq i64 %rc_str.p2i, 0
  %rc_str.skip_rc = or i1 %rc_str.is_sso, %rc_str.is_null
  %rc.str_safe_ptr = select i1 %rc_str.skip_rc, ptr null, ptr %rc.data_ptr
  call void @ori_rc_dec(ptr %rc.str_safe_ptr, ptr @"_ori_drop$3")  ; RC-- str
  call void @ori_rc_free(ptr %0, i64 32, i64 8)
  ret void
}

; Function Attrs: cold nounwind uwtable
; --- drop str ---
define void @"_ori_drop$3"(ptr noundef %0) #4 {
entry:
  call void @ori_rc_free(ptr %0, i64 24, i64 8)
  ret void
}

; Function Attrs: nounwind uwtable
; --- @partial_1 ---
define noundef i64 @_ori_partial_1(ptr noundef %0, ptr noundef %1) #0 {
entry:
  %cap.0.ptr = getelementptr inbounds nuw { ptr, { i64, i64, ptr } }, ptr %0, i32 0, i32 1
  %result = call fastcc i64 @_ori___lambda_check_capture_0(ptr %cap.0.ptr, ptr %1)
  ret i64 %result
}

Disassembly

_ori_check_capture:
  sub    $0x98,%rsp
  lea    str(%rip),%rsi
  lea    0x68(%rsp),%rdi
  mov    $0x5,%edx
  mov    %rdx,0x18(%rsp)
  call   ori_str_from_raw          ; create "hello"
  mov    0x68(%rsp),%rax           ; load str triple
  mov    %rax,0x10(%rsp)
  mov    0x70(%rsp),%rax
  mov    %rax,0x8(%rsp)
  mov    0x78(%rsp),%rax
  mov    %rax,(%rsp)
  mov    $0x20,%edi
  mov    $0x8,%esi
  call   ori_rc_alloc              ; alloc 32-byte env
  ; store drop_fn + captured str into env
  mov    (%rsp),%rdi
  mov    0x8(%rsp),%rsi
  mov    0x10(%rsp),%rcx
  mov    0x18(%rsp),%rdx
  mov    %rax,0x28(%rsp)
  lea    _ori_partial_0_drop(%rip),%r8
  mov    %r8,(%rax)               ; drop_fn at offset 0
  mov    %rdi,0x18(%rax)           ; str data_ptr
  mov    %rsi,0x10(%rax)           ; str cap
  mov    %rcx,0x8(%rax)            ; str len
  lea    _ori_partial_1(%rip),%rcx
  mov    %rcx,0x20(%rsp)           ; fn_ptr
  mov    %rax,0x48(%rsp)           ; env_ptr
  lea    str.1(%rip),%rsi
  lea    0x80(%rsp),%rdi
  call   ori_str_from_raw          ; create "world"
  ; indirect call through closure
  mov    0x20(%rsp),%rax           ; fn_ptr
  mov    0x28(%rsp),%rdi           ; env_ptr
  ; ... store world str, call, cleanup
  call   *%rax                     ; f("world")
  ; cleanup: ori_str_rc_dec on "world", ori_rc_dec on env
  call   ori_str_rc_dec
  cmp    $0x0,%rax                 ; null check env
  je     .skip
  call   ori_rc_dec                ; RC-- env
.skip:
  add    $0x98,%rsp
  ret

_ori_main:
  push   %rax
  call   _ori_check_capture
  pop    %rcx
  ret

_ori___lambda_check_capture_0:
  sub    $0x18,%rsp
  mov    %rsi,(%rsp)               ; save param ptr
  call   ori_str_len               ; prefix.length()
  mov    (%rsp),%rdi               ; restore param ptr
  mov    %rax,0x8(%rsp)            ; save prefix_len
  call   ori_str_len               ; s.length()
  mov    %rax,%rcx
  mov    0x8(%rsp),%rax
  add    %rcx,%rax                 ; prefix_len + s_len
  jo     .panic                    ; overflow check
  add    $0x18,%rsp
  ret

_ori_partial_0_drop:
  push   %rax
  mov    %rdi,%rax
  mov    %rax,(%rsp)
  mov    0x18(%rax),%rdi           ; load captured str data ptr
  ; SSO check (bit 63) + null check -> select -> ori_rc_dec
  movabs $0x8000000000000000,%rcx
  and    %rcx,%rax
  cmp    $0x0,%rax
  setne  %cl
  cmp    $0x0,%rdi
  sete   %al
  or     %al,%cl
  xor    %eax,%eax
  test   $0x1,%cl
  cmovne %rax,%rdi                 ; select: skip_rc ? null : data_ptr
  lea    _ori_drop$3(%rip),%rsi
  call   ori_rc_dec                ; RC-- captured str (null-safe)
  mov    (%rsp),%rdi
  mov    $0x20,%esi
  mov    $0x8,%edx
  call   ori_rc_free               ; free env (32 bytes, align 8)
  pop    %rax
  ret

_ori_drop$3:
  push   %rax
  mov    $0x18,%esi
  mov    $0x8,%edx
  call   ori_rc_free               ; free str data (24 bytes, align 8)
  pop    %rax
  ret

_ori_partial_1:
  push   %rax
  add    $0x8,%rdi                 ; GEP past drop_fn to captured str
  call   _ori___lambda_check_capture_0
  pop    %rcx
  ret

Deep Scrutiny

1. Instruction Purity

# | Function | Actual | Ideal | Ratio | Verdict
1 | @check_capture | 28 | 28 | 1.00x | OPTIMAL
2 | @main | 2 | 2 | 1.00x | OPTIMAL
3 | @__lambda_check_capture_0 | 10 | 9 | 1.11x | NEAR-OPTIMAL
4 | @partial_0_drop | 12 | 12 | 1.00x | OPTIMAL
5 | @_ori_drop$3 | 2 | 2 | 1.00x | OPTIMAL
6 | @partial_1 | 3 | 3 | 1.00x | OPTIMAL

The lambda has one dead instruction: %param.load = load { i64, i64, ptr }, ptr %1 loads the full 24-byte str triple but the result is never used. The function only needs ptr %1 for the ori_str_len call. LLVM’s dead code elimination will remove this in optimized builds, but it represents unnecessary work in debug mode. [LOW-1]

2. ARC Purity

Function | rc_inc | rc_dec | Balanced | Borrow Elision | Move Semantics
@check_capture | 3 | 2 | TRANSFER | N/A | 1 ownership transfer
@main | 0 | 0 | YES | N/A | N/A
@__lambda | 0 | 0 | YES | 2 borrows | N/A
@partial_0_drop | 0 | 1+free | TEARDOWN | N/A | consumes env
@_ori_drop$3 | 0 | 0+free | TEARDOWN | N/A | frees str data
@partial_1 | 0 | 0 | YES | 1 forward | N/A

Verdict: ARC is correctly balanced across the closure lifecycle. check_capture allocates the env (+1) and creates two strings (+2), then drops the “world” string via ori_str_rc_dec (-1) and the closure env via ori_rc_dec with drop_fn dispatch (-1). The “hello” string ownership is transferred into the closure env and released by partial_0_drop. The lambda borrows both strings (no RC ops) — excellent borrow elision. [NOTE-2]

3. Attributes & Calling Convention

Function | fastcc | nounwind | noalias | noundef | cold | Notes
@check_capture | YES | YES | N/A | YES | NO |
@main | NO (C) | YES | N/A | YES | NO | C ABI (entry)
@__lambda | YES | YES | N/A | YES | NO | nonnull+deref on params
@partial_0_drop | N/A | YES | N/A | YES | YES | Drop fn, correctly cold
@_ori_drop$3 | N/A | YES | N/A | YES | YES | str drop, correctly cold
@partial_1 | N/A | YES | N/A | YES | NO | Shim, correctly not cold

All 21 applicable attribute checks pass. The cold attribute on drop functions (partial_0_drop and _ori_drop$3) is correct — drop paths are infrequent. The nonnull dereferenceable(24) on lambda parameters correctly indicates the str triple layout. ori_str_rc_dec has memory(inaccessiblemem: readwrite) — correctly indicating it only touches RC metadata. 100% compliance. [NOTE-3]

4. Control Flow & Block Layout

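Function | Blocks | Empty Blocks | Redundant Branches | Phi Nodes | Notes
@check_capture | 3 | 0 | 0 | 0 | env null check
@main | 1 | 0 | 0 | 0 |
@__lambda | 3 | 0 | 0 | 0 | Overflow check
@partial_0_drop | 1 | 0 | 0 | 0 | Branchless via select
@_ori_drop$3 | 1 | 0 | 0 | 0 |
@partial_1 | 1 | 0 | 0 | 0 |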

The 3-block structure in check_capture is clean: bb0 (main path) performs string creation, closure setup, indirect call, ori_str_rc_dec (runtime handles SSO check), then branches on env null check to rc_dec.do or rc_dec.skip. Compared to the previous inline SSO check (5 blocks), the ori_str_rc_dec runtime call consolidation is an improvement — fewer blocks, same semantics. The partial_0_drop uses a select instruction for branchless SSO handling — more efficient than branching.

5. Overflow Checking

Status: PASS

Operation | Checked | Correct | Notes
add (str lengths) | YES | YES | Uses llvm.sadd.with.overflow.i64

The addition of two string lengths uses checked overflow. While string lengths cannot realistically overflow i64, this is correct safety behavior.
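
For readers less familiar with the intrinsic, the same check-then-branch shape can be written with Rust's `i64::overflowing_add`, which likewise returns the wrapped sum together with an overflow flag. `checked_len_sum` is a hypothetical name for this sketch, and it returns an error where the generated IR calls `ori_panic_cstr`.

```rust
/// Mirrors the shape of `llvm.sadd.with.overflow.i64`: compute the wrapped
/// sum plus an overflow flag, then branch on the flag.
pub fn checked_len_sum(a: i64, b: i64) -> Result<i64, &'static str> {
    let (value, overflowed) = a.overflowing_add(b); // analogous to { i64, i1 }
    if overflowed {
        // The IR takes the add.ovf_panic block here and never returns.
        Err("integer overflow on addition")
    } else {
        Ok(value)
    }
}
```

With the journey's inputs, `checked_len_sum(5, 5)` takes the non-overflow path; only operands near the `i64` limits reach the error branch.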

6. Binary Analysis

Metric | Value
Binary size | 6.34 MiB (debug)
.text section | 891 KiB
.rodata section | 134 KiB
User code | ~350 bytes (6 functions)
Runtime | >99% of binary

Disassembly: @__lambda_check_capture_0

_ori___lambda_check_capture_0:
  sub    $0x18,%rsp
  mov    %rsi,(%rsp)
  call   ori_str_len               ; prefix.length()
  mov    (%rsp),%rdi
  mov    %rax,0x8(%rsp)
  call   ori_str_len               ; s.length()
  mov    %rax,%rcx
  mov    0x8(%rsp),%rax
  add    %rcx,%rax                 ; 5 + 5
  jo     .panic                    ; overflow check
  add    $0x18,%rsp
  ret

Disassembly: @partial_1

_ori_partial_1:
  push   %rax
  add    $0x8,%rdi                 ; GEP past drop_fn to captured str
  call   _ori___lambda_check_capture_0
  pop    %rcx
  ret

7. Optimal IR Comparison

@check_capture: Ideal vs Actual

; IDEAL (28 instructions -- same as actual)
; Every instruction serves a purpose:
; - 3 alloca (sret tmp for "hello", sret tmp for "world", icall arg tmp)
; - 2 call ori_str_from_raw (creating "hello" and "world")
; - 2 load (str triples from sret allocas)
; - 1 call ori_rc_alloc (closure env)
; - 2 GEP (drop_fn + captured str into env)
; - 3 store (drop_fn, captured str, world str arg)
; - 1 insertvalue (partial_apply pair)
; - 4 extractvalue (fn_ptr, env_ptr from pair; fat_data, fat_cap from str)
; - 1 indirect call (closure invocation)
; - 1 call ori_str_rc_dec (SSO-aware str cleanup in runtime)
; - 1 extractvalue + 1 ptrtoint + 1 icmp + 1 br (env null check)
; - 1 load + 1 call ori_rc_dec + 1 br (env teardown path)
; - 1 ret

Delta: +0 instructions. The ori_str_rc_dec runtime call replaces the previous inline SSO-check sequence, saving 6 instructions and 2 blocks.

@__lambda_check_capture_0: Ideal vs Actual

; IDEAL (9 instructions)
define fastcc noundef i64 @_ori___lambda_check_capture_0(
    ptr noundef nonnull dereferenceable(24) %0,
    ptr noundef nonnull dereferenceable(24) %1) nounwind {
  %str.len = call i64 @ori_str_len(ptr %0)
  %str.len1 = call i64 @ori_str_len(ptr %1)
  %add = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %str.len, i64 %str.len1)
  %add.val = extractvalue { i64, i1 } %add, 0
  %add.ovf = extractvalue { i64, i1 } %add, 1
  br i1 %add.ovf, label %add.ovf_panic, label %add.ok
add.ok:
  ret i64 %add.val
add.ovf_panic:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable
}
; ACTUAL (10 instructions -- +1 dead load)
; Includes: %param.load = load { i64, i64, ptr }, ptr %1, align 8  (DEAD)
; Remaining 9 instructions identical to ideal

Delta: +1 instruction (dead param.load — not harmful, removed by LLVM opt)

@partial_0_drop: Ideal vs Actual

; IDEAL (12 instructions -- same as actual)
; SSO-check + select + unconditional rc_dec + rc_free is correct and tight

Delta: +0 instructions

@_ori_drop$3: Ideal vs Actual

; IDEAL (2 instructions -- same as actual)
; rc_free + ret -- minimal str data drop

Delta: +0 instructions

@partial_1: Ideal vs Actual

; IDEAL (3 instructions -- same as actual)
; GEP + call + ret -- minimal forwarding shim

Delta: +0 instructions

Module Summary

Function | Ideal | Actual | Delta | Justified | Verdict
@check_capture | 28 | 28 | +0 | N/A | OPTIMAL
@main | 2 | 2 | +0 | N/A | OPTIMAL
@__lambda | 9 | 10 | +1 | NO (dead load) | NEAR-OPTIMAL
@partial_0_drop | 12 | 12 | +0 | N/A | OPTIMAL
@_ori_drop$3 | 2 | 2 | +0 | N/A | OPTIMAL
@partial_1 | 3 | 3 | +0 | N/A | OPTIMAL

8. Closures: Fat Pointer Capture

The closure captures prefix: str, a fat pointer represented as { i64, i64, ptr } (len, cap, data_ptr). The capture flow is:

  1. Create string: ori_str_from_raw writes the str triple to stack via sret
  2. Allocate env: ori_rc_alloc(32, 8) — 8 bytes for drop_fn + 24 bytes for str triple
  3. Store drop_fn: GEP to field 0, store @_ori_partial_0_drop
  4. Store captured str: GEP to field 1, store the full { i64, i64, ptr } triple
  5. Create pair: insertvalue builds { fn_ptr, env_ptr }

The env layout { ptr, { i64, i64, ptr } } is clean — drop_fn at offset 0 (consistent convention), captured data immediately following. The 32-byte allocation is exactly right: 8 (drop_fn) + 8 (len) + 8 (cap) + 8 (data_ptr) = 32.
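
The 32-byte figure can be checked with a Rust mirror of the two IR types. The struct and function names below are hypothetical, and the numbers assume a 64-bit target where pointers are 8 bytes, as in the generated code above.

```rust
// Hypothetical Rust mirror of the IR type { i64, i64, ptr }.
#[repr(C)]
pub struct StrTriple {
    pub len: u64,
    pub cap: u64,
    pub data: *const u8,
}

// Hypothetical mirror of the env type { ptr, { i64, i64, ptr } }.
// Option<fn> is pointer-sized, so it stands in for the drop_fn slot.
#[repr(C)]
pub struct ClosureEnv {
    pub drop_fn: Option<unsafe extern "C" fn(*mut ClosureEnv)>,
    pub captured: StrTriple,
}

/// Returns (size of the str triple, size of the env, offset of the capture).
pub fn env_layout() -> (usize, usize, usize) {
    let env = ClosureEnv {
        drop_fn: None,
        captured: StrTriple { len: 0, cap: 0, data: std::ptr::null() },
    };
    // Offset of `captured` measured as an address difference on a live value.
    let base = &env as *const ClosureEnv as usize;
    let field = &env.captured as *const StrTriple as usize;
    (
        std::mem::size_of::<StrTriple>(),
        std::mem::size_of::<ClosureEnv>(),
        field - base,
    )
}
```

On a 64-bit target this gives 24 bytes for the triple, 32 bytes for the env, and offset 8 for the capture, matching the `add $0x8,%rdi` in `_ori_partial_1`.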

9. Closures: SSO-Aware Cleanup

The string RC dec now uses two distinct strategies, split between the hot path and the cold drop path:

  • check_capture (for “world” str): Uses ori_str_rc_dec(data, cap, drop_fn) — a runtime function that handles the SSO/null check internally. This is cleaner than the previous inline approach: fewer IR instructions, single function call, same semantics. The runtime function has memory(inaccessiblemem: readwrite) — correctly indicating it only touches RC metadata.

  • partial_0_drop (for captured “hello” str): Uses a select (select i1 %skip, ptr null, ptr %data) to conditionally null out the pointer, then always calls ori_rc_dec. The runtime handles null gracefully. This is the correct approach for cold drop paths: branchless and simple.

The split is intentional: hot-path str cleanup uses a dedicated runtime function (ori_str_rc_dec), while cold-path env drop uses the generic select + ori_rc_dec pattern. Both are correct.
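
The select logic in partial_0_drop can be restated as a small Rust function. This is a sketch of the IR's pointer test only: `SSO_FLAG` and `rc_dec_target` are hypothetical names, pointers are modelled as `u64`, and the bit-63 convention is read off the mask in the IR rather than from runtime documentation.

```rust
/// Bit 63 of the data pointer is the flag tested in the IR above
/// (the mask -9223372036854775808 is 0x8000_0000_0000_0000 as an i64).
pub const SSO_FLAG: u64 = 1 << 63;

/// Returns the pointer to hand to the RC decrement, or 0 (null) when the
/// string is SSO or null and the decrement should be a no-op.
pub fn rc_dec_target(data_ptr: u64) -> u64 {
    let is_sso = (data_ptr & SSO_FLAG) != 0;
    let is_null = data_ptr == 0;
    let skip_rc = is_sso | is_null; // non-short-circuit `or`, as in the IR
    if skip_rc { 0 } else { data_ptr }
}
```

A heap pointer passes through unchanged, while an SSO-tagged or null value becomes null, so the single unconditional decrement call that follows is safe in every case.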

10. Closures: Calling Convention

The indirect call convention is well-designed:

  • partial_1 (forwarding shim): Receives (env_ptr, arg_ptr), GEPs past the drop_fn to the captured str pointer, calls the lambda with (captured_str_ptr, arg_ptr). This is a 3-instruction shim — minimal overhead.

  • Lambda: Takes two ptr parameters (both nonnull dereferenceable(24)), calls ori_str_len on each. The lambda borrows both strings — no RC ops needed.

  • Argument passing: The “world” string is passed by pointer via icall.arg.tmp alloca. This avoids the aggregate-by-value issue that can cause problems with LLVM’s FastISel in JIT mode.
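
To compare the indirect convention with a direct call, the sketch below models the { fn_ptr, env_ptr } pair and the forwarding shim in safe Rust. All names are hypothetical, `String` stands in for the str triple, and `Box` replaces the reference-counted env and its drop_fn, so this shows only the call path and not the teardown.

```rust
// Hypothetical model of the closure environment; `String` stands in for
// the captured str triple.
pub struct Env {
    pub captured: String,
}

/// The lambda body: borrows the captured string and the argument.
pub fn lambda(captured: &str, arg: &str) -> i64 {
    (captured.len() + arg.len()) as i64
}

/// Forwarding shim: receives (env, arg), projects the capture out of the
/// environment, and forwards to the lambda. Plays the role of `partial_1`.
pub fn shim(env: &Env, arg: &str) -> i64 {
    lambda(&env.captured, arg)
}

/// A closure value as a (fn_ptr, env_ptr) pair.
pub struct Closure {
    pub fn_ptr: fn(&Env, &str) -> i64,
    pub env: Box<Env>,
}

/// Build the pair: allocate the env, move the capture in, store the shim.
pub fn make_closure(prefix: String) -> Closure {
    Closure { fn_ptr: shim, env: Box::new(Env { captured: prefix }) }
}

/// Indirect call through the pair, as `check_capture` does for f("world").
pub fn call_closure(c: &Closure, arg: &str) -> i64 {
    (c.fn_ptr)(&*c.env, arg)
}
```

A direct call `lambda("hello", "world")` and the indirect `call_closure(&make_closure("hello".to_string()), "world")` return the same value; the only extra work on the indirect path is the shim's projection from env to capture.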

Findings

# | Severity | Category | Description | Status | First Seen
1 | LOW | IR Quality | Dead param.load in lambda | CONFIRMED | J17
2 | NOTE | ARC | Excellent borrow elision on lambda parameters | CONFIRMED | J17
3 | NOTE | Attributes | 100% attribute compliance, correct cold on drop fns | CONFIRMED | J17
4 | NOTE | Closures | Clean env layout, correct SSO-aware cleanup | CONFIRMED | J17
5 | NOTE | Codegen | str RC dec moved to ori_str_rc_dec runtime call (6 fewer IR instructions) | NEW | J17

LOW-1: Dead param.load in lambda

Location: @_ori___lambda_check_capture_0, first instruction
Impact: 1 unnecessary 24-byte load in debug mode; removed by LLVM optimization passes
Fix: Skip emitting the parameter load when the aggregate value is not needed (only pointer used)
First seen: Journey 17
Found in: Instruction Purity (Category 1), Optimal IR Comparison (Category 7)

NOTE-2: Excellent borrow elision on lambda parameters

Location: @_ori___lambda_check_capture_0
Impact: Positive. Both the captured prefix and the parameter s are borrowed (passed by pointer), avoiding 2 rc_inc + 2 rc_dec operations per call. The str data is never copied.
Found in: ARC Purity (Category 2)

NOTE-3: 100% attribute compliance

Location: All functions
Impact: Positive. nounwind on all user functions, cold on drop paths (partial_0_drop and _ori_drop$3), noundef on return values, nonnull dereferenceable(24) on str parameters, fastcc on internal functions, C calling convention on main and closure shims (required for indirect calls). memory(inaccessiblemem: readwrite) on ori_rc_dec and ori_str_rc_dec.
Found in: Attributes & Calling Convention (Category 3)

NOTE-4: Clean closure environment design

Location: Closure env layout and lifecycle
Impact: Positive. The env is exactly 32 bytes (no padding waste), drop_fn at offset 0 enables uniform cleanup, SSO-aware string cleanup prevents RC operations on small strings, and ownership transfer of the captured str eliminates redundant RC operations.
Found in: Closures: Fat Pointer Capture (Category 8)

NOTE-5: Runtime-consolidated str RC dec

Location: @_ori_check_capture, str cleanup for “world”
Impact: Positive. The ori_str_rc_dec(data, cap, drop_fn) runtime call replaces what was previously an inline SSO-check sequence (6 instructions + 2 extra blocks). This reduces IR complexity while maintaining identical semantics. The memory(inaccessiblemem: readwrite) attribute allows LLVM to optimize around the call.
Found in: Closures: SSO-Aware Cleanup (Category 9)

Codegen Quality Score

Category | Weight | Score | Notes
Instruction Efficiency | 15% | 10/10 | 1.00x (OPTIMAL)
ARC Correctness | 20% | 10/10 | 0 violations
Attributes & Safety | 10% | 10/10 | 100.0% compliance
Control Flow | 10% | 10/10 | 0 defects
IR Quality | 20% | 10/10 | 0 unjustified instructions
Binary Quality | 10% | 10/10 | 0 defects
Other Findings | 15% | 9/10 | 1 low

Overall: 9.8 / 10
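
The report does not state how the overall figure is derived. Assuming it is a plain weighted average of the category scores in the table above, the arithmetic can be reproduced as follows; `CATEGORIES` and `weighted_score_x100` are hypothetical names, and the result is kept as an integer scaled by 100 to avoid floating-point rounding.

```rust
/// (weight in percent, score out of 10) for each category in the table above.
pub const CATEGORIES: [(u32, u32); 7] = [
    (15, 10), // Instruction Efficiency
    (20, 10), // ARC Correctness
    (10, 10), // Attributes & Safety
    (10, 10), // Control Flow
    (20, 10), // IR Quality
    (10, 10), // Binary Quality
    (15, 9),  // Other Findings
];

/// Sum of all weights, expected to be 100 percent.
pub fn total_weight() -> u32 {
    CATEGORIES.iter().map(|(w, _)| w).sum()
}

/// Weighted score scaled by 100, so a return value of 985 means 9.85.
pub fn weighted_score_x100() -> u32 {
    CATEGORIES.iter().map(|(w, s)| w * s).sum()
}
```

Under that assumption the weighted average is 9.85, which is consistent with the displayed 9.8 if the report truncates to one decimal place; the truncation is a guess, not something the source confirms.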

Verdict

Journey 17’s fat-pointer closure capture produces near-perfect codegen. The closure environment layout is tight (32 bytes, zero padding), ownership transfer of the captured string into the environment is correct, and the lambda achieves full borrow elision on both parameters. The str RC dec for the “world” argument now uses the consolidated ori_str_rc_dec runtime call (down from inline SSO-check blocks), reducing check_capture from 34 to 28 instructions and from 5 to 3 blocks. The only blemish is a dead param.load instruction in the lambda body, which LLVM optimization will eliminate. This journey validates that the compiler correctly handles fat pointer capture — a critical feature intersection that previously caused crashes.

Cross-Journey Observations

Feature | First Tested | This Journey | Status
Closure capture | J5 | J17 | CONFIRMED
fastcc on internal fns | J1 | J17 | CONFIRMED
nounwind on user fns | J5 | J17 | CONFIRMED
Overflow checking | J1 | J17 | CONFIRMED
SSO-aware RC dec | J9 | J17 | CONFIRMED
Fat pointer in closures | N/A | J17 | CONFIRMED
ori_str_rc_dec runtime | N/A | J17 | NEW

Journey 5 captured an int (scalar) — no ARC needed for the capture. Journey 17 captures a str (fat pointer) — requiring heap allocation for the environment, ownership transfer, and SSO-aware cleanup. The ori_str_rc_dec consolidation (new since last analysis) reduces IR complexity: the SSO check is now handled by the runtime rather than being inlined at every str cleanup site. This is a positive architectural improvement that benefits all str-using code paths.