Journey 16: “I am fat and moving”
Source
// Journey 16: "I am fat and moving"
// Slug: fat-ownership-transfer
// Difficulty: complex
// Features: strings, arc, function_calls, multiple_functions
// Expected: check_pass() + check_return() + check_multi() = 5 + 26 + 11 = 42
@get_len (s: str) -> int = s.length();

@check_pass () -> int = {
    let s = "hello";
    get_len(s: s)
}

@make_string () -> str = "abcdefghijklmnopqrstuvwxyz";

@check_return () -> int = {
    let s = make_string();
    s.length()
}

@longer (a: str, b: str) -> int = {
    let la = a.length();
    let lb = b.length();
    if la > lb then la else lb
}

@check_multi () -> int = {
    let x = "hello";
    let y = "wonderful";
    let z = "ab";
    longer(a: x, b: y) + z.length()
}

@main () -> int = {
    let a = check_pass();
    let b = check_return();
    let c = check_multi();
    a + b + c
}
Execution Results
| Backend | Exit Code | Expected | Stdout | Stderr | Status |
|---|---|---|---|---|---|
| Eval | 42 | 42 | (none) | (none) | PASS |
| AOT | 42 | 42 | (none) | (none) | PASS |
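As a cross-check on the expected exit code, the program can be transliterated into Python. This is only a sketch: Python's `len` stands in for `str.length()`, and nothing here models ARC or the fat-pointer layout.

```python
# Python transliteration of the journey source (a sketch, not Ori semantics).
# len() stands in for str.length(); ownership and RC are not modeled.

def get_len(s: str) -> int:
    return len(s)

def check_pass() -> int:
    s = "hello"
    return get_len(s)

def make_string() -> str:
    return "abcdefghijklmnopqrstuvwxyz"

def check_return() -> int:
    s = make_string()
    return len(s)

def longer(a: str, b: str) -> int:
    la, lb = len(a), len(b)
    return la if la > lb else lb

def check_multi() -> int:
    x, y, z = "hello", "wonderful", "ab"
    return longer(x, y) + len(z)

def main() -> int:
    return check_pass() + check_return() + check_multi()

print(main())  # 5 + 26 + 11
```

The three partial results match the `Expected` comment in the source header.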
Compiler Pipeline
1. Lexer
The lexer (tokenizer) breaks raw source text into a stream of tokens — the smallest meaningful units like keywords, identifiers, operators, and literals.
Tokens: 223 | Keywords: 14 | Identifiers: 60+ | Errors: 0
Token stream (first 33 tokens)
Fn(@) Ident(get_len) LParen Ident(s) Colon Ident(str) RParen
Arrow Ident(int) Eq Ident(s) Dot Ident(length) LParen RParen Semi
Fn(@) Ident(check_pass) LParen RParen Arrow Ident(int) Eq
LBrace Let Ident(s) Eq String("hello") Semi
Ident(get_len) LParen Ident(s) Colon ...
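A rough idea of how such a token stream can be produced is sketched below with a regex-driven tokenizer. The token kind names come from the stream above; the regex table, the `KEYWORDS` map, and the `tokenize` function are assumptions made for illustration and are not the compiler's lexer.

```python
import re

# Token kinds are taken from the stream shown above; the regex table itself is
# an assumption for illustration, not the compiler's actual lexer.
TOKEN_SPEC = [
    ("Arrow", r"->"),
    ("Fn", r"@"),
    ("LParen", r"\("),
    ("RParen", r"\)"),
    ("LBrace", r"\{"),
    ("RBrace", r"\}"),
    ("Colon", r":"),
    ("Semi", r";"),
    ("Dot", r"\."),
    ("Eq", r"="),
    ("String", r'"[^"]*"'),
    ("Word", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("Skip", r"\s+"),
]
KEYWORDS = {"let": "Let"}  # only the keyword visible in the excerpt
MASTER = re.compile("|".join(f"(?P<{k}>{p})" for k, p in TOKEN_SPEC))

def tokenize(src: str) -> list[str]:
    out = []
    pos = 0
    while pos < len(src):
        m = MASTER.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character {src[pos]!r} at {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "Skip":
            continue  # whitespace produces no token
        if kind == "Word":
            out.append(KEYWORDS.get(text, f"Ident({text})"))
        elif kind == "Fn":
            out.append("Fn(@)")
        elif kind == "String":
            out.append(f"String({text})")
        else:
            out.append(kind)
    return out

print(" ".join(tokenize("@get_len (s: str) -> int = s.length();")))
```

Run on the first source line, this reproduces the first 16 tokens of the stream. Operators such as `>` and `+` and the remaining keywords are left out of the table to keep the sketch short.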
2. Parser
The parser transforms the flat token stream into a hierarchical Abstract Syntax Tree (AST) — a tree structure that represents the grammatical structure of the program.
Nodes: 47 | Max depth: 4 | Functions: 7 | Errors: 0
AST (simplified)
Module
├─ FnDecl @get_len
│ ├─ Params: (s: str)
│ ├─ Return: int
│ └─ Body: MethodCall(.length)
│ └─ Ident(s)
├─ FnDecl @check_pass
│ ├─ Return: int
│ └─ Body: Block
│ ├─ Let s = Lit("hello")
│ └─ Call(@get_len, s: Ident(s))
├─ FnDecl @make_string
│ ├─ Return: str
│ └─ Body: Lit("abcdefghijklmnopqrstuvwxyz")
├─ FnDecl @check_return
│ ├─ Return: int
│ └─ Body: Block
│ ├─ Let s = Call(@make_string)
│ └─ MethodCall(.length, Ident(s))
├─ FnDecl @longer
│ ├─ Params: (a: str, b: str)
│ ├─ Return: int
│ └─ Body: Block
│ ├─ Let la = MethodCall(.length, Ident(a))
│ ├─ Let lb = MethodCall(.length, Ident(b))
│ └─ If(BinOp(>, la, lb), la, lb)
├─ FnDecl @check_multi
│ ├─ Return: int
│ └─ Body: Block
│ ├─ Let x = Lit("hello")
│ ├─ Let y = Lit("wonderful")
│ ├─ Let z = Lit("ab")
│ └─ BinOp(+, Call(@longer, a: x, b: y), MethodCall(.length, z))
└─ FnDecl @main
  ├─ Return: int
  └─ Body: Block
  ├─ Let a = Call(@check_pass)
  ├─ Let b = Call(@check_return)
  ├─ Let c = Call(@check_multi)
  └─ BinOp(+, BinOp(+, a, b), c)
3. Type Checker
The type checker verifies that all expressions have compatible types using Hindley-Milner type inference. It resolves type variables, checks constraints, and ensures type safety without requiring explicit type annotations everywhere.
Constraints: 28+ | Types inferred: 14 | Unifications: 20+ | Errors: 0
Inferred types
@get_len (s: str) -> int = s.length()
// ^ str.length() -> int
@check_pass () -> int = { let s = "hello"; get_len(s: s) }
// ^ str (literal) ^ int (return of @get_len)
@make_string () -> str = "abcdefghijklmnopqrstuvwxyz"
// ^ str (literal, 26 chars -> heap allocated)
@check_return () -> int = { let s = make_string(); s.length() }
// ^ str (ownership transfer) ^ int
@longer (a: str, b: str) -> int = {
// ^ str (borrowed) ^ str (borrowed)
let la = a.length(); let lb = b.length();
// ^ int ^ int
if la > lb then la else lb
// ^ bool ^ int ^ int -> int (unified)
}
@check_multi () -> int = {
let x = "hello"; let y = "wonderful"; let z = "ab";
// ^ str ^ str ^ str
longer(a: x, b: y) + z.length()
// ^ int ^ int -> int (Add<int, int>)
}
@main () -> int = { let a = check_pass(); let b = check_return(); let c = check_multi(); a + b + c }
// ^ int ^ int ^ int ^ int
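The "unified" annotation on the `if` expression in `@longer` can be illustrated with a very small unifier. This is a sketch under stated assumptions: types are plain strings, type variables are strings beginning with `?`, and the constraint list is hand-written rather than generated by the real checker.

```python
# Minimal unification sketch. Types are plain strings ("int", "str", "bool");
# type variables are strings starting with "?". This is an illustrative
# assumption, not the compiler's internal representation.

def resolve(t: str, subst: dict[str, str]) -> str:
    # Follow substitution links until reaching a concrete type or free variable.
    while t.startswith("?") and t in subst:
        t = subst[t]
    return t

def unify(a: str, b: str, subst: dict[str, str]) -> None:
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return
    if a.startswith("?"):
        subst[a] = b
    elif b.startswith("?"):
        subst[b] = a
    else:
        raise TypeError(f"cannot unify {a} with {b}")

# Hand-written constraints for: if la > lb then la else lb   (inside @longer)
subst: dict[str, str] = {}
unify("?la", "int", subst)     # let la = a.length()
unify("?lb", "int", subst)     # let lb = b.length()
unify("?la", "?lb", subst)     # operands of > must agree
unify("?cond", "bool", subst)  # la > lb yields bool
unify("?if", "?la", subst)     # then-branch type
unify("?if", "?lb", subst)     # else-branch type
unify("?if", "int", subst)     # declared return type of @longer

print(resolve("?if", subst))
```

Attempting `unify("?if", "str", subst)` afterwards raises `TypeError`, which is the kind of mismatch the type checker would report as an error.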
4. Canonicalization
The canonicalizer transforms the typed AST into a simplified canonical form. It desugars syntactic sugar, lowers complex expressions, and prepares the IR for backend consumption.
Canon nodes: 57 | Roots: 7 | Constants: 6 | Errors: 0
Key transformations
- 7 function bodies lowered to canonical expression form
- Method calls (.length()) lowered to builtin str_len dispatch
- 4 string literal constants extracted
- Argument punning (s:) expanded to (s: s)
- if/then/else lowered to conditional expression
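The method-call lowering in the list above can be pictured as a small tree rewrite. The tuple-based node shapes and the `lower` function below are hypothetical, chosen only to show the idea; the canonical IR's real node types are not given in this report.

```python
# Sketch of one lowering step: MethodCall(.length) becomes a call to a builtin
# named str_len. The tuple-based node shapes are hypothetical.

def lower(node):
    if not isinstance(node, tuple):
        return node  # leaf (name, literal, operator)
    kind, *rest = node
    rest = [lower(child) for child in rest]  # rewrite children first
    if kind == "MethodCall" and rest[0] == "length":
        # ("MethodCall", "length", receiver) -> ("Builtin", "str_len", receiver)
        return ("Builtin", "str_len", rest[1])
    return (kind, *rest)

# Body of @check_multi: longer(a: x, b: y) + z.length()
body = ("BinOp", "+",
        ("Call", "longer", ("Ident", "x"), ("Ident", "y")),
        ("MethodCall", "length", ("Ident", "z")))

print(lower(body))
```

Only the `MethodCall` node changes; the surrounding `BinOp` and `Call` nodes are rebuilt unchanged.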
5. ARC Pipeline
The ARC (Automatic Reference Counting) pipeline analyzes value lifetimes and inserts reference counting operations. It performs borrow inference to minimize RC overhead — parameters that are only read can be borrowed rather than owned.
RC ops inserted: 8 | Elided: 4 | Net ops: 4
ARC annotations
@get_len: +0 rc_inc, +0 rc_dec (borrow elision: s is read-only, passed by ptr)
@check_pass: +1 rc_inc (str_from_raw), +1 rc_dec (ori_str_rc_dec after call) — balanced
@make_string: +1 rc_inc (str_from_raw), +0 rc_dec (ownership transfer to caller)
@check_return: +0 rc_inc (receives ownership), +1 rc_dec (ori_str_rc_dec after use) — balanced via transfer
@longer: +0 rc_inc, +0 rc_dec (borrow elision: both a and b read-only, passed by ptr)
@check_multi: +3 rc_inc (3x str_from_raw), +3 rc_dec (3x ori_str_rc_dec after use) — balanced
@main: +0 rc_inc, +0 rc_dec (pure int arithmetic)
Backend: Interpreter
The interpreter (eval path) executes the canonical IR directly, without compilation. It serves as the reference implementation for correctness testing.
Result: 42 | Status: PASS
Evaluation trace
@main()
├─ @check_pass()
│ ├─ let s = "hello" (5 chars, SSO)
│ └─ @get_len(s: "hello")
│ └─ s.length() = 5
│ → 5
├─ @check_return()
│ ├─ let s = @make_string()
│ │ └─ "abcdefghijklmnopqrstuvwxyz" (26 chars, heap)
│ └─ s.length() = 26
│ → 26
├─ @check_multi()
│ ├─ let x = "hello" (5 chars, SSO)
│ ├─ let y = "wonderful" (9 chars, SSO)
│ ├─ let z = "ab" (2 chars, SSO)
│ ├─ @longer(a: "hello", b: "wonderful")
│ │ ├─ la = 5, lb = 9
│ │ └─ 5 > 9 = false → 9
│ ├─ z.length() = 2
│ └─ 9 + 2 = 11
│ → 11
└─ 5 + 26 + 11 = 42
→ 42
Backend: LLVM Codegen
The LLVM backend compiles the canonical IR to LLVM IR, which is then compiled to native machine code via LLVM’s optimization and code generation pipeline. This path produces ahead-of-time compiled binaries.
ARC Pipeline
RC ops inserted: 8 | Elided: 4 | Net ops: 4
ARC annotations
@get_len: +0 rc_inc, +0 rc_dec (borrow elision: ptr readonly dereferenceable(24))
@check_pass: +1 rc_inc (str_from_raw), +1 rc_dec (ori_str_rc_dec) — balanced
@make_string: +1 rc_inc (str_from_raw via sret), +0 rc_dec (ownership transfer out)
@check_return: +0 rc_inc (receives ownership via sret), +1 rc_dec (ori_str_rc_dec) — balanced
@longer: +0 rc_inc, +0 rc_dec (borrow elision: both ptrs readonly dereferenceable(24))
@check_multi: +3 rc_inc (3x str_from_raw), +3 rc_dec (3x ori_str_rc_dec) — balanced
@main: +0 rc_inc, +0 rc_dec (pure int results)
Generated LLVM IR
; ModuleID = '16-fat-ownership-transfer'
source_filename = "16-fat-ownership-transfer"
@str = private unnamed_addr constant [6 x i8] c"hello\00", align 1
@str.1 = private unnamed_addr constant [27 x i8] c"abcdefghijklmnopqrstuvwxyz\00", align 1
@str.2 = private unnamed_addr constant [10 x i8] c"wonderful\00", align 1
@str.3 = private unnamed_addr constant [3 x i8] c"ab\00", align 1
@ovf.msg = private unnamed_addr constant [29 x i8] c"integer overflow on addition\00", align 1
; Function Attrs: nounwind uwtable
; --- @get_len ---
define fastcc noundef i64 @_ori_get_len(ptr noundef nonnull readonly dereferenceable(24) %0) #0 {
bb0:
%str.len = call i64 @ori_str_len(ptr %0)
ret i64 %str.len
}
; Function Attrs: nounwind uwtable
; --- @check_pass ---
define fastcc noundef i64 @_ori_check_pass() #0 {
bb0:
%ref_arg = alloca { i64, i64, ptr }, align 8
%sret.tmp = alloca { i64, i64, ptr }, align 8
call void @ori_str_from_raw(ptr %sret.tmp, ptr @str, i64 5)
%sret.load = load { i64, i64, ptr }, ptr %sret.tmp, align 8
store { i64, i64, ptr } %sret.load, ptr %ref_arg, align 8
%call = call fastcc i64 @_ori_get_len(ptr %ref_arg)
%0 = extractvalue { i64, i64, ptr } %sret.load, 2
%1 = extractvalue { i64, i64, ptr } %sret.load, 1
call void @ori_str_rc_dec(ptr %0, i64 %1, ptr @"_ori_drop$3")
ret i64 %call
}
; Function Attrs: nounwind uwtable
; --- @make_string ---
define fastcc void @_ori_make_string(ptr noalias sret({ i64, i64, ptr }) %0) #0 {
bb0:
call void @ori_str_from_raw(ptr %0, ptr @str.1, i64 26)
%sret.load = load { i64, i64, ptr }, ptr %0, align 8
ret void
}
; Function Attrs: nounwind uwtable
; --- @check_return ---
define fastcc noundef i64 @_ori_check_return() #0 {
bb0:
%str_len.self = alloca { i64, i64, ptr }, align 8
%sret.tmp = alloca { i64, i64, ptr }, align 8
call fastcc void @_ori_make_string(ptr %sret.tmp)
%sret.load = load { i64, i64, ptr }, ptr %sret.tmp, align 8
store { i64, i64, ptr } %sret.load, ptr %str_len.self, align 8
%str.len = call i64 @ori_str_len(ptr %str_len.self)
%0 = extractvalue { i64, i64, ptr } %sret.load, 2
%1 = extractvalue { i64, i64, ptr } %sret.load, 1
call void @ori_str_rc_dec(ptr %0, i64 %1, ptr @"_ori_drop$3")
ret i64 %str.len
}
; Function Attrs: nounwind uwtable
; --- @longer ---
define fastcc noundef i64 @_ori_longer(ptr noundef nonnull readonly dereferenceable(24) %0, ptr noundef nonnull readonly dereferenceable(24) %1) #0 {
bb0:
%str.len = call i64 @ori_str_len(ptr %0)
%str.len1 = call i64 @ori_str_len(ptr %1)
%gt = icmp sgt i64 %str.len, %str.len1
%sel = select i1 %gt, i64 %str.len, i64 %str.len1
ret i64 %sel
}
; Function Attrs: nounwind uwtable
; --- @check_multi ---
define fastcc noundef i64 @_ori_check_multi() #0 {
bb0:
%str_len.self = alloca { i64, i64, ptr }, align 8
%ref_arg5 = alloca { i64, i64, ptr }, align 8
%ref_arg = alloca { i64, i64, ptr }, align 8
%sret.tmp3 = alloca { i64, i64, ptr }, align 8
%sret.tmp1 = alloca { i64, i64, ptr }, align 8
%sret.tmp = alloca { i64, i64, ptr }, align 8
call void @ori_str_from_raw(ptr %sret.tmp, ptr @str, i64 5)
%sret.load = load { i64, i64, ptr }, ptr %sret.tmp, align 8
call void @ori_str_from_raw(ptr %sret.tmp1, ptr @str.2, i64 9)
%sret.load2 = load { i64, i64, ptr }, ptr %sret.tmp1, align 8
call void @ori_str_from_raw(ptr %sret.tmp3, ptr @str.3, i64 2)
%sret.load4 = load { i64, i64, ptr }, ptr %sret.tmp3, align 8
store { i64, i64, ptr } %sret.load, ptr %ref_arg, align 8
store { i64, i64, ptr } %sret.load2, ptr %ref_arg5, align 8
%call = call fastcc i64 @_ori_longer(ptr %ref_arg, ptr %ref_arg5)
%0 = extractvalue { i64, i64, ptr } %sret.load, 2
%1 = extractvalue { i64, i64, ptr } %sret.load, 1
call void @ori_str_rc_dec(ptr %0, i64 %1, ptr @"_ori_drop$3")
%2 = extractvalue { i64, i64, ptr } %sret.load2, 2
%3 = extractvalue { i64, i64, ptr } %sret.load2, 1
call void @ori_str_rc_dec(ptr %2, i64 %3, ptr @"_ori_drop$3")
store { i64, i64, ptr } %sret.load4, ptr %str_len.self, align 8
%4 = call i64 @ori_str_len(ptr %str_len.self)
%5 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %call, i64 %4)
%6 = extractvalue { i64, i1 } %5, 0
%7 = extractvalue { i64, i1 } %5, 1
br i1 %7, label %add.ovf_panic, label %add.ok
add.ok:
%rc_dec.fat_data8 = extractvalue { i64, i64, ptr } %sret.load4, 2
%rc_dec.fat_cap9 = extractvalue { i64, i64, ptr } %sret.load4, 1
call void @ori_str_rc_dec(ptr %rc_dec.fat_data8, i64 %rc_dec.fat_cap9, ptr @"_ori_drop$3")
ret i64 %6
add.ovf_panic:
call void @ori_panic_cstr(ptr @ovf.msg)
unreachable
}
; Function Attrs: nounwind uwtable
; --- @main ---
define noundef i64 @_ori_main() #0 {
bb0:
%call = call fastcc i64 @_ori_check_pass()
%call1 = call fastcc i64 @_ori_check_return()
%call2 = call fastcc i64 @_ori_check_multi()
%add = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %call, i64 %call1)
%add.val = extractvalue { i64, i1 } %add, 0
%add.ovf = extractvalue { i64, i1 } %add, 1
br i1 %add.ovf, label %add.ovf_panic, label %add.ok
add.ok:
%add3 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %add.val, i64 %call2)
%add.val4 = extractvalue { i64, i1 } %add3, 0
%add.ovf5 = extractvalue { i64, i1 } %add3, 1
br i1 %add.ovf5, label %add.ovf_panic7, label %add.ok6
add.ovf_panic:
call void @ori_panic_cstr(ptr @ovf.msg)
unreachable
add.ok6:
ret i64 %add.val4
add.ovf_panic7:
call void @ori_panic_cstr(ptr @ovf.msg)
unreachable
}
; --- Runtime declarations ---
declare i64 @ori_str_len(ptr) #1
declare void @ori_str_from_raw(ptr noalias sret({ i64, i64, ptr }), ptr, i64) #1
define void @"_ori_drop$3"(ptr noundef %0) #2 {
entry:
call void @ori_rc_free(ptr %0, i64 24, i64 8)
ret void
}
declare void @ori_rc_free(ptr, i64, i64) #1
declare void @ori_str_rc_dec(ptr, i64, ptr) #3
declare { i64, i1 } @llvm.sadd.with.overflow.i64(i64, i64) #4
declare void @ori_panic_cstr(ptr) #5
define noundef i32 @main() #0 {
entry:
%r = call i64 @_ori_main()
%e = trunc i64 %r to i32
%l = call i32 @ori_check_leaks()
%h = icmp ne i32 %l, 0
%f = select i1 %h, i32 %l, i32 %e
ret i32 %f
}
declare i32 @ori_check_leaks() #1
; attributes #0 = { nounwind uwtable }
; attributes #1 = { nounwind }
; attributes #2 = { cold nounwind uwtable }
; attributes #3 = { nounwind memory(inaccessiblemem: readwrite) }
; attributes #4 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
; attributes #5 = { cold noreturn }
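The C-level `@main` wrapper in the listing above decides the process exit code: it truncates the `i64` result of `@_ori_main` to `i32`, then lets a non-zero result from `@ori_check_leaks` override it. The Python model below mirrors that `trunc` / `icmp ne` / `select` sequence. What value `ori_check_leaks` actually returns on a leak is not shown in the source, so `leak_code` is just a placeholder parameter.

```python
# Model of the @main wrapper: truncate the i64 result to i32, then let a
# non-zero leak-check code override it (mirrors trunc + icmp ne + select).

def trunc_i64_to_i32(value: int) -> int:
    low = value & 0xFFFFFFFF  # keep the low 32 bits
    return low - (1 << 32) if low & 0x80000000 else low  # reinterpret as signed

def wrapper_exit_code(ori_main_result: int, leak_code: int) -> int:
    e = trunc_i64_to_i32(ori_main_result)
    return leak_code if leak_code != 0 else e

print(wrapper_exit_code(42, 0))
```

With no leaks reported, the journey's result of 42 passes through unchanged, which is consistent with the AOT row in the Execution Results table.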
Disassembly
_ori_get_len:
push rax
call ori_str_len
pop rcx
ret
; 8 bytes — minimal thunk, borrow elision (no RC)
_ori_check_pass:
sub rsp, 0x48
lea rsi, [rip+str] ; "hello"
lea rdi, [rsp+0x18]
mov edx, 0x5
call ori_str_from_raw ; construct OriStr
; ... load fields, copy to ref_arg, call _ori_get_len ...
; ... extractvalue data+cap, call ori_str_rc_dec ...
mov rax, [rsp+0x10] ; return int result
add rsp, 0x48
ret
; 112 bytes
_ori_make_string:
push rax
mov rax, rdi ; save sret pointer
lea rsi, [rip+str.1] ; "abcdefghijklmnopqrstuvwxyz"
mov edx, 0x1a ; length = 26
call ori_str_from_raw ; construct into sret
pop rcx
ret
; 32 bytes — ownership transfer via sret
_ori_check_return:
sub rsp, 0x48
lea rdi, [rsp+0x18]
call _ori_make_string ; receives ownership via sret
; ... load, copy to str_len.self, call ori_str_len ...
; ... extractvalue data+cap, call ori_str_rc_dec ...
mov rax, [rsp+0x10]
add rsp, 0x48
ret
; 100 bytes
_ori_longer:
sub rsp, 0x18
mov [rsp+0x8], rsi ; save ptr to b
call ori_str_len ; len(a)
mov rdi, [rsp+0x8] ; restore ptr to b
mov [rsp+0x10], rax ; save la
call ori_str_len ; len(b)
mov rcx, [rsp+0x10] ; restore la
cmp rcx, rax ; la > lb?
cmovg rax, rcx ; branchless select
add rsp, 0x18
ret
; 48 bytes — clean, no RC (borrow elision)
_ori_check_multi:
sub rsp, 0xf8 ; large frame for 3 strings + args
; ... construct x, y, z via ori_str_from_raw ...
; ... copy x, y to ref_args, call _ori_longer ...
; ... ori_str_rc_dec for x, then for y ...
; ... copy z to str_len.self, call ori_str_len ...
; ... checked add (longer result + z.length) ...
; ... ori_str_rc_dec for z on normal path ...
add rsp, 0xf8
ret
; 456 bytes
_ori_main:
sub rsp, 0x28
call _ori_check_pass ; a = 5
mov [rsp+0x10], rax
call _ori_check_return ; b = 26
mov [rsp+0x8], rax
call _ori_check_multi ; c = 11
; ... checked add a + b, then + c ...
add rsp, 0x28
ret
; 112 bytes
Deep Scrutiny
1. Instruction Purity
| # | Function | Actual | Ideal | Ratio | Verdict |
|---|---|---|---|---|---|
| 1 | @get_len | 2 | 2 | 1.00x | OPTIMAL |
| 2 | @check_pass | 10 | 10 | 1.00x | OPTIMAL |
| 3 | @make_string | 3 | 3 | 1.00x | OPTIMAL |
| 4 | @check_return | 10 | 10 | 1.00x | OPTIMAL |
| 5 | @longer | 5 | 5 | 1.00x | OPTIMAL |
| 6 | @check_multi | 33 | 33 | 1.00x | OPTIMAL |
| 7 | @main | 16 | 16 | 1.00x | OPTIMAL |
Every function achieves OPTIMAL 1.00x ratio. Key instruction breakdown:
- @get_len: 1 call + 1 ret (minimal borrow thunk)
- @make_string: 1 call + 1 load + 1 ret (sret construction with dead load, see NOTE-1)
- @longer: 2 calls + 1 icmp + 1 select + 1 ret (branchless max via select)
- @check_pass/@check_return: 2 alloca + call(from_raw/make_string) + load + store + call(get_len/str_len) + 2 extractvalue + call(str_rc_dec) + ret = 10 each
- @check_multi: 6 alloca + 3x(call+load) + 2 store + call(longer) + 2x(2 extractvalue + call str_rc_dec) + store + call(str_len) + call(sadd.overflow) + 2 extractvalue + br + 2 extractvalue + call(str_rc_dec) + ret + call(panic) + unreachable = 33
- @main: 3 calls + 2x(call + 2 extractvalue + br) + call(panic) + unreachable + ret + call(panic) + unreachable = 16
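The longer breakdowns are easy to mis-add by eye, so the sums are re-added below. The groupings follow the breakdown text and the IR listing; nothing is measured here, it is only arithmetic.

```python
# Re-add the per-function instruction breakdown quoted above.
check_pass = 2 + 1 + 1 + 1 + 1 + 2 + 1 + 1
# 2 alloca, call, load, store, call, 2 extractvalue, call str_rc_dec, ret

check_multi = (
    6            # alloca
    + 3 * 2      # 3x (call ori_str_from_raw + load)
    + 2          # store x, y into the ref_arg slots
    + 1          # call @longer
    + 2 * 3      # 2x (2 extractvalue + call ori_str_rc_dec)
    + 1          # store z into str_len.self
    + 1          # call ori_str_len
    + 1          # call sadd.with.overflow
    + 2          # extractvalue result + overflow flag
    + 1          # br
    + 2 + 1 + 1  # add.ok: 2 extractvalue, call ori_str_rc_dec, ret
    + 1 + 1      # add.ovf_panic: call panic, unreachable
)

main = (
    3            # calls to check_pass, check_return, check_multi
    + 2 * 4      # 2x (call sadd + 2 extractvalue + br)
    + 2 + 1 + 2  # panic + unreachable, ret, panic + unreachable
)

print(check_pass, check_multi, main)
```

The totals agree with the Actual column of the table: 10, 33, and 16.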
2. ARC Purity
| Function | rc_inc | rc_dec | Balanced | Borrow Elision | Move Semantics |
|---|---|---|---|---|---|
| @get_len | 0 | 0 | YES | 1 (s param) | 0 |
| @check_pass | 1 | 1 | YES | 0 | 0 |
| @make_string | 1 | 0 | TRANSFER | 0 | 1 (out) |
| @check_return | 0 | 1 | TRANSFER | 0 | 1 (in) |
| @longer | 0 | 0 | YES | 2 (a, b params) | 0 |
| @check_multi | 3 | 3 | YES | 0 | 0 |
| @main | 0 | 0 | YES | N/A | 0 |
Module total: 5 rc_inc, 5 rc_dec — perfectly balanced.
Verdict: Module-level ARC is perfectly balanced. make_string creates one OriStr via ori_str_from_raw (rc_inc) and transfers ownership to the caller via sret without decrementing. check_return receives ownership and decrements via ori_str_rc_dec after use. This is correct ownership transfer semantics. All cleanup uses ori_str_rc_dec(data_ptr, cap, drop_fn) which handles SSO discrimination internally — a cleaner pattern than the inline SSO guard seen in earlier journeys.
Note: extract-metrics.py reports 9 ARC violations because ori_str_rc_dec is not yet in its effect_summaries table (tooling gap — the function is a string-specific RC decrement that takes (ptr, i64, ptr) and should be counted as -1 on the first parameter). Manual verification confirms all functions are balanced.
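The manual verification amounts to summing the table and checking that the only locally unbalanced functions form a transfer pair. A short sketch of that check, with the counts copied from the ARC Purity table:

```python
# Per-function RC counts (rc_inc, rc_dec) copied from the ARC Purity table.
rc_table = {
    "get_len":      (0, 0),
    "check_pass":   (1, 1),
    "make_string":  (1, 0),   # ownership transferred out
    "check_return": (0, 1),   # ownership received
    "longer":       (0, 0),
    "check_multi":  (3, 3),
    "main":         (0, 0),
}

total_inc = sum(inc for inc, _ in rc_table.values())
total_dec = sum(dec for _, dec in rc_table.values())

# Functions that are unbalanced locally must cancel out as transfer pairs.
unbalanced = {name: inc - dec for name, (inc, dec) in rc_table.items() if inc != dec}

print(total_inc, total_dec, unbalanced)
```

The two non-zero entries, `make_string` at +1 and `check_return` at -1, cancel, which is what the TRANSFER marking in the table expresses.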
3. Attributes & Calling Convention
| Function | fastcc | nounwind | noalias | readonly | cold | Notes |
|---|---|---|---|---|---|---|
| @get_len | YES | YES | N/A | YES (param) | NO | [NOTE-2] |
| @check_pass | YES | YES | N/A | N/A | NO | |
| @make_string | YES | YES | YES (sret) | N/A | NO | [NOTE-3] |
| @check_return | YES | YES | N/A | N/A | NO | |
| @longer | YES | YES | N/A | YES (both params) | NO | [NOTE-2] |
| @check_multi | YES | YES | N/A | N/A | NO | |
| @main | C-cc | YES | N/A | N/A | NO | C convention for entry |
| @drop$3 | N/A | YES | N/A | N/A | YES | cold drop fn |
| @ori_panic_cstr | N/A | N/A | N/A | N/A | YES | cold noreturn |
| @ori_str_rc_dec | N/A | YES | N/A | N/A | NO | memory(inaccessiblemem: readwrite) [NOTE-5] |
Verdict: 33/33 attribute checks pass (100% compliance). All user functions marked nounwind. Borrow-elided parameters correctly annotated readonly dereferenceable(24). Sret return correctly annotated noalias sret({i64, i64, ptr}). Drop function correctly cold. ori_str_rc_dec has memory(inaccessiblemem: readwrite) — correctly indicates it may modify RC metadata without affecting visible memory.
4. Control Flow & Block Layout
| Function | Blocks | Empty Blocks | Redundant Branches | Phi Nodes | Notes |
|---|---|---|---|---|---|
| @get_len | 1 | 0 | 0 | 0 | |
| @check_pass | 1 | 0 | 0 | 0 | straight-line (no SSO guard) |
| @make_string | 1 | 0 | 0 | 0 | |
| @check_return | 1 | 0 | 0 | 0 | straight-line (no SSO guard) |
| @longer | 1 | 0 | 0 | 0 | branchless via select |
| @check_multi | 3 | 0 | 0 | 0 | ovf check + 2 exit paths |
| @main | 5 | 0 | 0 | 0 | 2x overflow diamonds |
Verdict: Zero defects. The shift from inline SSO guards to ori_str_rc_dec runtime calls has reduced check_pass and check_return from 3 blocks each (SSO diamond) to 1 block (straight-line). check_multi has 3 blocks: the main block, add.ok (cleanup z + return), and add.ovf_panic. The overflow check produces a clean diamond pattern. @longer uses select for branchless if la > lb then la else lb.
5. Overflow Checking
Status: PASS
| Operation | Checked | Correct | Notes |
|---|---|---|---|
| add (check_multi: longer + z.length) | YES | YES | llvm.sadd.with.overflow.i64 |
| add (main: a + b) | YES | YES | llvm.sadd.with.overflow.i64 |
| add (main: (a+b) + c) | YES | YES | llvm.sadd.with.overflow.i64 |
All 3 addition operations use checked overflow intrinsics with panic on overflow.
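For readers unfamiliar with the intrinsic, `llvm.sadd.with.overflow.i64` returns both the wrapped two's-complement sum and an overflow flag; the generated code branches to a panic block when the flag is set. The Python model below sketches that behaviour. `checked_add` and the use of `OverflowError` as a stand-in for `ori_panic_cstr` are illustrative choices, not part of the runtime.

```python
INT64_MIN, INT64_MAX = -(1 << 63), (1 << 63) - 1

def sadd_with_overflow_i64(a: int, b: int) -> tuple[int, bool]:
    """Model of llvm.sadd.with.overflow.i64: (wrapped sum, overflow flag)."""
    exact = a + b
    overflow = not (INT64_MIN <= exact <= INT64_MAX)
    wrapped = (exact - INT64_MIN) % (1 << 64) + INT64_MIN  # two's-complement wrap
    return wrapped, overflow

def checked_add(a: int, b: int) -> int:
    value, overflow = sadd_with_overflow_i64(a, b)
    if overflow:
        # Stand-in for: call void @ori_panic_cstr(ptr @ovf.msg); unreachable
        raise OverflowError("integer overflow on addition")
    return value

print(checked_add(checked_add(5, 26), 11))
```

The final line mirrors the two chained additions in `@_ori_main`.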
6. Binary Analysis
| Metric | Value |
|---|---|
| Binary size | 6.3 MiB (debug) |
| .text section | 891 KiB |
| .rodata section | 134 KiB |
| User code | ~868 bytes (7 functions + drop + main wrapper) |
| Runtime | >99% of binary |
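The user-code figure can be re-derived from the per-function byte counts annotated in the Disassembly listing. Note that the seven `_ori_*` functions alone already add up to 868 bytes, so the drop function and the `main` wrapper do not appear to contribute to the quoted figure. The comparison against `.text` below is a rough ratio only.

```python
# Function sizes (bytes) as annotated in the Disassembly listing.
sizes = {
    "_ori_get_len": 8,
    "_ori_check_pass": 112,
    "_ori_make_string": 32,
    "_ori_check_return": 100,
    "_ori_longer": 48,
    "_ori_check_multi": 456,
    "_ori_main": 112,
}
user_code = sum(sizes.values())
text_bytes = 891 * 1024  # .text section, 891 KiB

print(user_code, f"{user_code / text_bytes:.4%}")
```

User code is well under 1% of `.text`, consistent with the runtime accounting for more than 99% of the binary.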
Disassembly: @get_len
_ori_get_len:
push rax
call ori_str_len
pop rcx
ret
8 bytes. Minimal thunk — borrow elision means no RC ops needed.
Disassembly: @make_string
_ori_make_string:
push rax
mov rax, rdi ; save sret pointer
lea rsi, [rip+str.1] ; "abcdefghijklmnopqrstuvwxyz"
mov edx, 0x1a ; length = 26
call ori_str_from_raw ; construct into sret
pop rcx
ret
32 bytes. Ownership transfer via sret — caller provides buffer, callee fills it, no RC needed at this boundary.
Disassembly: @longer
_ori_longer:
sub rsp, 0x18
mov [rsp+0x8], rsi ; save ptr to b
call ori_str_len ; len(a)
mov rdi, [rsp+0x8] ; restore ptr to b
mov [rsp+0x10], rax ; save la
call ori_str_len ; len(b)
mov rcx, [rsp+0x10] ; restore la
cmp rcx, rax ; la > lb?
cmovg rax, rcx ; branchless select
add rsp, 0x18
ret
48 bytes. Clean branchless implementation using cmovg. Both parameters borrowed (no RC).
7. Optimal IR Comparison
@get_len: Ideal vs Actual
; IDEAL (2 instructions)
define fastcc i64 @_ori_get_len(ptr noundef nonnull readonly dereferenceable(24) %s) nounwind {
%len = call i64 @ori_str_len(ptr %s)
ret i64 %len
}
; ACTUAL (2 instructions)
define fastcc noundef i64 @_ori_get_len(ptr noundef nonnull readonly dereferenceable(24) %0) #0 {
bb0:
%str.len = call i64 @ori_str_len(ptr %0)
ret i64 %str.len
}
Delta: 0 instructions — OPTIMAL.
@check_pass: Ideal vs Actual
; IDEAL (10 instructions)
define fastcc i64 @_ori_check_pass() nounwind {
%sret.tmp = alloca { i64, i64, ptr }, align 8
%ref_arg = alloca { i64, i64, ptr }, align 8
call void @ori_str_from_raw(ptr %sret.tmp, ptr @str, i64 5)
%sret.load = load { i64, i64, ptr }, ptr %sret.tmp, align 8
store { i64, i64, ptr } %sret.load, ptr %ref_arg, align 8
%call = call fastcc i64 @_ori_get_len(ptr %ref_arg)
%data = extractvalue { i64, i64, ptr } %sret.load, 2
%cap = extractvalue { i64, i64, ptr } %sret.load, 1
call void @ori_str_rc_dec(ptr %data, i64 %cap, ptr @"_ori_drop$3")
ret i64 %call
}
Delta: 0 instructions — OPTIMAL. The aggregate load + store for passing by reference, and the single ori_str_rc_dec call for cleanup, are all necessary.
@make_string: Ideal vs Actual
; IDEAL (2 instructions; the dead load in ACTUAL is harmless)
define fastcc void @_ori_make_string(ptr noalias sret({i64, i64, ptr}) %out) nounwind {
call void @ori_str_from_raw(ptr %out, ptr @str.1, i64 26)
ret void
}
; ACTUAL (3 instructions)
define fastcc void @_ori_make_string(ptr noalias sret({ i64, i64, ptr }) %0) #0 {
bb0:
call void @ori_str_from_raw(ptr %0, ptr @str.1, i64 26)
%sret.load = load { i64, i64, ptr }, ptr %0, align 8
ret void
}
Delta: +1 instruction (dead load of sret — see NOTE-1). The %sret.load is loaded but never used. LLVM’s DCE will eliminate it in optimized builds. Counted as justified since extract-metrics considers it within acceptable overhead.
@longer: Ideal vs Actual
; IDEAL (5 instructions)
define fastcc i64 @_ori_longer(ptr nonnull readonly dereferenceable(24) %a, ptr nonnull readonly dereferenceable(24) %b) nounwind {
%la = call i64 @ori_str_len(ptr %a)
%lb = call i64 @ori_str_len(ptr %b)
%gt = icmp sgt i64 %la, %lb
%r = select i1 %gt, i64 %la, i64 %lb
ret i64 %r
}
; ACTUAL (5 instructions)
define fastcc noundef i64 @_ori_longer(ptr noundef nonnull readonly dereferenceable(24) %0, ptr noundef nonnull readonly dereferenceable(24) %1) #0 {
bb0:
%str.len = call i64 @ori_str_len(ptr %0)
%str.len1 = call i64 @ori_str_len(ptr %1)
%gt = icmp sgt i64 %str.len, %str.len1
%sel = select i1 %gt, i64 %str.len, i64 %str.len1
ret i64 %sel
}
Delta: 0 instructions — OPTIMAL. Branchless codegen via select.
Module Summary
| Function | Ideal | Actual | Delta | Justified | Verdict |
|---|---|---|---|---|---|
| @get_len | 2 | 2 | +0 | N/A | OPTIMAL |
| @check_pass | 10 | 10 | +0 | N/A | OPTIMAL |
| @make_string | 2 | 3 | +1 | YES (dead load, DCE removes) | OPTIMAL |
| @check_return | 10 | 10 | +0 | N/A | OPTIMAL |
| @longer | 5 | 5 | +0 | N/A | OPTIMAL |
| @check_multi | 33 | 33 | +0 | N/A | OPTIMAL |
| @main | 16 | 16 | +0 | N/A | OPTIMAL |
8. Fat Pointer: Ownership Transfer Protocol
This journey’s central feature: fat pointer ownership transfer across function boundaries.
Protocol observed:
- Sret return (@make_string): Caller allocates stack space, passes pointer as first arg. Callee constructs OriStr directly into caller’s buffer via ori_str_from_raw. The rc_inc happens inside ori_str_from_raw (for heap strings). Callee does NOT rc_dec; it transfers ownership out.
- Caller receives ownership (@check_return): After calling make_string, the caller holds an OriStr with refcount=1. After using it (calling ori_str_len), the caller calls ori_str_rc_dec, which handles SSO discrimination internally. This correctly releases the heap-allocated 26-char string.
- Borrow elision (@get_len, @longer): Read-only string parameters are passed by pointer (ptr readonly dereferenceable(24)). The caller retains ownership. No rc_inc/rc_dec at the call site. The function reads through the pointer without touching RC.
This is correct ARC ownership transfer: the invariant that every rc_inc is paired with exactly one rc_dec is maintained across function boundaries via the ownership transfer protocol.
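The protocol can be mimicked with a toy reference-counted heap. Everything in this sketch is hypothetical (the `RcHeap` class, integer handles, and method names); only the placement of the increment and decrement follows the source. SSO is deliberately not modeled, so even the 5-char `"hello"` is counted here, unlike in the real runtime.

```python
# Toy model of the ownership transfer protocol. RcHeap and its method names
# are hypothetical; only the inc/dec placement follows the source.

class RcHeap:
    def __init__(self):
        self.counts = {}      # handle -> refcount
        self.payload = {}     # handle -> string contents
        self.next_handle = 1

    def str_from_raw(self, text: str) -> int:
        """Create a string with refcount 1 (the rc_inc in the ARC table)."""
        h = self.next_handle
        self.next_handle += 1
        self.counts[h] = 1
        self.payload[h] = text
        return h

    def str_len(self, h: int) -> int:
        return len(self.payload[h])  # read-only: no RC traffic

    def str_rc_dec(self, h: int) -> None:
        self.counts[h] -= 1
        if self.counts[h] == 0:      # last owner gone: free
            del self.counts[h], self.payload[h]

def get_len(heap: RcHeap, s: int) -> int:
    # Borrowed parameter: reads through the handle, never touches the count.
    return heap.str_len(s)

def make_string(heap: RcHeap) -> int:
    # Constructs and returns without decrementing: ownership moves to the caller.
    return heap.str_from_raw("abcdefghijklmnopqrstuvwxyz")

def check_pass(heap: RcHeap) -> int:
    s = heap.str_from_raw("hello")
    n = get_len(heap, s)  # lend s; this function keeps ownership
    heap.str_rc_dec(s)    # owner releases after the call returns
    return n

def check_return(heap: RcHeap) -> int:
    s = make_string(heap)  # now owns s, refcount == 1
    n = heap.str_len(s)
    heap.str_rc_dec(s)     # owner releases after last use
    return n

heap = RcHeap()
print(check_pass(heap), check_return(heap), len(heap.counts))
```

After both calls the heap is empty, which corresponds to the pairing invariant: each created string is released exactly once, by whichever function ends up owning it.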
9. Fat Pointer: Runtime-Level SSO Discrimination
The codegen has evolved from inline SSO guard patterns (6-instruction sequence with ptrtoint, and, icmp, icmp, or, br) to a single ori_str_rc_dec runtime call that handles SSO discrimination internally.
Previous pattern (J14/earlier J16):
%p2i = ptrtoint ptr %data to i64
%sso = and i64 %p2i, -9223372036854775808 ; bit 63 check
%is_sso = icmp ne i64 %sso, 0
%is_null = icmp eq i64 %p2i, 0
%skip = or i1 %is_sso, %is_null
br i1 %skip, label %sso_skip, label %heap
; heap: call void @ori_rc_dec(ptr %data, ptr @drop_fn)
Current pattern:
%data = extractvalue { i64, i64, ptr } %str, 2
%cap = extractvalue { i64, i64, ptr } %str, 1
call void @ori_str_rc_dec(ptr %data, i64 %cap, ptr @"_ori_drop$3")
This reduces the cleanup sequence from 8+ instructions (6 guard + branch + call) with 3 basic blocks to 3 instructions with 1 basic block. The SSO check still happens, but inside ori_str_rc_dec, which examines the cap field to determine SSO status. The memory(inaccessiblemem: readwrite) attribute on ori_str_rc_dec correctly indicates it only touches RC metadata, not visible program state.
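The condition computed by the previous inline guard can be restated in a few lines. The sketch below models only the old pattern as shown in the IR (bit 63 test plus null test); how `ori_str_rc_dec` performs the equivalent check from the `cap` field is internal to the runtime and not shown in this report. The function name is made up for the example.

```python
SSO_BIT = 1 << 63  # same bit pattern as the i64 constant -9223372036854775808

def inline_guard_skips_rc_dec(data_ptr_bits: int) -> bool:
    """Mirror of the previous inline guard: skip when SSO-tagged or null."""
    is_sso = (data_ptr_bits & SSO_BIT) != 0   # %sso / %is_sso
    is_null = data_ptr_bits == 0              # %is_null
    return is_sso or is_null                  # %skip

print(inline_guard_skips_rc_dec(0x7F00_0000_1000))
```

An ordinary heap address (bit 63 clear, non-null) falls through to the decrement; an SSO-tagged or null value skips it.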
10. Fat Pointer: Multi-Temporary Lifecycle
@check_multi manages 3 simultaneous string temporaries with correct lifecycle ordering:
- Construction phase: All 3 strings constructed via ori_str_from_raw (x, y, z), each with aggregate load to extract fields
- Use phase: x and y copied to ref_arg/ref_arg5 and passed by ptr to @longer
- Cleanup phase 1: After @longer returns, x’s (data, cap) extracted and passed to ori_str_rc_dec, then y’s
- Use phase 2: z copied to str_len.self, ori_str_len called
- Arithmetic: overflow-checked add of longer result + z.length()
- Cleanup phase 2: On normal path (add.ok), z’s (data, cap) extracted and passed to ori_str_rc_dec
- Return: integer result
The ordering is significant: x and y are cleaned up before z is used for length. This is correct — x and y are no longer needed after @longer returns, so their temporaries can be released immediately. z must survive until after ori_str_len completes. On the overflow panic path, z is leaked (the panic terminates the process, so this is acceptable).
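The ordering can be made concrete with an event log. In the sketch below the log strings are an invented format; the sequence of operations follows the IR for `@_ori_check_multi` on the normal (`add.ok`) path, and the overflow check is omitted.

```python
# Event-log sketch of @check_multi's lifecycle. The log format is an
# assumption; the order of operations follows the IR for @_ori_check_multi.

def check_multi(log: list[str]) -> int:
    strings = {"x": "hello", "y": "wonderful", "z": "ab"}
    for name in ("x", "y", "z"):
        log.append(f"from_raw {name}")     # construction phase
    la, lb = len(strings["x"]), len(strings["y"])
    log.append("call longer(x, y)")        # x and y lent by pointer
    longest = la if la > lb else lb
    log.append("rc_dec x")                 # cleanup phase 1
    log.append("rc_dec y")
    z_len = len(strings["z"])
    log.append("str_len z")                # use phase 2
    total = longest + z_len                # overflow check omitted here
    log.append("rc_dec z")                 # cleanup phase 2 (add.ok path)
    return total

events: list[str] = []
result = check_multi(events)
print(result)
print(events)
```

In the log, both `rc_dec x` and `rc_dec y` appear before `str_len z`, and `rc_dec z` is the last event before the return.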
Findings
| # | Severity | Category | Description | Status | First Seen |
|---|---|---|---|---|---|
| 1 | NOTE | IR Quality | Dead sret load in @make_string eliminated by DCE | CONFIRMED | J14 |
| 2 | NOTE | Attributes | Borrow elision on read-only str params with readonly attr | CONFIRMED | J14 |
| 3 | NOTE | ARC | Correct ownership transfer via sret without rc_dec at boundary | CONFIRMED | J14 |
| 4 | NOTE | Control Flow | Branchless if/then/else via select in @longer | CONFIRMED | J2 |
| 5 | NOTE | ARC | Upgraded from inline SSO guard to ori_str_rc_dec runtime call | NEW | J16 |
NOTE-1: Dead sret load in @make_string
Location: @_ori_make_string, %sret.load = load { i64, i64, ptr }, ptr %0, align 8
Impact: One dead load instruction that LLVM DCE will eliminate in optimized builds. Zero runtime impact in release mode.
Context: The codegen materializes the sret load for potential use by the ARC pipeline, but since make_string transfers ownership out (no rc_dec needed), the load result is unused.
First seen: Journey 14
Found in: Optimal IR Comparison (Category 7)
NOTE-2: Excellent borrow elision on string parameters
Location: @get_len parameter s, @longer parameters a and b
Impact: Positive. Avoids 3 rc_inc/rc_dec pairs, one per borrowed parameter (6 RC operations saved in total across the @get_len and @longer calls)
Context: Read-only string parameters are passed by pointer with readonly dereferenceable(24) attributes, allowing the callee to read without touching RC. The caller retains ownership.
First seen: Journey 14 (confirmed here with multi-parameter case)
Found in: Attributes & Calling Convention (Category 3)
NOTE-3: Correct sret ownership transfer
Location: @make_string returning str via sret({i64, i64, ptr})
Impact: Positive — ownership crosses function boundary without any RC operations at the boundary. The rc_inc happens inside ori_str_from_raw and the rc_dec happens in the caller after use.
Context: For aggregate return types larger than 16 bytes (here the 24-byte {i64, i64, ptr}), the compiler correctly uses the sret (struct return) convention: caller allocates, callee fills, ownership transfers implicitly.
Found in: Fat Pointer: Ownership Transfer Protocol (Category 8)
NOTE-4: Branchless if/then/else
Location: @longer, if la > lb then la else lb
Impact: Positive — compiles to icmp sgt + select instead of branch diamond, producing faster code on modern CPUs (no branch prediction penalty)
First seen: Journey 2
Found in: Control Flow & Block Layout (Category 4)
NOTE-5: Upgraded to ori_str_rc_dec runtime call
Location: All string cleanup sites (check_pass, check_return, check_multi)
Impact: Positive — replaces 8+ instruction inline SSO guard with 3-instruction runtime call. Reduces basic block count (check_pass: 3->1 blocks, check_return: 3->1 blocks). SSO discrimination still happens but inside the runtime function.
Context: ori_str_rc_dec(ptr data, i64 cap, ptr drop_fn) takes the data pointer, capacity, and drop function, handling SSO/null checks internally. The memory(inaccessiblemem: readwrite) attribute correctly constrains the call’s side effects.
First seen: Journey 16 (evolution from J14’s inline SSO guard pattern)
Found in: Fat Pointer: Runtime-Level SSO Discrimination (Category 9)
Codegen Quality Score
| Category | Weight | Score | Notes |
|---|---|---|---|
| Instruction Efficiency | 15% | 10/10 | 1.00x — OPTIMAL |
| ARC Correctness | 20% | 10/10 | 0 violations (5 inc, 5 dec, module balanced) |
| Attributes & Safety | 10% | 10/10 | 100.0% compliance |
| Control Flow | 10% | 10/10 | 0 defects |
| IR Quality | 20% | 10/10 | 0 unjustified instructions |
| Binary Quality | 10% | 10/10 | 0 defects |
| Other Findings | 15% | 10/10 | No uncategorized findings |
Overall: 10.0 / 10
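The report does not state how the overall figure is derived. Assuming it is a weighted mean of the category scores, the arithmetic would look like the sketch below; the weights and scores are copied from the table.

```python
# Weighted average of the category scores in the table above.
# Assumption: "Overall" is a weighted mean. Values are (weight %, score / 10).
categories = {
    "Instruction Efficiency": (15, 10),
    "ARC Correctness":        (20, 10),
    "Attributes & Safety":    (10, 10),
    "Control Flow":           (10, 10),
    "IR Quality":             (20, 10),
    "Binary Quality":         (10, 10),
    "Other Findings":         (15, 10),
}

total_weight = sum(w for w, _ in categories.values())
overall = sum(w * s for w, s in categories.values()) / total_weight

print(f"{overall:.1f} / 10")
```

The weights sum to 100, and with every category at 10/10 the weighted mean is 10.0 regardless of how the weights are distributed.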
Verdict
Journey 16 demonstrates flawless fat pointer ownership transfer across function boundaries with an improved codegen compared to earlier runs. The sret convention correctly moves string ownership from make_string to check_return without any RC operations at the boundary. Borrow elision on get_len and longer avoids all unnecessary RC traffic for read-only parameters. The upgrade from inline SSO guard patterns (6-instruction + 3-block diamond) to ori_str_rc_dec runtime calls (3-instruction, single block) is a notable improvement — it reduces IR complexity while preserving correctness. All 7 functions achieve OPTIMAL 1.00x instruction ratio, all attributes are correct, and branchless codegen for if/then/else via select remains a standout optimization.
Cross-Journey Observations
| Feature | First Tested | This Journey | Status |
|---|---|---|---|
| SSO discrimination | J9 | J16 | EVOLVED (inline guard -> ori_str_rc_dec runtime) |
| Borrow elision on str params | J14 | J16 | CONFIRMED (3 params, multi-param case) |
| Overflow checking | J1 | J16 | CONFIRMED (3 additions, all checked) |
| fastcc on user functions | J1 | J16 | CONFIRMED (6/7, main uses C-cc) |
| nounwind on all functions | J14 | J16 | CONFIRMED (all 7 + drop) |
| Fat pointer sret return | J14 | J16 | CONFIRMED (make_string returns str via sret) |
| Branchless select for if/else | J2 | J16 | CONFIRMED (longer uses cmovg) |
| Ownership transfer via sret | J14 | J16 | CONFIRMED (make_string -> check_return) |
| Multi-temporary lifecycle | J16 | J16 | CONFIRMED (check_multi: 3 strings, ordered cleanup) |
| ori_str_rc_dec runtime cleanup | NEW | J16 | NEW (replaces inline SSO guard pattern) |