Journey #09 Complex

I am a string

String creation, .length() method calls, ARC lifecycle with SSO guards, boolean logic with constant folding

Score: 10 | Status: PASS | Expected: 13 | Overflow: PASS

What you'll learn

  • See how string literals are lowered to OriStr values via ori_str_from_raw and ori_str_empty
  • Understand the SSO (Small String Optimization) guard pattern in RC cleanup
  • Follow the ARC lifecycle for strings: allocation, borrowing, and conditional RC decrement
  • Observe how boolean short-circuit operators on constant operands are folded at compile time
  • See nounwind analysis correctly propagate through string-handling functions

Score Breakdown

strings | string methods | arc | branching | function calls | multiple functions | let bindings

Journey 9: “I am a string”

Source

// Journey 9: "I am a string"
// Slug: strings
// Difficulty: complex
// Features: strings, string_methods, arc, branching
// Expected: check_logic() + check_strings() = 2 + 11 = 13

@bool_to_int (b: bool) -> int = if b then 1 else 0;

@check_logic () -> int = {
    let a = true && true;
    let b = true && false;
    let c = false || true;
    let d = false || false;
    bool_to_int(b: a) + bool_to_int(b: b) + bool_to_int(b: c) + bool_to_int(b: d)
}

@check_strings () -> int = {
    let s1 = "hello";
    let s2 = "world!";
    let s3 = "";
    s1.length() + s2.length() + s3.length()
}

@main () -> int = {
    let a = check_logic();
    let b = check_strings();
    a + b
}

Execution Results

Backend | Exit Code | Expected | Stdout | Stderr | Status
Eval | 13 | 13 | (none) | (none) | PASS
AOT | 13 | 13 | (none) | (none) | PASS

Compiler Pipeline

1. Lexer

The lexer (tokenizer) breaks raw source text into a stream of tokens — the smallest meaningful units like keywords, identifiers, operators, and literals.

Tokens: 179 | Keywords: 16 | Identifiers: 38 | Errors: 0

Token stream (first 30 tokens)
Fn(@) Ident(bool_to_int) LParen Ident(b) Colon Ident(bool) RParen
Arrow Ident(int) Eq If Ident(b) Then Int(1) Else Int(0) Semi
Fn(@) Ident(check_logic) LParen RParen Arrow Ident(int) Eq
LBrace Let Ident(a) Eq True AndAnd True Semi
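
As a rough illustration of how such a token stream can be produced, here is a minimal regex-based tokenizer sketch in Python. It is not the Ori lexer: the token table, the `OrOr` name, and the `tokenize` helper are assumptions made for this example, and it covers only the small subset of syntax needed for the first two source lines (no comments, strings, `+`, or `.`).

```python
import re

# Hypothetical keyword and token tables; the real Ori lexer is not shown in this report.
KEYWORDS = {"if": "If", "then": "Then", "else": "Else", "let": "Let",
            "true": "True", "false": "False"}
TOKEN_SPEC = [
    ("Arrow",  r"->"),
    ("AndAnd", r"&&"),
    ("OrOr",   r"\|\|"),   # assumed name; only AndAnd appears in the stream above
    ("Fn",     r"@"),
    ("LParen", r"\("),
    ("RParen", r"\)"),
    ("LBrace", r"\{"),
    ("RBrace", r"\}"),
    ("Colon",  r":"),
    ("Semi",   r";"),
    ("Eq",     r"="),
    ("Int",    r"\d+"),
    ("Ident",  r"[A-Za-z_][A-Za-z0-9_]*"),
    ("Skip",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(src):
    """Split src into token names, rendered like the stream in the report."""
    out = []
    pos = 0
    while pos < len(src):
        m = MASTER.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at offset {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "Skip":
            continue                      # whitespace produces no token
        if kind == "Ident" and text in KEYWORDS:
            out.append(KEYWORDS[text])    # keywords get their own token kind
        elif kind == "Fn":
            out.append("Fn(@)")
        elif kind in ("Ident", "Int"):
            out.append(f"{kind}({text})")
        else:
            out.append(kind)
    return out

line = "@bool_to_int (b: bool) -> int = if b then 1 else 0;"
print(" ".join(tokenize(line)))
```

Note that `bool` and `int` come out as plain identifiers, matching the stream above, where type names are not keywords.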

2. Parser

The parser transforms the flat token stream into a hierarchical Abstract Syntax Tree (AST) — a tree structure that represents the grammatical structure of the program.

Nodes: 52 | Max depth: 5 | Functions: 4 | Errors: 0

AST (simplified)
Module
+-  FnDecl @bool_to_int
|  +-  Params: (b: bool)
|  +-  Return: int
|  +-- Body: If(Ident(b), Int(1), Int(0))
+-  FnDecl @check_logic
|  +-  Return: int
|  +-- Body: Block
|       +-  Let a = BinOp(&&, true, true)
|       +-  Let b = BinOp(&&, true, false)
|       +-  Let c = BinOp(||, false, true)
|       +-  Let d = BinOp(||, false, false)
|       +-- BinOp(+, BinOp(+, BinOp(+, Call(@bool_to_int, a), Call(@bool_to_int, b)), Call(@bool_to_int, c)), Call(@bool_to_int, d))
+-  FnDecl @check_strings
|  +-  Return: int
|  +-- Body: Block
|       +-  Let s1 = Str("hello")
|       +-  Let s2 = Str("world!")
|       +-  Let s3 = Str("")
|       +-- BinOp(+, BinOp(+, MethodCall(s1, length), MethodCall(s2, length)), MethodCall(s3, length))
+-- FnDecl @main
   +-  Return: int
   +-- Body: Block
        +-  Let a = Call(@check_logic)
        +-  Let b = Call(@check_strings)
        +-- BinOp(+, a, b)

3. Type Checker

The type checker verifies that all expressions have compatible types using Hindley-Milner type inference. It resolves type variables, checks constraints, and ensures type safety without requiring explicit type annotations everywhere.

Constraints: 24 | Types inferred: 12 | Unifications: 18 | Errors: 0

Inferred types
@bool_to_int (b: bool) -> int = if b then 1 else 0
//                                        ^ int (literal)
//                                              ^ int (literal)
//                                ^ int (if-then-else unified)

@check_logic () -> int = {
    let a = true && true        // a: bool (short-circuit AND)
    let b = true && false       // b: bool
    let c = false || true       // c: bool (short-circuit OR)
    let d = false || false      // d: bool
    bool_to_int(b: a) + bool_to_int(b: b) + bool_to_int(b: c) + bool_to_int(b: d)
    //                 ^ int (Add<int, int> -> int)
}

@check_strings () -> int = {
    let s1 = "hello"            // s1: str
    let s2 = "world!"           // s2: str
    let s3 = ""                 // s3: str
    s1.length() + s2.length() + s3.length()
    // ^ int      ^ int          ^ int
    //          ^ int (Add<int, int> -> int)
}

@main () -> int = {
    let a = check_logic()       // a: int
    let b = check_strings()     // b: int
    a + b                       // int (Add<int, int> -> int)
}

4. Canonicalization

The canonicalizer transforms the typed AST into a simplified canonical form. It desugars syntactic sugar, lowers complex expressions, and prepares the IR for backend consumption.

Transforms: 4 | Desugared: 4 | Errors: 0

Key transformations
- Boolean && / || desugared to constant values (compile-time evaluation)
  true && true -> true, true && false -> false
  false || true -> true, false || false -> false
- .length() method calls lowered to runtime call ori_str_len
- Empty string "" lowered to ori_str_empty() call
- Function bodies lowered to canonical expression form
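
The boolean folding listed above can be pictured with a small Python sketch. The expression shape (tuples of `(op, lhs, rhs)`, with strings standing in for variables) and the `fold` helper are invented for illustration; the report does not show how the Ori canonicalizer represents or folds expressions.

```python
# Illustrative expression model (an assumption, not the Ori IR):
#   bool literal -> Python bool
#   variable     -> str
#   binary op    -> (op, lhs, rhs) with op in {"&&", "||"}
def fold(expr):
    """Fold && / || when both operands are known constants."""
    if isinstance(expr, (bool, str)):
        return expr                      # leaves are returned unchanged
    op, lhs, rhs = expr
    lhs, rhs = fold(lhs), fold(rhs)      # fold children first
    if isinstance(lhs, bool) and isinstance(rhs, bool):
        if op == "&&":
            return lhs and rhs
        if op == "||":
            return lhs or rhs
    return (op, lhs, rhs)                # not fully constant: keep the node

# The four bindings from @check_logic
print(fold(("&&", True, True)))    # True
print(fold(("&&", True, False)))   # False
print(fold(("||", False, True)))   # True
print(fold(("||", False, False)))  # False
```

With every operand constant, no `&&` or `||` node survives, which is consistent with @check_logic passing literal `i1 true` / `i1 false` arguments in the generated IR.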

5. ARC Pipeline

The ARC (Automatic Reference Counting) pipeline analyzes value lifetimes and inserts reference counting operations. It performs borrow inference to minimize RC overhead — parameters that are only read can be borrowed rather than owned.

RC ops inserted: 6 | Elided: 0 | Net ops: 6

ARC annotations
@bool_to_int: no heap values -- pure scalar logic
@check_logic: no heap values -- pure scalar arithmetic
@check_strings: +3 rc_inc (implicit from ori_str_from_raw/ori_str_empty), +3 rc_dec (conditional SSO cleanup)
@main: no heap values -- delegates to check_logic/check_strings

Backend: Interpreter

The interpreter (eval path) executes the canonical IR directly, without compilation. It serves as the reference implementation for correctness testing.

Result: 13 | Status: PASS

Evaluation trace
@main()
  +-- @check_logic()
       +-- let a = true && true -> true
       +-- let b = true && false -> false
       +-- let c = false || true -> true
       +-- let d = false || false -> false
       +-- bool_to_int(b: true) -> 1
       +-- bool_to_int(b: false) -> 0
       +-- bool_to_int(b: true) -> 1
       +-- bool_to_int(b: false) -> 0
       +-- 1 + 0 + 1 + 0 = 2
  +-- @check_strings()
       +-- let s1 = "hello"
       +-- let s2 = "world!"
       +-- let s3 = ""
       +-- s1.length() -> 5
       +-- s2.length() -> 6
       +-- s3.length() -> 0
       +-- 5 + 6 + 0 = 11
  +-- 2 + 11 = 13
-> 13
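
For readers who want to replay the trace, the following Python transliteration of the source computes the same result. It is only a cross-check of the arithmetic, not a model of the Ori interpreter; string length is taken as the UTF-8 byte count, which for these ASCII literals equals the character count.

```python
def bool_to_int(b):
    return 1 if b else 0

def check_logic():
    a = True and True
    b = True and False
    c = False or True
    d = False or False
    return bool_to_int(a) + bool_to_int(b) + bool_to_int(c) + bool_to_int(d)

def check_strings():
    s1, s2, s3 = "hello", "world!", ""
    # Length in bytes; identical to len(s) here because the literals are ASCII.
    return sum(len(s.encode("utf-8")) for s in (s1, s2, s3))

def main():
    return check_logic() + check_strings()

print(check_logic(), check_strings(), main())  # 2 11 13
```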

Backend: LLVM Codegen

The LLVM backend compiles the canonical IR to LLVM IR, which is then compiled to native machine code via LLVM’s optimization and code generation pipeline. This path produces ahead-of-time compiled binaries.

ARC Pipeline

RC ops inserted: 6 | Elided: 0 | Net ops: 6

ARC annotations
@bool_to_int: +0 rc_inc, +0 rc_dec (no heap values)
@check_logic: +0 rc_inc, +0 rc_dec (no heap values -- boolean constants folded)
@check_strings: +3 rc_inc (from ori_str_from_raw/ori_str_empty), +3 rc_dec (conditional via SSO guard)
@main: +0 rc_inc, +0 rc_dec (delegates to helpers)

Generated LLVM IR

; ModuleID = '09-strings'
source_filename = "09-strings"

@ovf.msg = private unnamed_addr constant [29 x i8] c"integer overflow on addition\00", align 1
@str = private unnamed_addr constant [6 x i8] c"hello\00", align 1
@str.1 = private unnamed_addr constant [7 x i8] c"world!\00", align 1

; Function Attrs: nounwind memory(none) uwtable
; --- @bool_to_int ---
define fastcc noundef i64 @_ori_bool_to_int(i1 noundef %0) #0 {
bb0:
  %sel = select i1 %0, i64 1, i64 0
  ret i64 %sel
}

; Function Attrs: nounwind uwtable
; --- @check_logic ---
define fastcc noundef i64 @_ori_check_logic() #1 {
bb0:
  %call = call fastcc i64 @_ori_bool_to_int(i1 true)
  %call1 = call fastcc i64 @_ori_bool_to_int(i1 false)
  %add = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %call, i64 %call1)
  %add.val = extractvalue { i64, i1 } %add, 0
  %add.ovf = extractvalue { i64, i1 } %add, 1
  br i1 %add.ovf, label %add.ovf_panic, label %add.ok

add.ok:
  %call2 = call fastcc i64 @_ori_bool_to_int(i1 true)
  %add3 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %add.val, i64 %call2)
  %add.val4 = extractvalue { i64, i1 } %add3, 0
  %add.ovf5 = extractvalue { i64, i1 } %add3, 1
  br i1 %add.ovf5, label %add.ovf_panic7, label %add.ok6

add.ovf_panic:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable

add.ok6:
  %call8 = call fastcc i64 @_ori_bool_to_int(i1 false)
  %add9 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %add.val4, i64 %call8)
  %add.val10 = extractvalue { i64, i1 } %add9, 0
  %add.ovf11 = extractvalue { i64, i1 } %add9, 1
  br i1 %add.ovf11, label %add.ovf_panic13, label %add.ok12

add.ovf_panic7:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable

add.ok12:
  ret i64 %add.val10

add.ovf_panic13:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable
}

; Function Attrs: nounwind uwtable
; --- @check_strings ---
define fastcc noundef i64 @_ori_check_strings() #1 {
bb0:
  %str_len.self15 = alloca { i64, i64, ptr }, align 8
  %str_len.self5 = alloca { i64, i64, ptr }, align 8
  %str_len.self = alloca { i64, i64, ptr }, align 8
  %sret.tmp3 = alloca { i64, i64, ptr }, align 8
  %sret.tmp1 = alloca { i64, i64, ptr }, align 8
  %sret.tmp = alloca { i64, i64, ptr }, align 8
  call void @ori_str_from_raw(ptr %sret.tmp, ptr @str, i64 5)
  %sret.load = load { i64, i64, ptr }, ptr %sret.tmp, align 8
  call void @ori_str_from_raw(ptr %sret.tmp1, ptr @str.1, i64 6)
  %sret.load2 = load { i64, i64, ptr }, ptr %sret.tmp1, align 8
  call void @ori_str_empty(ptr %sret.tmp3)
  %sret.load4 = load { i64, i64, ptr }, ptr %sret.tmp3, align 8
  store { i64, i64, ptr } %sret.load, ptr %str_len.self, align 8
  %str.len = call i64 @ori_str_len(ptr %str_len.self)
  %0 = extractvalue { i64, i64, ptr } %sret.load, 2
  %1 = ptrtoint ptr %0 to i64
  %2 = and i64 %1, -9223372036854775808
  %3 = icmp ne i64 %2, 0
  %4 = icmp eq i64 %1, 0
  %5 = or i1 %3, %4
  br i1 %5, label %rc_dec.sso_skip, label %rc_dec.heap

rc_dec.heap:
  call void @ori_rc_dec(ptr %0, ptr @"_ori_drop$3")  ; RC-- str
  br label %rc_dec.sso_skip

rc_dec.sso_skip:
  store { i64, i64, ptr } %sret.load2, ptr %str_len.self5, align 8
  %str.len6 = call i64 @ori_str_len(ptr %str_len.self5)
  %6 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %str.len, i64 %str.len6)
  %7 = extractvalue { i64, i1 } %6, 0
  %8 = extractvalue { i64, i1 } %6, 1
  br i1 %8, label %add.ovf_panic, label %add.ok

add.ok:
  %rc_dec.fat_data7 = extractvalue { i64, i64, ptr } %sret.load2, 2
  %rc_dec.p2i10 = ptrtoint ptr %rc_dec.fat_data7 to i64
  %rc_dec.sso_flag11 = and i64 %rc_dec.p2i10, -9223372036854775808
  %rc_dec.is_sso12 = icmp ne i64 %rc_dec.sso_flag11, 0
  %rc_dec.is_null13 = icmp eq i64 %rc_dec.p2i10, 0
  %rc_dec.skip_rc14 = or i1 %rc_dec.is_sso12, %rc_dec.is_null13
  br i1 %rc_dec.skip_rc14, label %rc_dec.sso_skip9, label %rc_dec.heap8

add.ovf_panic:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable

rc_dec.heap8:
  call void @ori_rc_dec(ptr %rc_dec.fat_data7, ptr @"_ori_drop$3")  ; RC-- str
  br label %rc_dec.sso_skip9

rc_dec.sso_skip9:
  store { i64, i64, ptr } %sret.load4, ptr %str_len.self15, align 8
  %str.len16 = call i64 @ori_str_len(ptr %str_len.self15)
  %9 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %7, i64 %str.len16)
  %10 = extractvalue { i64, i1 } %9, 0
  %11 = extractvalue { i64, i1 } %9, 1
  br i1 %11, label %add.ovf_panic21, label %add.ok20

add.ok20:
  %rc_dec.fat_data22 = extractvalue { i64, i64, ptr } %sret.load4, 2
  %rc_dec.p2i25 = ptrtoint ptr %rc_dec.fat_data22 to i64
  %rc_dec.sso_flag26 = and i64 %rc_dec.p2i25, -9223372036854775808
  %rc_dec.is_sso27 = icmp ne i64 %rc_dec.sso_flag26, 0
  %rc_dec.is_null28 = icmp eq i64 %rc_dec.p2i25, 0
  %rc_dec.skip_rc29 = or i1 %rc_dec.is_sso27, %rc_dec.is_null28
  br i1 %rc_dec.skip_rc29, label %rc_dec.sso_skip24, label %rc_dec.heap23

add.ovf_panic21:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable

rc_dec.heap23:
  call void @ori_rc_dec(ptr %rc_dec.fat_data22, ptr @"_ori_drop$3")  ; RC-- str
  br label %rc_dec.sso_skip24

rc_dec.sso_skip24:
  ret i64 %10
}

; Function Attrs: nounwind uwtable
; --- @main ---
define noundef i64 @_ori_main() #1 {
bb0:
  %call = call fastcc i64 @_ori_check_logic()
  %call1 = call fastcc i64 @_ori_check_strings()
  %add = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %call, i64 %call1)
  %add.val = extractvalue { i64, i1 } %add, 0
  %add.ovf = extractvalue { i64, i1 } %add, 1
  br i1 %add.ovf, label %add.ovf_panic, label %add.ok

add.ok:
  ret i64 %add.val

add.ovf_panic:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare { i64, i1 } @llvm.sadd.with.overflow.i64(i64, i64) #2

; Function Attrs: cold noreturn
declare void @ori_panic_cstr(ptr) #3

; Function Attrs: nounwind
declare void @ori_str_from_raw(ptr noalias sret({ i64, i64, ptr }), ptr, i64) #4

; Function Attrs: nounwind
declare void @ori_str_empty(ptr noalias sret({ i64, i64, ptr })) #4

; Function Attrs: nounwind
declare i64 @ori_str_len(ptr) #4

; Function Attrs: cold nounwind uwtable
; --- drop str ---
define void @"_ori_drop$3"(ptr noundef %0) #5 {
entry:
  call void @ori_rc_free(ptr %0, i64 24, i64 8)
  ret void
}

; Function Attrs: nounwind
declare void @ori_rc_free(ptr, i64, i64) #4

; Function Attrs: nounwind memory(inaccessiblemem: readwrite)
declare void @ori_rc_dec(ptr, ptr) #6

; Function Attrs: nounwind uwtable
define noundef i32 @main() #1 {
entry:
  %ori_main_result = call i64 @_ori_main()
  %exit_code = trunc i64 %ori_main_result to i32
  %leak_check = call i32 @ori_check_leaks()
  %has_leak = icmp ne i32 %leak_check, 0
  %final_exit = select i1 %has_leak, i32 %leak_check, i32 %exit_code
  ret i32 %final_exit
}

; Function Attrs: nounwind
declare i32 @ori_check_leaks() #4

attributes #0 = { nounwind memory(none) uwtable }
attributes #1 = { nounwind uwtable }
attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
attributes #3 = { cold noreturn }
attributes #4 = { nounwind }
attributes #5 = { cold nounwind uwtable }
attributes #6 = { nounwind memory(inaccessiblemem: readwrite) }

Disassembly

_ori_bool_to_int:
  mov    %dil,%dl
  xor    %eax,%eax
  mov    $0x1,%ecx
  test   $0x1,%dl
  cmovne %rcx,%rax
  ret

_ori_check_logic:
  sub    $0x28,%rsp
  mov    $0x1,%edi
  call   _ori_bool_to_int
  mov    %rax,0x18(%rsp)
  xor    %edi,%edi
  call   _ori_bool_to_int
  mov    %rax,%rcx
  mov    0x18(%rsp),%rax
  add    %rcx,%rax
  mov    %rax,0x20(%rsp)
  seto   %al
  jo     .overflow_1
  mov    $0x1,%edi
  call   _ori_bool_to_int
  mov    %rax,%rcx
  mov    0x20(%rsp),%rax
  add    %rcx,%rax
  mov    %rax,0x10(%rsp)
  seto   %al
  jo     .overflow_2
  jmp    .cont
  .overflow_1:
  lea    ovf.msg(%rip),%rdi
  call   ori_panic_cstr
  .cont:
  xor    %edi,%edi
  call   _ori_bool_to_int
  mov    %rax,%rcx
  mov    0x10(%rsp),%rax
  add    %rcx,%rax
  mov    %rax,0x8(%rsp)
  seto   %al
  jo     .overflow_3
  jmp    .ret
  .overflow_2:
  lea    ovf.msg(%rip),%rdi
  call   ori_panic_cstr
  .ret:
  mov    0x8(%rsp),%rax
  add    $0x28,%rsp
  ret
  .overflow_3:
  lea    ovf.msg(%rip),%rdi
  call   ori_panic_cstr

_ori_check_strings:
  sub    $0xf8,%rsp
  ; ori_str_from_raw("hello", 5) -> sret at 0x68(%rsp)
  lea    str(%rip),%rsi
  lea    0x68(%rsp),%rdi
  mov    $0x5,%edx
  call   ori_str_from_raw
  ; load 3 fields via aggregate store/load shuffle
  mov    0x78(%rsp),%rax         ; field 2 (ptr)
  mov    %rax,0x58(%rsp)
  mov    0x68(%rsp),%rax         ; field 0
  mov    %rax,0x38(%rsp)
  mov    0x70(%rsp),%rax         ; field 1
  mov    %rax,0x30(%rsp)
  ; ori_str_from_raw("world!", 6) -> sret at 0x80(%rsp)
  lea    str.1(%rip),%rsi
  lea    0x80(%rsp),%rdi
  mov    $0x6,%edx
  call   ori_str_from_raw
  ; load s2 fields
  mov    0x90(%rsp),%rax
  mov    %rax,0x18(%rsp)
  mov    0x80(%rsp),%rax
  mov    %rax,0x20(%rsp)
  mov    0x88(%rsp),%rax
  mov    %rax,0x28(%rsp)
  ; ori_str_empty() -> sret at 0x98(%rsp)
  lea    0x98(%rsp),%rdi
  call   ori_str_empty
  ; load s3 fields and store s1 for str_len
  ; ... (field shuffles for str_len args)
  lea    0xb0(%rsp),%rdi
  call   ori_str_len              ; s1.length()
  ; SSO guard for s1: check high bit + null
  mov    0x58(%rsp),%rcx
  movabs $0x8000000000000000,%rdx
  mov    %rcx,%rax
  and    %rdx,%rax
  cmp    $0x0,%rax
  setne  %al
  cmp    $0x0,%rcx
  sete   %cl
  or     %cl,%al
  test   $0x1,%al
  jne    .sso_skip_1
  mov    0x58(%rsp),%rdi
  lea    _ori_drop$3(%rip),%rsi
  call   ori_rc_dec               ; RC-- s1
  .sso_skip_1:
  ; store s2 for str_len
  lea    0xc8(%rsp),%rdi
  call   ori_str_len              ; s2.length()
  ; overflow-checked s1.len + s2.len
  add    %rcx,%rax
  seto   %al
  jo     .overflow
  ; SSO guard for s2
  ; ... (same pattern)
  call   ori_rc_dec               ; RC-- s2
  ; store s3 for str_len
  lea    0xe0(%rsp),%rdi
  call   ori_str_len              ; s3.length()
  ; overflow-checked (s1.len + s2.len) + s3.len
  add    %rcx,%rax
  seto   %al
  jo     .overflow
  ; SSO guard for s3
  ; ... (same pattern)
  call   ori_rc_dec               ; RC-- s3
  mov    0x8(%rsp),%rax
  add    $0xf8,%rsp
  ret

_ori_main:
  sub    $0x18,%rsp
  call   _ori_check_logic
  mov    %rax,0x8(%rsp)
  call   _ori_check_strings
  mov    %rax,%rcx
  mov    0x8(%rsp),%rax
  add    %rcx,%rax
  mov    %rax,0x10(%rsp)
  seto   %al
  jo     .overflow
  mov    0x10(%rsp),%rax
  add    $0x18,%rsp
  ret

main:
  push   %rax
  call   _ori_main
  mov    %eax,0x4(%rsp)
  call   ori_check_leaks
  mov    %eax,%ecx
  mov    0x4(%rsp),%eax
  cmp    $0x0,%ecx
  cmovne %ecx,%eax
  pop    %rcx
  ret

Deep Scrutiny

1. Instruction Purity

# | Function | Actual | Ideal | Ratio | Verdict
1 | @bool_to_int | 2 | 2 | 1.00x | OPTIMAL
2 | @check_logic | 23 | 23 | 1.00x | OPTIMAL
3 | @check_strings | 58 | 58 | 1.00x | OPTIMAL
4 | @main | 9 | 9 | 1.00x | OPTIMAL

@bool_to_int: OPTIMAL. select i1 %0, i64 1, i64 0 + ret — the ideal lowering for if b then 1 else 0. No branches, just a conditional select.

@check_logic: OPTIMAL. Boolean &&/|| on constants are folded to true/false at compile time, then passed as constant arguments to bool_to_int. Three overflow-checked additions are necessary for the sum chain. All 23 instructions are justified.

@check_strings: OPTIMAL. The 58 instructions break down as: 6 allocas (3 sret buffers for string construction + 3 self buffers for ori_str_len), 3 string construction calls (ori_str_from_raw x2, ori_str_empty x1), 3 single aggregate loads (load { i64, i64, ptr }), 3 store + ori_str_len call sequences (6 total), 3 SSO-guarded RC decrements (7 instructions each: extractvalue + ptrtoint + and + icmp ne + icmp eq + or + br = 21 total), 3 RC dec heap blocks (call + br = 6 total), 2 overflow-checked additions (call + 2 extractvalue + br = 8 total), 2 overflow panic blocks (call + unreachable = 4 total), and 1 ret. All instructions are structurally justified.

@main: OPTIMAL. Two calls + one overflow-checked add + ret.

2. ARC Purity

Function | rc_inc | rc_dec | Balanced | Borrow Elision | Move Semantics
@bool_to_int | 0 | 0 | YES | N/A | N/A
@check_logic | 0 | 0 | YES | N/A | N/A
@check_strings | 3 | 3 | YES | 0 elided | 0 moves
@main | 0 | 0 | YES | N/A | N/A

Module-level: Balanced. All 3 strings created in @check_strings are properly cleaned up via conditional RC decrement. The RC decrements are correctly guarded by the SSO (Small String Optimization) check: strings <= 23 bytes are stored inline and require no heap deallocation.

For “hello” (5 bytes) and “world!” (6 bytes), both fit within SSO. The empty string "" also uses a special inline representation. In all three cases, the SSO guard will skip the ori_rc_dec call at runtime, but the guard itself is correct safety infrastructure.

Verdict: All functions balanced. No leaks detected. ARC is OPTIMAL for the string lifecycle.

3. Attributes & Calling Convention

Function | fastcc | nounwind | uwtable | noundef | cold | Notes
@bool_to_int | YES | YES | YES | YES | N/A | memory(none), excellent [NOTE-1]
@check_logic | YES | YES | YES | YES | N/A |
@check_strings | YES | YES | YES | YES | N/A | [NOTE-2]
@main | NO | YES | YES | YES | N/A | C calling convention (entry point)
@_ori_drop$3 | N/A | YES | YES | YES | YES | All attributes present
@ori_panic_cstr | N/A | N/A | N/A | N/A | YES | cold noreturn, correct

100% attribute compliance (19/19 applicable attributes correct).

@bool_to_int has the ideal attribute set: nounwind memory(none) — the compiler correctly identified this function as pure (no memory access, cannot unwind).

@check_strings now carries nounwind. The fixed-point analysis determined that the runtime callees involved (ori_str_from_raw, ori_str_empty, ori_str_len, ori_rc_dec, ori_rc_free) are all declared nounwind, which makes check_strings itself nounwind. In the previous run, @check_strings and @_ori_main were missing nounwind. [NOTE-2]

4. Control Flow & Block Layout

Function | Blocks | Empty Blocks | Redundant Branches | Phi Nodes | Notes
@bool_to_int | 1 | 0 | 0 | 0 |
@check_logic | 7 | 0 | 0 | 0 |
@check_strings | 11 | 0 | 0 | 0 |
@main | 3 | 0 | 0 | 0 |

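@check_strings has 11 blocks: 1 entry, 6 from the 3 SSO guards (each adds a heap-path block and a skip-path block; the check itself sits at the end of the preceding block), and 4 from the 2 overflow-checked adds (each adds an ok block and a panic block). Zero defects.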

5. Overflow Checking

Status: PASS

Operation | Checked | Correct | Notes
add (check_logic, 3x) | YES | YES | llvm.sadd.with.overflow.i64
add (check_strings, 2x) | YES | YES | llvm.sadd.with.overflow.i64
add (main, 1x) | YES | YES | llvm.sadd.with.overflow.i64

All 6 integer additions use llvm.sadd.with.overflow.i64 with correct panic-on-overflow branching.
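
The semantics of the checked addition can be sketched in Python, where integers are unbounded and the 64-bit wraparound has to be modelled explicitly. The helper names are invented for this example, and raising `OverflowError` merely stands in for the branch to `ori_panic_cstr`; the message text is the one stored in `@ovf.msg`.

```python
I64_MIN = -(1 << 63)
I64_MAX = (1 << 63) - 1

def sadd_with_overflow_i64(a, b):
    """Return (wrapped_sum, overflowed), like the { i64, i1 } result pair."""
    exact = a + b
    # Reduce into the signed 64-bit range using two's complement wraparound.
    wrapped = (exact + (1 << 63)) % (1 << 64) - (1 << 63)
    return wrapped, exact != wrapped

def checked_add(a, b):
    """Mirror the IR pattern: extract value and flag, then branch on the flag."""
    value, overflowed = sadd_with_overflow_i64(a, b)
    if overflowed:
        # Stand-in for the add.ovf_panic block.
        raise OverflowError("integer overflow on addition")
    return value

print(checked_add(2, 11))                  # 13
print(sadd_with_overflow_i64(I64_MAX, 1))  # (-9223372036854775808, True)
```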

6. Binary Analysis

Metric | Value
Binary size | 6.3 MiB (debug)
.text section | 885.4 KiB
.rodata section | 133.8 KiB
User code | ~155 instructions (~580 bytes)
Runtime | >99% of binary

Disassembly: @bool_to_int

_ori_bool_to_int:
  mov    %dil,%dl
  xor    %eax,%eax
  mov    $0x1,%ecx
  test   $0x1,%dl
  cmovne %rcx,%rax
  ret

Compact 6-instruction implementation using cmovne for branchless bool-to-int conversion.

Disassembly: @main

_ori_main:
  sub    $0x18,%rsp
  call   _ori_check_logic
  mov    %rax,0x8(%rsp)
  call   _ori_check_strings
  mov    %rax,%rcx
  mov    0x8(%rsp),%rax
  add    %rcx,%rax
  mov    %rax,0x10(%rsp)
  seto   %al
  jo     .overflow
  mov    0x10(%rsp),%rax
  add    $0x18,%rsp
  ret

7. Optimal IR Comparison

@bool_to_int: Ideal vs Actual

; IDEAL (2 instructions)
define fastcc noundef i64 @_ori_bool_to_int(i1 noundef %0) nounwind memory(none) {
  %sel = select i1 %0, i64 1, i64 0
  ret i64 %sel
}
; ACTUAL (2 instructions) -- identical
define fastcc noundef i64 @_ori_bool_to_int(i1 noundef %0) #0 {
bb0:
  %sel = select i1 %0, i64 1, i64 0
  ret i64 %sel
}

Delta: +0 instructions. OPTIMAL.

@check_logic: Ideal vs Actual

; IDEAL (23 instructions)
; Same as actual -- constant folding of && and || is correct,
; 4 calls to bool_to_int with constant args,
; 3 overflow-checked additions, 3 panic blocks, 1 ret.
; All 23 instructions justified.

Delta: +0 instructions. OPTIMAL.

@check_strings: Ideal vs Actual

; IDEAL (58 instructions)
; String function requires:
; - 6 allocas for sret buffers (3 construction + 3 str_len)
; - 3 string constructions (ori_str_from_raw x2, ori_str_empty x1)
; - 3 aggregate loads (single `load { i64, i64, ptr }` each)
; - 3 store + ori_str_len call sequences (6 total)
; - 3 SSO-guarded RC decrements (7 instructions each = 21)
; - 3 RC dec heap blocks (call + br each = 6)
; - 2 overflow-checked additions (8 total)
; - 2 overflow panic blocks (4 total)
; - 1 ret
; Total: 6 + 3 + 3 + 6 + 21 + 6 + 8 + 4 + 1 = 58
; All instructions structurally justified.

Delta: +0 instructions. OPTIMAL.

@main: Ideal vs Actual

; IDEAL (9 instructions)
define noundef i64 @_ori_main() {
  %call = call fastcc i64 @_ori_check_logic()
  %call1 = call fastcc i64 @_ori_check_strings()
  %add = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %call, i64 %call1)
  %add.val = extractvalue { i64, i1 } %add, 0
  %add.ovf = extractvalue { i64, i1 } %add, 1
  br i1 %add.ovf, label %panic, label %ok
ok:
  ret i64 %add.val
panic:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable
}

Delta: +0 instructions. OPTIMAL.

Module Summary

Function | Ideal | Actual | Delta | Justified | Verdict
@bool_to_int | 2 | 2 | +0 | N/A | OPTIMAL
@check_logic | 23 | 23 | +0 | N/A | OPTIMAL
@check_strings | 58 | 58 | +0 | N/A | OPTIMAL
@main | 9 | 9 | +0 | N/A | OPTIMAL

8. Strings: Representation and Aggregate Load Pattern

Ori strings use a 3-field representation: { i64, i64, ptr } — the OriStr fat struct:

  • Field 0 (i64): inline data / pointer to heap buffer
  • Field 1 (i64): length in bytes
  • Field 2 (ptr): heap data pointer (with SSO flag in high bit)

String literals are constructed via ori_str_from_raw(ptr sret, ptr raw, i64 len) which takes a destination sret pointer, a raw C string pointer, and the byte length. The empty string uses the specialized ori_str_empty() constructor.

The sret (struct return) pattern uses a single aggregate load { i64, i64, ptr } (1 instruction per string) rather than per-field GEP+load+insertvalue. This is valid because { i64, i64, ptr } is a first-class aggregate in LLVM IR and the sret alloca provides a properly aligned memory source.
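
To make the 3-field layout concrete, here is a small Python sketch that packs and unpacks a 24-byte record with the same shape as { i64, i64, ptr } on a 64-bit little-endian target. The field values are made up, and the `ORI_STR` and `field2_bits` names are assumptions for this example; nothing here reflects the actual bytes the Ori runtime writes.

```python
import struct

# Three 8-byte fields with no padding: i64, i64, and a 64-bit pointer slot.
# "<" selects little-endian with standard sizes, as on x86-64.
ORI_STR = struct.Struct("<qqQ")

def field2_bits(raw):
    """Return field 2 (the ptr slot) as an unsigned 64-bit integer."""
    _field0, _field1, ptr_bits = ORI_STR.unpack(raw)
    return ptr_bits

# Made-up example value: length 5 in field 1, high bit set in field 2.
example = ORI_STR.pack(0, 5, 0x8000000000000000)

print(ORI_STR.size)               # 24
print(hex(field2_bits(example)))
```

Reading the whole record in one `unpack` call is loosely analogous to the single aggregate load described above, after which individual fields are picked out with extractvalue.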

9. Strings: SSO Guard Pattern

Each string’s RC decrement is guarded by an SSO (Small String Optimization) check:

%0 = extractvalue { i64, i64, ptr } %sret.load, 2
%1 = ptrtoint ptr %0 to i64
%2 = and i64 %1, -9223372036854775808   ; check high bit
%3 = icmp ne i64 %2, 0
%4 = icmp eq i64 %1, 0                  ; check null
%5 = or i1 %3, %4
br i1 %5, label %rc_dec.sso_skip, label %rc_dec.heap

This 7-instruction SSO guard checks two conditions: (1) high bit set = SSO string stored inline, (2) null pointer = no heap allocation. Both cases skip the ori_rc_dec call. The single ptrtoint is shared for both checks — clean and efficient.
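
The guard's decision logic can be restated as a short Python sketch operating on the pointer's bit pattern as an unsigned 64-bit integer. The function name and the sample addresses are hypothetical; only the two conditions themselves come from the IR above.

```python
MASK64 = (1 << 64) - 1
SSO_FLAG = 1 << 63

# The IR constant -9223372036854775808 is the same bit pattern as 1 << 63.
assert (-9223372036854775808) & MASK64 == SSO_FLAG

def skips_rc_dec(ptr_bits):
    """Mirror the guard: True means branch to rc_dec.sso_skip."""
    is_sso = (ptr_bits & SSO_FLAG) != 0   # and + icmp ne
    is_null = ptr_bits == 0               # icmp eq
    return is_sso or is_null              # or

print(skips_rc_dec(0))                      # True  (null pointer)
print(skips_rc_dec(SSO_FLAG | 0x1234))      # True  (high bit set)
print(skips_rc_dec(0x00007F0000001000))     # False (plain heap-style address)
```

Only the last case would reach the `rc_dec.heap` block and call `ori_rc_dec`.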

10. Strings: Nounwind Propagation Improvement

The nounwind fixed-point analysis now correctly marks @check_strings and @_ori_main as nounwind. The trace shows the analysis computed nounwind_count=4 (all 4 user functions) in 2 passes. This is an improvement over the previous run where these functions lacked nounwind because the analysis was more conservative about functions calling ori_rc_dec.

The improvement matters for LLVM optimization: nounwind tells LLVM that calls to these functions cannot unwind, so callers need no landing pads or cleanup paths, and inlining and code motion can be more aggressive. Because the functions still carry uwtable, unwind table entries are still emitted for them. The ori_rc_dec declaration now carries nounwind memory(inaccessiblemem: readwrite), confirming it cannot unwind, which the fixed-point analysis correctly propagates to callers. [NOTE-2]
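
A fixed-point propagation of this kind can be sketched in Python as follows. This is a simplified model, not the compiler's analysis: the call lists are read off the IR in this report, the function and variable names are invented, and the sketch assumes that calls to `ori_panic_cstr` (declared `cold noreturn`, without nounwind) are ignored, since the report does not say how the real analysis treats panic paths.

```python
# Declarations taken from the IR in this report (True = declared nounwind).
DECLARED_NOUNWIND = {
    "ori_str_from_raw": True,
    "ori_str_empty": True,
    "ori_str_len": True,
    "ori_rc_dec": True,
    "llvm.sadd.with.overflow.i64": True,
}
# ASSUMPTION: panic calls do not block nounwind; not confirmed by the source.
IGNORED = {"ori_panic_cstr"}

CALLS = {
    "bool_to_int": [],
    "check_logic": ["bool_to_int", "llvm.sadd.with.overflow.i64", "ori_panic_cstr"],
    "check_strings": ["ori_str_from_raw", "ori_str_empty", "ori_str_len",
                      "ori_rc_dec", "llvm.sadd.with.overflow.i64", "ori_panic_cstr"],
    "main": ["check_logic", "check_strings",
             "llvm.sadd.with.overflow.i64", "ori_panic_cstr"],
}

def infer_nounwind(calls, declared, ignored):
    """Mark a function nounwind once every callee is known not to unwind."""
    nounwind = set()
    changed = True
    while changed:                        # iterate until nothing new is learned
        changed = False
        for fn, callees in calls.items():
            if fn in nounwind:
                continue
            if all(c in ignored or declared.get(c, False) or c in nounwind
                   for c in callees):
                nounwind.add(fn)
                changed = True
    return nounwind

print(sorted(infer_nounwind(CALLS, DECLARED_NOUNWIND, IGNORED)))

# If ori_rc_dec were not known to be nounwind, its callers would lose the attribute.
conservative = dict(DECLARED_NOUNWIND, ori_rc_dec=False)
print(sorted(infer_nounwind(CALLS, conservative, IGNORED)))
```

The second call reproduces the shape of the earlier result described above, where only the functions that never reach `ori_rc_dec` were marked. Because this sketch starts from an empty set, it would never mark a recursive function; a production analysis would need a different starting assumption for call-graph cycles.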

Findings

# | Severity | Category | Description | Status | First Seen
1 | NOTE | Attributes | Pure function detection: @bool_to_int gets memory(none) | CONFIRMED | J9
2 | NOTE | Attributes | Nounwind now propagates through string-handling functions | FIXED | J9
3 | NOTE | ARC | Correct SSO-guarded conditional RC decrement for all 3 strings | CONFIRMED | J9
4 | NOTE | Codegen | Excellent constant folding of boolean && / || operators | CONFIRMED | J9
5 | NOTE | Attributes | 100% attribute compliance across all functions (19/19) | CONFIRMED | J9
6 | NOTE | Binary | RC leak detection integrated into main() wrapper | CONFIRMED | J9

NOTE-1: Pure function detection yields memory(none)

Location: @bool_to_int
Impact: Positive. The nounwind memory(none) attribute set is ideal for a pure function, enabling maximum LLVM optimization.
Found in: Attributes & Calling Convention (Category 3)

NOTE-2: Nounwind propagation improvement

Location: @check_strings, @_ori_main, @main (C entry)
Impact: Positive. These functions now correctly have nounwind, gained via improved fixed-point analysis that recognizes ori_rc_dec (declared with nounwind memory(inaccessiblemem: readwrite)) as non-unwinding. LLVM can therefore treat calls to them as non-unwinding, which enables more aggressive optimization.
Previous: In the 2026-03-19 run, @check_strings and @_ori_main had attribute group #2 = { uwtable } (missing nounwind). Now they use #1 = { nounwind uwtable }.
Found in: Attributes & Calling Convention (Category 3), Nounwind Propagation (Category 10)

NOTE-3: Correct SSO-guarded conditional RC decrement

Location: @check_strings, 3 SSO guard sequences
Impact: Positive. Correctly avoids calling ori_rc_dec on SSO/inline strings.
Found in: ARC Purity (Category 2)

NOTE-4: Excellent constant folding of boolean operators

Location: @check_logic
Impact: Positive. true && true becomes constant true, eliminating all runtime branching for boolean logic.
Found in: Compiler Pipeline / Canonicalization

NOTE-5: Full attribute compliance achieved

Location: All user and runtime functions
Impact: Positive. 19/19 applicable attributes correct (100%).
Found in: Attributes & Calling Convention (Category 3)

NOTE-6: RC leak detection integrated into main() wrapper

Location: @main (C entry point) wrapper function
Impact: Positive. The main() wrapper calls ori_check_leaks() after _ori_main() and uses a select to override the exit code if leaks are detected.
Found in: Binary Analysis (Category 6)
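
The exit-code selection in the wrapper can be mirrored by a few lines of Python. The helper names are invented, and the leak code `1` used in the example is a hypothetical value: the report only shows that any nonzero result from `ori_check_leaks` replaces the program's own exit code.

```python
def trunc_i64_to_i32(value):
    """Keep the low 32 bits and reinterpret them as a signed i32 (like trunc)."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low >= (1 << 31) else low

def final_exit(ori_main_result, leak_check):
    """Mirror: select i1 %has_leak, i32 %leak_check, i32 %exit_code."""
    exit_code = trunc_i64_to_i32(ori_main_result)
    has_leak = leak_check != 0
    return leak_check if has_leak else exit_code

print(final_exit(13, 0))  # 13: no leak, the program's result is the exit code
print(final_exit(13, 1))  # 1: a nonzero leak code (hypothetical) wins
```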

Codegen Quality Score

Category | Weight | Score | Notes
Instruction Efficiency | 15% | 10/10 | 1.00x (OPTIMAL)
ARC Correctness | 20% | 10/10 | 0 violations
Attributes & Safety | 10% | 10/10 | 100.0% compliance
Control Flow | 10% | 10/10 | 0 defects
IR Quality | 20% | 10/10 | 0 unjustified instructions
Binary Quality | 10% | 10/10 | 0 defects
Other Findings | 15% | 10/10 | No uncategorized findings

Overall: 10.0 / 10

Verdict

Journey 9’s string codegen achieves a perfect score. All functions are OPTIMAL with zero unjustified instructions. The headline improvement in this re-run is nounwind propagation: @check_strings and @_ori_main now correctly carry the nounwind attribute, achieved through improved fixed-point analysis that recognizes ori_rc_dec’s nounwind declaration. This brings attribute compliance from the previous run’s partial coverage to 100% (19/19). ARC remains perfectly balanced with zero violations, and the SSO guard pattern continues to work correctly for all three string values.

Cross-Journey Observations

Feature | First Tested | This Journey | Status
Overflow checking | J1 | J9 | CONFIRMED
fastcc usage | J1 | J9 | CONFIRMED
Constant folding (booleans) | J2 | J9 | CONFIRMED
nounwind propagation | J1 | J9 | IMPROVED (now propagates through string ops)
ARC string lifecycle | J9 | J9 | CONFIRMED
SSO guard pattern | J9 | J9 | CONFIRMED
memory(none) on pure functions | J9 | J9 | CONFIRMED
Full attribute compliance | J9 | J9 | CONFIRMED
RC leak detection in main() | J9 | J9 | CONFIRMED
Aggregate sret load | J9 | J9 | CONFIRMED

The most significant change in this re-run is the nounwind propagation improvement. Previously, @check_strings and @_ori_main lacked nounwind because the analysis was conservative about functions calling ori_rc_dec. The fixed-point analysis now correctly recognizes that ori_rc_dec is declared nounwind memory(inaccessiblemem: readwrite) and propagates this through the call graph. This is particularly important for string-heavy code where every function transitively calls RC operations.