All Journeys
Journey #18 Complex

I am a string builder

String builder patterns using loop-based concatenation, SSO-to-heap promotion, list iteration with string elements, conditional concatenation with separator logic

10
Score
PASS Status
67 Expected
PASS Overflow

What you'll learn

  • See how string concatenation in loops compiles to repeated ori_str_concat calls with SSO-aware RC cleanup
  • Understand how mutable accumulators in for-loops lower to phi nodes carrying fat pointer state
  • Observe the SSO-to-heap promotion path: strings start inline and grow beyond 23 bytes to heap
  • Compare RC lifecycle for loop-mutated strings vs parameter strings (owned vs borrowed)
  • See how list-of-strings iteration uses ori_iter_from_list with per-element RC via elem_dec

Score Breakdown

stringsstring methodsloopsrangesarclistsbranchingfunction callsmultiple functionslet bindings

Journey 18: “I am a string builder”

Source

// Journey 18: "I am a string builder"
// Slug: string-builder
// Difficulty: complex
// Features: strings, loops, arc, heap_promotion, sso_to_heap
// Expected: build_repeated(30, "x").length() + build_sequence(15).length() + build_with_separator(["hello","world","ori"], ", ").length() = 30 + 20 + 17 = 67

@build_repeated (n: int, c: str) -> str = {
    let result = "";
    for i in 0..n do result = result + c;
    result
}

@build_sequence (n: int) -> str = {
    let result = "";
    for i in 0..n do result = result + str(i);
    result
}

@build_with_separator (items: [str], sep: str) -> str = {
    let result = "";
    let first = true;
    for item in items do {
        if !first then result = result + sep;
        result = result + item;
        first = false
    };
    result
}

@main () -> int = {
    let a = build_repeated(n: 30, c: "x");
    let b = build_sequence(n: 15);
    let c = build_with_separator(items: ["hello", "world", "ori"], sep: ", ");
    a.length() + b.length() + c.length()
}

Execution Results

BackendExit CodeExpectedStdoutStderrStatus
Eval6767(none)(none)PASS
AOT6767(none)(none)PASS

Compiler Pipeline

1. Lexer

The lexer (tokenizer) breaks raw source text into a stream of tokens — the smallest meaningful units like keywords, identifiers, operators, and literals.

Tokens: 231 | Keywords: ~30 | Identifiers: ~50 | Errors: 0

Token stream (first 30 tokens)
Fn(@) Ident(build_repeated) LParen Ident(n) Colon Ident(int) Comma
Ident(c) Colon Ident(str) RParen Arrow Ident(str) Eq LBrace
Let Ident(result) Eq String("") Semi
For Ident(i) In Int(0) DotDot Ident(n) Do Ident(result) Eq

2. Parser

The parser transforms the flat token stream into a hierarchical Abstract Syntax Tree (AST) — a tree structure that represents the grammatical structure of the program.

Nodes: ~72 | Max depth: 5 | Functions: 4 | Errors: 0

AST (simplified)
Module
├─ FnDecl @build_repeated
│  ├─ Params: (n: int, c: str)
│  ├─ Return: str
│  └─ Body: Block
│       ├─ Let result = ""
│       ├─ For i in 0..n Do Assign(result, BinOp(+, result, c))
│       └─ Ident(result)
├─ FnDecl @build_sequence
│  ├─ Params: (n: int)
│  ├─ Return: str
│  └─ Body: Block
│       ├─ Let result = ""
│       ├─ For i in 0..n Do Assign(result, BinOp(+, result, str(i)))
│       └─ Ident(result)
├─ FnDecl @build_with_separator
│  ├─ Params: (items: [str], sep: str)
│  ├─ Return: str
│  └─ Body: Block
│       ├─ Let result = ""
│       ├─ Let first = true
│       ├─ For item in items Do Block
│       │    ├─ If !first Then Assign(result, BinOp(+, result, sep))
│       │    ├─ Assign(result, BinOp(+, result, item))
│       │    └─ Assign(first, false)
│       └─ Ident(result)
└─ FnDecl @main
   ├─ Return: int
   └─ Body: Block
        ├─ Let a = Call(@build_repeated, n: 30, c: "x")
        ├─ Let b = Call(@build_sequence, n: 15)
        ├─ Let c = Call(@build_with_separator, items: [...], sep: ", ")
        └─ BinOp(+, Call(.length, a), BinOp(+, Call(.length, b), Call(.length, c)))

3. Type Checker

The type checker verifies that all expressions have compatible types using Hindley-Milner type inference. It resolves type variables, checks constraints, and ensures type safety without requiring explicit type annotations everywhere.

Constraints: ~40 | Types inferred: ~20 | Unifications: ~30 | Errors: 0

Inferred types
@build_repeated (n: int, c: str) -> str = {
    let result: str = "";          // inferred: str
    for i: int in 0..n do          // inferred: int range
        result = result + c;       // str + str -> str (Add<str, str>)
    result                         // -> str
}

@build_sequence (n: int) -> str = {
    let result: str = "";
    for i: int in 0..n do
        result = result + str(i);  // str(int) -> str, then str + str -> str
    result
}

@build_with_separator (items: [str], sep: str) -> str = {
    let result: str = "";
    let first: bool = true;
    for item: str in items do {    // [str].iter() -> Iterator<str>
        if !first then result = result + sep;
        result = result + item;
        first = false              // bool assignment
    };
    result
}

@main () -> int = {
    let a: str = build_repeated(n: 30, c: "x");
    let b: str = build_sequence(n: 15);
    let c: str = build_with_separator(items: ["hello", "world", "ori"], sep: ", ");
    a.length() + b.length() + c.length()  // int + int + int -> int
}

4. Canonicalization

The canonicalizer transforms the typed AST into a simplified canonical form. It desugars syntactic sugar, lowers complex expressions, and prepares the IR for backend consumption.

Transforms: 4 functions | Desugared: for-loops to range/iterator protocol | Errors: 0

Key transformations
- for i in 0..n -> range construction {start: 0, end: n, step: 1, current: 0} + loop + comparison
- for item in items -> list iterator creation via ori_iter_from_list + loop + ori_iter_next
- result = result + c -> string concatenation with old value RC dec + new value assignment
- str(i) -> ori_str_from_int conversion call
- a.length() -> ori_str_len call on fat pointer
- Canon nodes: 79 user + 46 prelude = 125 total

5. ARC Pipeline

The ARC (Automatic Reference Counting) pipeline analyzes value lifetimes and inserts reference counting operations. It performs borrow inference to minimize RC overhead — parameters that are only read can be borrowed rather than owned.

RC ops inserted: 16 | Elided: 0 | Net ops: 16

ARC annotations
@build_repeated: +0 rc_inc, +1 rc_dec (loop body: dec old result before reassignment)
  - c parameter: readonly borrow (no inc/dec needed)
  - result: ownership transferred out via sret
  - SSO-aware: dec skipped for inline strings

@build_sequence: +0 rc_inc, +2 rc_dec (loop body: dec old result + dec to_str temp)
  - result: ownership transferred out via sret
  - str(i) temporary: decrement after concat

@build_with_separator: +1 rc_inc, +2 rc_dec (list rc_inc for iterator, dec old result in two paths)
  - items parameter: rc_inc for iterator creation
  - sep parameter: readonly borrow (no inc/dec needed)
  - Iterator drop: via ori_iter_drop (handles element cleanup)

@main: +0 rc_inc, +6 rc_dec (cleanup all temporaries)
  - "x" string literal: dec after build_repeated returns
  - ", " separator: dec after build_with_separator returns
  - List [str] buffer: ori_buffer_drop_unique (element-aware)
  - Three result strings (a, b, c): dec after length() calls

Backend: Interpreter

The interpreter (eval path) executes the canonical IR directly, without compilation. It serves as the reference implementation for correctness testing.

Result: 67 | Status: PASS

Evaluation trace
@main()
  └─ @build_repeated(n: 30, c: "x")
       └─ loop 30 iterations: "" + "x" + "x" + ... = "xxxxxx...x" (30 chars)
       → "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  └─ @build_sequence(n: 15)
       └─ loop 15 iterations: "" + "0" + "1" + ... + "14" = "01234567891011121314"
       → "01234567891011121314" (20 chars)
  └─ @build_with_separator(items: ["hello", "world", "ori"], sep: ", ")
       └─ iter 3 items: "" + "hello" + ", " + "world" + ", " + "ori"
       → "hello, world, ori" (17 chars)
  └─ 30 + 20 + 17 = 67
→ 67

Backend: LLVM Codegen

The LLVM backend compiles the canonical IR to LLVM IR, which is then compiled to native machine code via LLVM’s optimization and code generation pipeline. This path produces ahead-of-time compiled binaries.

ARC Pipeline

RC ops inserted: 16 | Elided: 0 | Net ops: 16

ARC annotations
@build_repeated:
  - param c: readonly dereferenceable(24) -- borrowed, no rc_inc
  - result sret: ownership transfer out (no rc_dec on exit)
  - loop body: ori_rc_dec on old result ptr (SSO-aware skip)

@build_sequence:
  - result sret: ownership transfer out
  - loop body: ori_rc_dec on old result + ori_rc_dec on str(i) temp (both SSO-aware)

@build_with_separator:
  - items param: ori_list_rc_inc for iterator creation
  - sep param: readonly dereferenceable(24) -- borrowed
  - loop body: ori_rc_dec on old result after separator concat (SSO-aware)
  - loop body: ori_rc_dec on old result after item concat (SSO-aware)
  - exit: ori_iter_drop cleans up iterator + element refs

@main:
  - ori_rc_dec on "x" literal after build_repeated
  - ori_rc_dec on ", " literal after build_with_separator
  - ori_buffer_drop_unique on list data (with _ori_elem_dec$3 for str elements)
  - ori_rc_dec on result a after length()
  - ori_rc_dec on result b after length()
  - ori_rc_dec on result c after length()

Generated LLVM IR

; ModuleID = '18-string-builder'
source_filename = "18-string-builder"

@ovf.msg = private unnamed_addr constant [29 x i8] c"integer overflow on addition\00", align 1
@str = private unnamed_addr constant [2 x i8] c"x\00", align 1
@str.1 = private unnamed_addr constant [6 x i8] c"hello\00", align 1
@str.2 = private unnamed_addr constant [6 x i8] c"world\00", align 1
@str.3 = private unnamed_addr constant [4 x i8] c"ori\00", align 1
@str.4 = private unnamed_addr constant [3 x i8] c", \00", align 1

; Function Attrs: nounwind uwtable
; --- @build_repeated ---
define fastcc void @_ori_build_repeated(ptr noalias sret({ i64, i64, ptr }) %0, i64 noundef %1, ptr noundef nonnull readonly dereferenceable(24) %2) #0 {
bb0:
  %ori_str_concat.sret = alloca { i64, i64, ptr }, align 8
  %str_op.rhs = alloca { i64, i64, ptr }, align 8
  %str_op.lhs = alloca { i64, i64, ptr }, align 8
  %param.load = load { i64, i64, ptr }, ptr %2, align 8
  call void @ori_str_empty(ptr %0)
  %sret.load = load { i64, i64, ptr }, ptr %0, align 8
  %ctor.1 = insertvalue { i64, i64, i64, i64 } { i64 0, i64 undef, i64 undef, i64 undef }, i64 %1, 1
  %ctor.2 = insertvalue { i64, i64, i64, i64 } %ctor.1, i64 1, 2
  %ctor.3 = insertvalue { i64, i64, i64, i64 } %ctor.2, i64 0, 3
  %proj.0 = extractvalue { i64, i64, i64, i64 } %ctor.3, 0
  %proj.1 = extractvalue { i64, i64, i64, i64 } %ctor.3, 1
  %proj.2 = extractvalue { i64, i64, i64, i64 } %ctor.3, 2
  br label %bb1

bb1:
  %v91 = phi i64 [ %proj.0, %bb0 ], [ %add.val, %rc_dec.sso_skip ]
  %v102 = phi { i64, i64, ptr } [ %sret.load, %bb0 ], [ %ori_str_concat, %rc_dec.sso_skip ]
  %lt = icmp slt i64 %v91, %proj.1
  br i1 %lt, label %bb2, label %bb3

bb2:
  store { i64, i64, ptr } %v102, ptr %str_op.lhs, align 8
  store { i64, i64, ptr } %param.load, ptr %str_op.rhs, align 8
  call void @ori_str_concat(ptr %ori_str_concat.sret, ptr %str_op.lhs, ptr %str_op.rhs)
  %ori_str_concat = load { i64, i64, ptr }, ptr %ori_str_concat.sret, align 8
  %rc_dec.fat_data = extractvalue { i64, i64, ptr } %v102, 2
  %rc_dec.p2i = ptrtoint ptr %rc_dec.fat_data to i64
  %rc_dec.sso_flag = and i64 %rc_dec.p2i, -9223372036854775808
  %rc_dec.is_sso = icmp ne i64 %rc_dec.sso_flag, 0
  %rc_dec.is_null = icmp eq i64 %rc_dec.p2i, 0
  %rc_dec.skip_rc = or i1 %rc_dec.is_sso, %rc_dec.is_null
  br i1 %rc_dec.skip_rc, label %rc_dec.sso_skip, label %rc_dec.heap

bb3:
  store { i64, i64, ptr } %v102, ptr %0, align 8
  ret void

rc_dec.heap:
  call void @ori_rc_dec(ptr %rc_dec.fat_data, ptr @"_ori_drop$3")
  br label %rc_dec.sso_skip

rc_dec.sso_skip:
  %add = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %v91, i64 %proj.2)
  %add.val = extractvalue { i64, i1 } %add, 0
  %add.ovf = extractvalue { i64, i1 } %add, 1
  br i1 %add.ovf, label %add.ovf_panic, label %bb1

add.ovf_panic:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable
}

; --- @build_sequence ---
define fastcc void @_ori_build_sequence(ptr noalias sret({ i64, i64, ptr }) %0, i64 noundef %1) #0 {
bb0:
  %ori_str_concat.sret = alloca { i64, i64, ptr }, align 8
  %str_op.rhs = alloca { i64, i64, ptr }, align 8
  %str_op.lhs = alloca { i64, i64, ptr }, align 8
  %to_str.sret = alloca { i64, i64, ptr }, align 8
  call void @ori_str_empty(ptr %0)
  %sret.load = load { i64, i64, ptr }, ptr %0, align 8
  %ctor.1 = insertvalue { i64, i64, i64, i64 } { i64 0, i64 undef, i64 undef, i64 undef }, i64 %1, 1
  %ctor.2 = insertvalue { i64, i64, i64, i64 } %ctor.1, i64 1, 2
  %ctor.3 = insertvalue { i64, i64, i64, i64 } %ctor.2, i64 0, 3
  %proj.0 = extractvalue { i64, i64, i64, i64 } %ctor.3, 0
  %proj.1 = extractvalue { i64, i64, i64, i64 } %ctor.3, 1
  %proj.2 = extractvalue { i64, i64, i64, i64 } %ctor.3, 2
  br label %bb1

bb1:
  %v89 = phi i64 [ %proj.0, %bb0 ], [ %add.val, %rc_dec.sso_skip3 ]
  %v910 = phi { i64, i64, ptr } [ %sret.load, %bb0 ], [ %2, %rc_dec.sso_skip3 ]
  %lt = icmp slt i64 %v89, %proj.1
  br i1 %lt, label %bb2, label %bb3

bb2:
  call void @ori_str_from_int(ptr %to_str.sret, i64 %v89)
  %to_str = load { i64, i64, ptr }, ptr %to_str.sret, align 8
  store { i64, i64, ptr } %v910, ptr %str_op.lhs, align 8
  store { i64, i64, ptr } %to_str, ptr %str_op.rhs, align 8
  call void @ori_str_concat(ptr %ori_str_concat.sret, ptr %str_op.lhs, ptr %str_op.rhs)
  %2 = load { i64, i64, ptr }, ptr %ori_str_concat.sret, align 8
  %3 = extractvalue { i64, i64, ptr } %v910, 2
  %4 = ptrtoint ptr %3 to i64
  %5 = and i64 %4, -9223372036854775808
  %6 = icmp ne i64 %5, 0
  %7 = icmp eq i64 %4, 0
  %8 = or i1 %6, %7
  br i1 %8, label %rc_dec.sso_skip, label %rc_dec.heap

bb3:
  store { i64, i64, ptr } %v910, ptr %0, align 8
  ret void

rc_dec.heap:
  call void @ori_rc_dec(ptr %3, ptr @"_ori_drop$3")
  br label %rc_dec.sso_skip

rc_dec.sso_skip:
  %rc_dec.fat_data1 = extractvalue { i64, i64, ptr } %to_str, 2
  %rc_dec.p2i4 = ptrtoint ptr %rc_dec.fat_data1 to i64
  %rc_dec.sso_flag5 = and i64 %rc_dec.p2i4, -9223372036854775808
  %rc_dec.is_sso6 = icmp ne i64 %rc_dec.sso_flag5, 0
  %rc_dec.is_null7 = icmp eq i64 %rc_dec.p2i4, 0
  %rc_dec.skip_rc8 = or i1 %rc_dec.is_sso6, %rc_dec.is_null7
  br i1 %rc_dec.skip_rc8, label %rc_dec.sso_skip3, label %rc_dec.heap2

rc_dec.heap2:
  call void @ori_rc_dec(ptr %rc_dec.fat_data1, ptr @"_ori_drop$3")
  br label %rc_dec.sso_skip3

rc_dec.sso_skip3:
  %add = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %v89, i64 %proj.2)
  %add.val = extractvalue { i64, i1 } %add, 0
  %add.ovf = extractvalue { i64, i1 } %add, 1
  br i1 %add.ovf, label %add.ovf_panic, label %bb1

add.ovf_panic:
  call void @ori_panic_cstr(ptr @ovf.msg)
  unreachable
}

; --- @build_with_separator ---
define fastcc void @_ori_build_with_separator(ptr noalias sret({ i64, i64, ptr }) %0, ptr noundef nonnull dereferenceable(24) %1, ptr noundef nonnull readonly dereferenceable(24) %2) #0 {
bb0:
  %ori_str_concat.sret4 = alloca { i64, i64, ptr }, align 8
  %str_op.rhs3 = alloca { i64, i64, ptr }, align 8
  %str_op.lhs2 = alloca { i64, i64, ptr }, align 8
  %ori_str_concat.sret = alloca { i64, i64, ptr }, align 8
  %str_op.rhs = alloca { i64, i64, ptr }, align 8
  %str_op.lhs = alloca { i64, i64, ptr }, align 8
  %iter_next.scratch = alloca { i64, i64, ptr }, align 8
  %param.load = load { i64, i64, ptr }, ptr %1, align 8
  %param.load1 = load { i64, i64, ptr }, ptr %2, align 8
  call void @ori_str_empty(ptr %0)
  %sret.load = load { i64, i64, ptr }, ptr %0, align 8
  %rc_inc.data = extractvalue { i64, i64, ptr } %param.load, 2
  %rc_inc.cap = extractvalue { i64, i64, ptr } %param.load, 1
  call void @ori_list_rc_inc(ptr %rc_inc.data, i64 %rc_inc.cap)
  %list.data = extractvalue { i64, i64, ptr } %param.load, 2
  %list.len = extractvalue { i64, i64, ptr } %param.load, 0
  %list.cap = extractvalue { i64, i64, ptr } %param.load, 1
  %list.iter = call ptr @ori_iter_from_list(ptr %list.data, i64 %list.len, i64 %list.cap, i64 24, ptr @"_ori_elem_dec$3")
  br label %bb1

bb1:
  %v816 = phi { i64, i64, ptr } [ %sret.load, %bb0 ], [ %ori_str_concat5, %bb6 ], [ %ori_str_concat5, %rc_dec.heap7 ]
  %v917 = phi i1 [ true, %bb0 ], [ false, %bb6 ], [ false, %rc_dec.heap7 ]
  %iter_next.has = call i8 @ori_iter_next(ptr %list.iter, ptr %iter_next.scratch, i64 24)
  %iter_next.tag = zext i8 %iter_next.has to i64
  %ne = icmp ne i64 %iter_next.tag, 0
  br i1 %ne, label %bb2, label %bb3

bb2:
  %proj.1 = load { i64, i64, ptr }, ptr %iter_next.scratch, align 8
  %not = xor i1 %v917, true
  br i1 %not, label %bb4, label %bb6

bb3:
  call void @ori_iter_drop(ptr %list.iter)
  store { i64, i64, ptr } %v816, ptr %0, align 8
  ret void

bb4:
  store { i64, i64, ptr } %v816, ptr %str_op.lhs, align 8
  store { i64, i64, ptr } %param.load1, ptr %str_op.rhs, align 8
  call void @ori_str_concat(ptr %ori_str_concat.sret, ptr %str_op.lhs, ptr %str_op.rhs)
  %ori_str_concat = load { i64, i64, ptr }, ptr %ori_str_concat.sret, align 8
  %rc_dec.fat_data = extractvalue { i64, i64, ptr } %v816, 2
  %rc_dec.p2i = ptrtoint ptr %rc_dec.fat_data to i64
  %rc_dec.sso_flag = and i64 %rc_dec.p2i, -9223372036854775808
  %rc_dec.is_sso = icmp ne i64 %rc_dec.sso_flag, 0
  %rc_dec.is_null = icmp eq i64 %rc_dec.p2i, 0
  %rc_dec.skip_rc = or i1 %rc_dec.is_sso, %rc_dec.is_null
  br i1 %rc_dec.skip_rc, label %bb6, label %rc_dec.heap

bb6:
  %v301415 = phi { i64, i64, ptr } [ %v816, %bb2 ], [ %ori_str_concat, %bb4 ], [ %ori_str_concat, %rc_dec.heap ]
  store { i64, i64, ptr } %v301415, ptr %str_op.lhs2, align 8
  store { i64, i64, ptr } %proj.1, ptr %str_op.rhs3, align 8
  call void @ori_str_concat(ptr %ori_str_concat.sret4, ptr %str_op.lhs2, ptr %str_op.rhs3)
  %ori_str_concat5 = load { i64, i64, ptr }, ptr %ori_str_concat.sret4, align 8
  %rc_dec.fat_data6 = extractvalue { i64, i64, ptr } %v301415, 2
  %rc_dec.p2i9 = ptrtoint ptr %rc_dec.fat_data6 to i64
  %rc_dec.sso_flag10 = and i64 %rc_dec.p2i9, -9223372036854775808
  %rc_dec.is_sso11 = icmp ne i64 %rc_dec.sso_flag10, 0
  %rc_dec.is_null12 = icmp eq i64 %rc_dec.p2i9, 0
  %rc_dec.skip_rc13 = or i1 %rc_dec.is_sso11, %rc_dec.is_null12
  br i1 %rc_dec.skip_rc13, label %bb1, label %rc_dec.heap7

rc_dec.heap:
  call void @ori_rc_dec(ptr %rc_dec.fat_data, ptr @"_ori_drop$3")
  br label %bb6

rc_dec.heap7:
  call void @ori_rc_dec(ptr %rc_dec.fat_data6, ptr @"_ori_drop$3")
  br label %bb1
}

; --- @main ---
define noundef i64 @_ori_main() #0 {
bb0:
  %str_len.self45 = alloca { i64, i64, ptr }, align 8
  %str_len.self35 = alloca { i64, i64, ptr }, align 8
  %str_len.self = alloca { i64, i64, ptr }, align 8
  %sret.tmp17 = alloca { i64, i64, ptr }, align 8
  %ref_arg16 = alloca { i64, i64, ptr }, align 8
  %ref_arg15 = alloca { i64, i64, ptr }, align 8
  %sret.tmp13 = alloca { i64, i64, ptr }, align 8
  %sret.tmp9 = alloca { i64, i64, ptr }, align 8
  %sret.tmp7 = alloca { i64, i64, ptr }, align 8
  %sret.tmp5 = alloca { i64, i64, ptr }, align 8
  %sret.tmp3 = alloca { i64, i64, ptr }, align 8
  %sret.tmp1 = alloca { i64, i64, ptr }, align 8
  %ref_arg = alloca { i64, i64, ptr }, align 8
  %sret.tmp = alloca { i64, i64, ptr }, align 8
  call void @ori_str_from_raw(ptr %sret.tmp, ptr @str, i64 1)
  %sret.load = load { i64, i64, ptr }, ptr %sret.tmp, align 8
  store { i64, i64, ptr } %sret.load, ptr %ref_arg, align 8
  call fastcc void @_ori_build_repeated(ptr %sret.tmp1, i64 30, ptr %ref_arg)
  %0 = load { i64, i64, ptr }, ptr %sret.tmp1, align 8
  ; RC dec for "x" literal (SSO-aware)
  %1 = extractvalue { i64, i64, ptr } %sret.load, 2
  ; ... SSO check + conditional ori_rc_dec ...
  call fastcc void @_ori_build_sequence(ptr %sret.tmp3, i64 15)
  %7 = load { i64, i64, ptr }, ptr %sret.tmp3, align 8
  ; Build list ["hello", "world", "ori"]
  call void @ori_str_from_raw(ptr %sret.tmp5, ptr @str.1, i64 5)
  call void @ori_str_from_raw(ptr %sret.tmp7, ptr @str.2, i64 5)
  call void @ori_str_from_raw(ptr %sret.tmp9, ptr @str.3, i64 3)
  %11 = call ptr @ori_list_alloc_data(i64 3, i64 24)
  ; Store 3 str elements into list buffer via GEP
  ; ... element stores ...
  %15 = insertvalue { i64, i64, ptr } { i64 3, i64 3, ptr undef }, ptr %11, 2
  call void @ori_str_from_raw(ptr %sret.tmp13, ptr @str.4, i64 2)
  ; Call build_with_separator
  call fastcc void @_ori_build_with_separator(ptr %sret.tmp17, ptr %ref_arg15, ptr %ref_arg16)
  ; Cleanup: list buffer, separator literal, three result strings
  call void @ori_buffer_drop_unique(ptr %18, i64 %19, i64 %20, i64 24, ptr @"_ori_elem_dec$3")
  ; ... SSO-aware rc_dec for separator ...
  ; Call ori_str_len on each result, sum with overflow checking
  store { i64, i64, ptr } %0, ptr %str_len.self, align 8
  %str.len = call i64 @ori_str_len(ptr %str_len.self)
  ; ... rc_dec for a ...
  %str.len36 = call i64 @ori_str_len(ptr %str_len.self35)
  %33 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %str.len, i64 %str.len36)
  ; ... overflow check ...
  %str.len46 = call i64 @ori_str_len(ptr %str_len.self45)
  %36 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %34, i64 %str.len46)
  ; ... overflow check + rc_dec for c ...
  ret i64 %37
}

; --- Helper functions ---
define void @"_ori_drop$3"(ptr noundef %0) #2 {
entry:
  call void @ori_rc_free(ptr %0, i64 24, i64 8)
  ret void
}

define void @"_ori_elem_dec$3"(ptr noundef %0) #2 {
entry:
  %elem = load { i64, i64, ptr }, ptr %0, align 8
  %rc_dec.data = extractvalue { i64, i64, ptr } %elem, 2
  ; SSO-aware check on element's data pointer
  ; ... conditional ori_rc_dec ...
  ret void
}

; Entry point wrapper
define noundef i32 @main() #0 {
entry:
  %ori_main_result = call i64 @_ori_main()
  %exit_code = trunc i64 %ori_main_result to i32
  %leak_check = call i32 @ori_check_leaks()
  %has_leak = icmp ne i32 %leak_check, 0
  %final_exit = select i1 %has_leak, i32 %leak_check, i32 %exit_code
  ret i32 %final_exit
}

Disassembly

; @build_repeated: 524 bytes (loop: str_empty + str_concat + SSO-aware rc_dec + overflow-checked increment)
_ori_build_repeated:
   sub    $0xe8,%rsp
   ; Load param c (fat pointer: 3 words)
   mov    0x10(%rdx),%rax    ; c.data
   mov    (%rdx),%rax        ; c.len
   mov    0x8(%rdx),%rax     ; c.cap
   call   ori_str_empty      ; result = ""
   ; Range init: start=0, end=n, step=1
   xor    %esi,%esi          ; current = 0
   mov    $0x1,%edi          ; step = 1
   jmp    .loop_header
.loop_header:
   cmp    %rcx,%rax          ; current < n?
   jge    .exit
   ; str_concat(result, c) via stack-passed fat pointers
   lea    str_op.lhs,%rdi
   lea    str_op.rhs,%rsi
   call   ori_str_concat
   ; SSO-aware rc_dec of old result
   movabs $0x8000000000000000,%rdx
   and    %rdx,%rax          ; check SSO flag (bit 63)
   setne  %al
   cmp    $0x0,%rcx          ; check null
   sete   %cl
   or     %cl,%al
   test   $0x1,%al
   jne    .skip_rc
   call   ori_rc_dec          ; dec old result
.skip_rc:
   add    %rdi,%rsi          ; current += step
   jno    .loop_header        ; overflow check
   call   ori_panic_cstr      ; overflow panic
.exit:
   ; Store result to sret pointer
   ret

; @build_sequence: 558 bytes (same pattern + str_from_int for each i)
_ori_build_sequence:
   ; Same loop structure as build_repeated
   ; Additional: call ori_str_from_int to convert i to string
   ; Two SSO-aware rc_dec per iteration (old result + str(i) temp)

; @build_with_separator: 1115 bytes (iterator loop + conditional separator)
_ori_build_with_separator:
   sub    $0x1c8,%rsp
   ; Load items (fat pointer) and sep (fat pointer)
   call   ori_str_empty
   call   ori_list_rc_inc     ; inc list for iterator
   call   ori_iter_from_list  ; create iterator
.iter_loop:
   call   ori_iter_next       ; get next element
   movzbl %al,%eax
   cmp    $0x0,%rax           ; has element?
   je     .iter_done
   ; Check !first flag
   test   $0x1,%sil
   je     .concat_sep         ; if !first, concat separator
   jmp    .concat_item
.concat_sep:
   call   ori_str_concat      ; result = result + sep
   ; SSO-aware rc_dec of old result
.concat_item:
   call   ori_str_concat      ; result = result + item
   ; SSO-aware rc_dec of old result
   ; Set first = false, loop back
   jmp    .iter_loop
.iter_done:
   call   ori_iter_drop
   ret

; @main: 1348 bytes (orchestration + cleanup)
_ori_main:
   push   %r14
   push   %rbx
   sub    $0x228,%rsp
   ; Create "x", call build_repeated(30, "x"), rc_dec "x"
   call   ori_str_from_raw
   call   _ori_build_repeated
   ; Call build_sequence(15)
   call   _ori_build_sequence
   ; Create ["hello", "world", "ori"]: 3x ori_str_from_raw + ori_list_alloc_data + GEP stores
   call   ori_str_from_raw    ; "hello"
   call   ori_str_from_raw    ; "world"
   call   ori_str_from_raw    ; "ori"
   call   ori_list_alloc_data ; allocate 3-element list buffer
   ; Store elements via direct stores to list data
   ; Create ", ", call build_with_separator
   call   ori_str_from_raw    ; ", "
   call   _ori_build_with_separator
   ; Cleanup list: ori_buffer_drop_unique with elem_dec
   call   ori_buffer_drop_unique
   ; rc_dec separator (SSO check)
   ; Three ori_str_len calls + two overflow-checked additions
   call   ori_str_len         ; a.length()
   call   ori_str_len         ; b.length()
   add    %rcx,%rax           ; a.len + b.len (overflow checked)
   call   ori_str_len         ; c.length()
   add    %rcx,%rax           ; + c.len (overflow checked)
   ; rc_dec for all three result strings (SSO-aware)
   add    $0x228,%rsp
   pop    %rbx
   pop    %r14
   ret

Deep Scrutiny

1. Instruction Purity

#FunctionActualIdealRatioVerdict
1@build_repeated38381.00xOPTIMAL
2@build_sequence49491.00xOPTIMAL
3@build_with_separator58581.00xOPTIMAL
4@main1091091.00xOPTIMAL

Every instruction serves a purpose: string operations require fat pointer manipulation (3 words each), SSO-aware RC cleanup requires the 6-instruction check pattern, overflow checking on additions is mandatory, and the iterator protocol (ori_iter_next + ori_iter_drop) is inherently required for list traversal. The range construction insertvalue/extractvalue sequence (7 instructions in build_repeated and build_sequence) is verbose but functionally necessary to construct the loop control struct.

2. ARC Purity

Functionrc_incrc_decBalancedBorrow ElisionMove Semantics
@build_repeated01Transferc: borrowed (readonly)sret out
@build_sequence02TransferN/Asret out
@build_with_separator12Transfersep: borrowed (readonly)sret out
@main06YESN/AN/A
@_ori_drop$300YESN/Aori_rc_free
@_ori_elem_dec$301YESN/AElement cleanup

Verdict: All functions correctly manage RC. The three builder functions are ownership transfers — they produce a new string via sret and the caller receives ownership. The caller (@main) correctly decrements all six temporaries (three result strings + “x” literal + ”, ” literal + list buffer via drop_unique). Borrow elision is correctly applied: c in build_repeated and sep in build_with_separator are readonly dereferenceable(24) — no RC inc/dec pair needed for borrowed parameters. The list gets rc_inc’d once for iterator creation, which is correct since the iterator borrows the underlying data.

3. Attributes & Calling Convention

FunctionfastccnounwindnoaliasreadonlysretcoldNotes
@build_repeatedYESYESYES (sret)YES (c param)YESNO
@build_sequenceYESYESYES (sret)N/AYESNO
@build_with_separatorYESYESYES (sret)YES (sep param)YESNO
@main (_ori_main)CYESN/AN/AN/ANOC-calling for entry
@_ori_drop$3CYESN/AN/AN/AYEScold
@_ori_elem_dec$3CYESN/AN/AN/AYEScold
@main (entry)CYESN/AN/AN/ANO

Verdict: 23/23 applicable attribute checks pass. All user functions have nounwind uwtable. Builder functions correctly use fastcc with noalias sret. Borrow parameters correctly have readonly dereferenceable(24). Drop functions are correctly marked cold. The items parameter in build_with_separator is dereferenceable(24) but NOT readonly — this is correct because the list data gets rc_inc’d (mutating the refcount) before iterator creation.

4. Control Flow & Block Layout

FunctionBlocksEmpty BlocksRedundant BranchesPhi NodesNotes
@build_repeated7002Loop header + body + exit + SSO branch + panic
@build_sequence9002Same + extra SSO check for str(i) temp
@build_with_separator8002Iterator loop + two concat paths + SSO branches
@main15000Linear with SSO-conditional branches

Verdict: 0 defects. Block layout is clean and purposeful. Phi nodes in the loop headers correctly carry the mutable accumulator state (result string fat pointer and loop counter/first flag). The SSO-conditional branches (rc_dec.sso_skip / rc_dec.heap) are the standard 2-block pattern for SSO-aware RC cleanup — one block for the heap path (call rc_dec), one for the join point. Build_with_separator has a clean 3-way phi merge in bb6: from bb2 (first iteration, no separator), from bb4 (after separator concat), and from rc_dec.heap (after separator concat with heap string). No empty blocks, no redundant jumps.

5. Overflow Checking

Status: PASS

OperationCheckedCorrectNotes
Range step (i += 1)YESYESllvm.sadd.with.overflow.i64 in build_repeated, build_sequence
Length sum (a.len + b.len)YESYESllvm.sadd.with.overflow.i64 in main
Length sum (partial + c.len)YESYESllvm.sadd.with.overflow.i64 in main

All integer additions use the llvm.sadd.with.overflow.i64 intrinsic with branch to ori_panic_cstr on overflow. The unreachable after the panic call is correct since ori_panic_cstr is noreturn.

6. Binary Analysis

MetricValue
Binary size6.39 MiB (debug)
.text section911 KiB
.rodata section134 KiB
User code3,659 bytes
Runtime~99.7% of .text

Disassembly sizes

FunctionSize (bytes)Notes
@build_repeated524Loop with 1 SSO check
@build_sequence558Loop with 2 SSO checks + str_from_int
@build_with_separator1,115Iterator loop, conditional concat, 2 SSO checks
@main1,348Orchestration: 4 str_from_raw + list alloc + 3 builder calls + cleanup
@_ori_drop$317Trivial: push + call ori_rc_free + pop + ret
@_ori_elem_dec$369Load element + SSO check + conditional rc_dec
main (entry)28Call _ori_main + ori_check_leaks + select

The debug binary size (6.39 MiB) is dominated by Rust runtime and debug info. User code is 3,659 bytes — reasonable for 4 functions with heavy string manipulation. The SSO check pattern compiles to ~30 bytes of x86 each time (movabs + and + cmp + setne + cmp + sete + or + test + jne), which accounts for most of the code size. No defects detected.

7. Optimal IR Comparison

@build_repeated: Ideal vs Actual

; IDEAL (38 instructions)
; A string builder loop needs: empty init, range setup, loop header with phi,
; concat call, SSO-aware RC dec of old value, overflow-checked increment, exit.
; All 38 instructions are justified:
;   - 3 alloca (sret scratch spaces for concat)
;   - 1 param load (c)
;   - 1 call ori_str_empty
;   - 1 load sret
;   - 4 insertvalue + 3 extractvalue (range construction)
;   - 1 br (to loop)
;   - 2 phi + 1 icmp + 1 br (loop header)
;   - 2 store + 1 call + 1 load (concat)
;   - 6 instructions (SSO check) + 1 br
;   - 1 store + 1 ret (exit)
;   - 1 call + 1 br (rc_dec heap path)
;   - 1 call + 2 extractvalue + 1 br (overflow check + loop back)
;   - 1 call + 1 unreachable (panic path)
define fastcc void @_ori_build_repeated(ptr noalias sret({ i64, i64, ptr }) %0, i64 noundef %1, ptr noundef nonnull readonly dereferenceable(24) %2) nounwind {
  ; ... exactly as emitted ...
}

Delta: +0 instructions. OPTIMAL.

@build_sequence: Ideal vs Actual

; IDEAL (49 instructions)
; Same as build_repeated plus: str_from_int call, second SSO-aware RC dec for temp.
; Additional: 1 call str_from_int + 1 load, and 6+1+2 for second SSO check path.
define fastcc void @_ori_build_sequence(ptr noalias sret({ i64, i64, ptr }) %0, i64 noundef %1) nounwind {
  ; ... exactly as emitted ...
}

Delta: +0 instructions. OPTIMAL.

@build_with_separator: Ideal vs Actual

; IDEAL (58 instructions)
; Iterator-based loop with conditional separator: iter_from_list, iter_next loop,
; branch on !first flag, two str_concat paths, two SSO-aware RC dec paths,
; iter_drop on exit. The phi node in bb6 merges 3 incoming edges cleanly.
define fastcc void @_ori_build_with_separator(ptr noalias sret({ i64, i64, ptr }) %0, ptr noundef nonnull dereferenceable(24) %1, ptr noundef nonnull readonly dereferenceable(24) %2) nounwind {
  ; ... exactly as emitted ...
}

Delta: +0 instructions. OPTIMAL.

@main: Ideal vs Actual

; IDEAL (109 instructions)
; Main orchestrates: 14 alloca, create "x" string, call build_repeated with sret,
; SSO-aware dec of "x", call build_sequence with sret, create 3 string literals
; for list, allocate list buffer, store elements via GEP, create ", " string,
; call build_with_separator with sret, drop list buffer with elem_dec,
; SSO-aware dec of ", ", call ori_str_len 3 times with SSO-aware dec of each
; result string, two overflow-checked additions, return sum.
define noundef i64 @_ori_main() nounwind {
  ; ... exactly as emitted ...
}

Delta: +0 instructions. OPTIMAL.

Module Summary

FunctionIdealActualDeltaJustifiedVerdict
@build_repeated3838+0N/AOPTIMAL
@build_sequence4949+0N/AOPTIMAL
@build_with_separator5858+0N/AOPTIMAL
@main109109+0N/AOPTIMAL

8. Strings: SSO-to-Heap Promotion

The journey exercises the SSO (Small String Optimization) to heap promotion path. Strings <= 23 bytes are stored inline in the 24-byte {i64, i64, ptr} fat pointer structure. The SSO flag is encoded in bit 63 of the ptr field — when set, the pointer field stores inline character data rather than a heap address.

Build_repeated: Starts with "" (SSO), concatenates “x” 30 times. The first 23 iterations produce SSO strings (no heap allocation). At iteration 24, the string exceeds 23 bytes and promotes to heap-allocated. From iteration 24 onward, each concat allocates a new heap buffer and the old one gets RC-decremented. This is correct behavior — the SSO check ensures no RC operations on inline strings.

Build_sequence: Starts with "", concatenates “0”, “1”, …, “14”. The string “01234567891011121314” is 20 characters, which fits in SSO. So this function likely never promotes to heap, making all RC dec operations no-ops (skipped by the SSO check). The str(i) temporaries for small numbers are also SSO.

Build_with_separator: Result “hello, world, ori” is 17 characters, fits in SSO. However, the individual element strings (“hello”=5, “world”=5, “ori”=3) and separator (”, “=2) are all SSO. The entire operation stays in SSO territory.

The SSO check pattern in the IR (ptrtoint + and + icmp ne + icmp eq + or + br) correctly handles all three cases: SSO (bit 63 set, skip RC), null pointer (skip RC), and heap pointer (call rc_dec). This 6-instruction pattern appears 1 time in build_repeated, 2 times in build_sequence, 2 times in build_with_separator, and 6 times in main — 11 total occurrences, all justified.

9. Strings: Iterator Protocol for List

The build_with_separator function demonstrates the iterator protocol for [str]:

  1. List RC inc: ori_list_rc_inc(data, cap) — increments the list buffer’s refcount so the iterator can safely traverse it while the caller might still hold a reference.

  2. Iterator creation: ori_iter_from_list(data, len, cap, 24, _ori_elem_dec$3) — creates a heap-allocated iterator state. The element size is 24 (3 x i64 for the str fat pointer), and the element destructor _ori_elem_dec$3 handles SSO-aware cleanup of individual string elements.

  3. Iteration: ori_iter_next(iter, scratch, 24) — copies the next element into the scratch buffer (24 bytes for a str fat pointer) and returns whether an element was available.

  4. Cleanup: ori_iter_drop(iter) — drops the iterator, which decrements the underlying list buffer’s refcount and cleans up any remaining unconsumed elements via the elem_dec callback.

The element destructor _ori_elem_dec$3 is correctly generated: it loads the str fat pointer from the element slot, extracts the data pointer, performs the SSO check, and conditionally calls ori_rc_dec with the str drop function. This ensures that heap-allocated string elements in the list are properly freed when the iterator is done or the list is cleaned up.

10. Strings: Mutable Accumulator Pattern

The mutable accumulator pattern (let result = ""; ... result = result + x; result) compiles to an elegant phi-node loop:

bb1:
  %v91 = phi i64 [ %proj.0, %bb0 ], [ %add.val, %rc_dec.sso_skip ]
  %v102 = phi { i64, i64, ptr } [ %sret.load, %bb0 ], [ %ori_str_concat, %rc_dec.sso_skip ]

The loop carries two phi values: the loop counter (i64) and the accumulator string (fat pointer struct). On each iteration, the old accumulator is passed to ori_str_concat to produce a new concatenated string, then the old accumulator’s RC is decremented (SSO-aware). The new string becomes the phi value for the next iteration.

This pattern is critical for string builder correctness: the old string must be RC-decremented AFTER the new string is produced (since concat may internally reference the old string’s data), and BEFORE the loop counter is incremented. The IR correctly sequences these operations: concat -> SSO check -> conditional rc_dec -> overflow-checked add -> loop back.

Findings

#SeverityCategoryDescriptionStatusFirst Seen
1NOTEARCCorrect borrow elision on readonly str parametersNEWJ18
2NOTEARCCorrect SSO-aware RC dec pattern (11 occurrences, all justified)CONFIRMEDJ9
3NOTEControl FlowClean phi-node accumulator pattern for mutable loop stateNEWJ18
4NOTEAttributes100% attribute compliance (23/23 checks)NEWJ18
5NOTEARCCorrect list RC inc for iterator creation + elem_dec cleanupCONFIRMEDJ10

NOTE-1: Correct borrow elision on readonly str parameters

Location: @build_repeated param c, @build_with_separator param sep Impact: Positive — avoids unnecessary rc_inc/rc_dec pair on borrowed parameters Found in: ARC Purity (Category 2)

The compiler correctly identifies that c in build_repeated and sep in build_with_separator are only read (used as RHS of concatenation), not consumed. They receive readonly dereferenceable(24) attributes and no RC operations are emitted for them. The items parameter in build_with_separator is NOT marked readonly because it undergoes ori_list_rc_inc for iterator creation.

NOTE-2: Correct SSO-aware RC dec pattern

Location: All functions with str temporaries Impact: Positive — inline strings (<=23 bytes) skip RC operations entirely Found in: ARC Purity (Category 2), Strings: SSO-to-Heap Promotion (Category 8)

NOTE-3: Clean phi-node accumulator pattern

Location: Loop headers in build_repeated, build_sequence, build_with_separator Impact: Positive — mutable accumulator state carried efficiently via SSA phi nodes Found in: Control Flow (Category 4), Strings: Mutable Accumulator Pattern (Category 10)

NOTE-4: Full attribute compliance

Location: All function declarations Impact: Positive — all applicable attributes present and correct Found in: Attributes & Calling Convention (Category 3)

NOTE-5: Correct list RC inc for iterator creation

Location: @build_with_separator, list parameter handling Impact: Positive — list buffer correctly RC inc’d before iterator creation, cleaned up via iter_drop Found in: ARC Purity (Category 2), Strings: Iterator Protocol (Category 9)

Codegen Quality Score

CategoryWeightScoreNotes
Instruction Efficiency15%10/101.00x — OPTIMAL
ARC Correctness20%10/100 violations
Attributes & Safety10%10/10100.0% compliance
Control Flow10%10/100 defects
IR Quality20%10/100 unjustified instructions
Binary Quality10%10/100 defects
Other Findings15%10/10No uncategorized findings

Overall: 10.0 / 10

Verdict

Journey 18’s string builder codegen is optimal. The compiler correctly handles three distinct string accumulation patterns: simple repetition (build_repeated), int-to-string conversion in loops (build_sequence), and conditional separator logic with list iteration (build_with_separator). All ARC operations are correct with proper borrow elision on readonly parameters, SSO-aware RC decrements that skip inline strings, and correct list buffer RC management for iterator creation. The phi-node accumulator pattern efficiently carries mutable string state through loop iterations. Full attribute compliance with nounwind, fastcc, noalias sret, and readonly where applicable. Zero defects across all categories.

Cross-Journey Observations

FeatureFirst TestedThis JourneyStatus
SSO-aware RC decJ9J18CONFIRMED
Fat pointer str passingJ14J18CONFIRMED
List element RC (elem_dec)J10J18CONFIRMED
For-loop phi nodesJ7J18CONFIRMED
Overflow checkingJ1J18CONFIRMED
fastcc usageJ1J18CONFIRMED
nounwind on all user functionsJ14J18CONFIRMED
Borrow elision (readonly params)J9J18CONFIRMED

This journey confirms that the SSO-aware string ARC patterns identified in J9 and the fat pointer handling from J14 compose correctly with loop accumulation (J7) and list iteration (J10). The string builder pattern exercises all these features simultaneously — a strong integration test for the codegen pipeline.