Section 06: Struct & Param Codegen
Context: J4 showed that area(r: Rect) loads all 4 fields of Rect (including nested Point.x and Point.y) just to access width and height — 17 instructions instead of 4. J10 showed that the iterator loop builds a { tag, value } tuple every iteration just to immediately destructure it.
06.1 Fix M6 — Lazy Struct Load for Partial Field Access
Journey: J4 (confirmed J4, J10) | Severity: MEDIUM
File(s): compiler/ori_llvm/src/codegen/function_compiler/ (load_indirect_param)
The load_indirect_param pattern always loads the entire struct into an SSA aggregate, then uses extractvalue to access fields. At the point where the load is emitted, codegen does not yet know which fields the function body will access.
Fix approach: Load fields on demand. When a field access (extractvalue) is encountered, emit a GEP+load for just that field from the pointer, not the entire struct.
Trade-off: This requires changing from “load once, extract many” to “GEP+load per access.” For functions that access ALL fields, this is slightly worse (more GEP instructions). For functions that access few fields of large structs, it’s much better.
- Identify the load_indirect_param implementation
- Option A: Keep current approach but note it as acceptable (LLVM optimizes away unused loads). CONFIRMED
- Option B: Load fields lazily, emitting GEP+load at each extractvalue site (not needed)
- Evaluate: does LLVM’s dead load elimination already handle this? YES: O2 eliminates unused field loads entirely
- If LLVM handles it: mark as LOW priority. CONFIRMED: no codegen change needed
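The reason Option A is acceptable can be illustrated with a toy dead-code elimination pass. This is a minimal sketch and not LLVM's implementation: the Inst type, the eliminate_dead function, and the area_example instruction list are all hypothetical, and the example assumes the aggregate load has already been split into one load per field. The source confirms only the end result at O2, not which passes produce it.

```rust
/// A toy SSA instruction. `operands` holds indices of earlier instructions.
/// All names in this sketch are hypothetical, not part of ori_llvm or LLVM.
#[derive(Debug, Clone)]
pub struct Inst {
    pub name: &'static str,
    pub operands: Vec<usize>,
    /// Roots (for example `ret`) are always kept.
    pub is_root: bool,
}

fn inst(name: &'static str, operands: &[usize], is_root: bool) -> Inst {
    Inst { name, operands: operands.to_vec(), is_root }
}

/// Returns the names of the instructions that survive elimination.
pub fn eliminate_dead(insts: &[Inst]) -> Vec<&'static str> {
    let mut live = vec![false; insts.len()];
    // Operands always refer to earlier instructions, so one reverse pass
    // visits every user before the values it depends on.
    for i in (0..insts.len()).rev() {
        if insts[i].is_root {
            live[i] = true;
        }
        if live[i] {
            for &op in &insts[i].operands {
                live[op] = true;
            }
        }
    }
    let mut kept = Vec::new();
    for (instruction, &is_live) in insts.iter().zip(&live) {
        if is_live {
            kept.push(instruction.name);
        }
    }
    kept
}

/// Models area(r: Rect): four field loads, of which only two feed the result.
pub fn area_example() -> Vec<Inst> {
    vec![
        inst("load origin.x", &[], false), // 0: never used
        inst("load origin.y", &[], false), // 1: never used
        inst("load width", &[], false),    // 2
        inst("load height", &[], false),   // 3
        inst("mul", &[2, 3], false),       // 4: width * height
        inst("ret", &[4], true),           // 5: root
    ]
}
```

Calling eliminate_dead(&area_example()) keeps only the width and height loads, the multiply, and the return. The two Point loads have no users and are dropped, which is the effect the checklist attributes to O2.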
06.2 Fix M13 — Eliminate Unnecessary Option Tuple in Iterator
Journey: J10 | Severity: MEDIUM
File(s): compiler/ori_llvm/src/codegen/ (for..in codegen)
The iterator loop builds an Option-like { i64, i64 } tuple on every iteration:
; Current — builds tuple, then immediately destructures:
%iter_next.has = call i8 @ori_iter_next(ptr %iter, ptr %scratch, i64 8)
%iter_next.tag = zext i8 %iter_next.has to i64
%iter_next.elem = load i64, ptr %scratch, align 4
%iter_next.0 = insertvalue { i64, i64 } undef, i64 %iter_next.tag, 0
%iter_next.1 = insertvalue { i64, i64 } %iter_next.0, i64 %iter_next.elem, 1
%proj.0 = extractvalue { i64, i64 } %iter_next.1, 0 ; check tag
%ne = icmp ne i64 %proj.0, 0
; ... later:
%proj.1 = extractvalue { i64, i64 } %iter_next.1, 1 ; get element
Target:
; Direct — no intermediate tuple:
%has_next = call i8 @ori_iter_next(ptr %iter, ptr %scratch, i64 8)
%ne = icmp ne i8 %has_next, 0
; ... in loop body:
%elem = load i64, ptr %scratch, align 8 ; load only when needed
- Find where the iterator Option tuple is constructed in codegen: arc_emitter/builtins/iterator.rs:emit_iter_next()
- Replace with direct i8 check + deferred element load (not needed: LLVM handles it)
- Verify: LLVM O2 eliminates the insertvalue/extractvalue round-trip. CONFIRMED: O2 produces ideal IR
  - i8 compared directly (no zext to i64)
  - element load sunk into loop body (only when has_next)
  - No intermediate tuple materialized
- Verify: Iterator still works correctly (same total, same iteration count) — CONFIRMED via test suite
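The equivalence that the last check relies on can be sketched in plain Rust. This is an illustrative model only: RangeIter and iter_next are hypothetical stand-ins for the runtime's iterator and ori_iter_next (the real signature takes raw pointers and an element size), and the two sum functions mirror the "current" and "target" loop shapes shown in the IR listings.

```rust
/// Hypothetical stand-in for a runtime iterator over start..end.
pub struct RangeIter {
    next: i64,
    end: i64,
}

/// Hypothetical model of ori_iter_next: writes the next element into
/// `scratch` and returns 1, or returns 0 once the iterator is exhausted.
pub fn iter_next(iter: &mut RangeIter, scratch: &mut i64) -> u8 {
    if iter.next < iter.end {
        *scratch = iter.next;
        iter.next += 1;
        1
    } else {
        0
    }
}

/// Current shape: build a (tag, value) pair every iteration, then
/// immediately destructure it. Returns (total, iteration count).
pub fn sum_via_tuple(start: i64, end: i64) -> (i64, u32) {
    let mut iter = RangeIter { next: start, end };
    let mut scratch = 0i64;
    let (mut total, mut count) = (0i64, 0u32);
    loop {
        let has = iter_next(&mut iter, &mut scratch);
        let pair = (has as i64, scratch); // widen tag, read element, build pair
        let (tag, elem) = pair;           // take the pair apart again
        if tag == 0 {
            break;
        }
        total += elem;
        count += 1;
    }
    (total, count)
}

/// Target shape: branch on the returned byte directly and read the
/// element only inside the loop body.
pub fn sum_direct(start: i64, end: i64) -> (i64, u32) {
    let mut iter = RangeIter { next: start, end };
    let mut scratch = 0i64;
    let (mut total, mut count) = (0i64, 0u32);
    while iter_next(&mut iter, &mut scratch) != 0 {
        let elem = scratch; // deferred read, reached only when has_next
        total += elem;
        count += 1;
    }
    (total, count)
}
```

Both functions return the same total and iteration count for any range, including an empty one, which is the property the test suite check above is meant to confirm for the real codegen.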
06.3 Completion Checklist
- Struct field access approach decided — keep current approach, LLVM O2 eliminates unused loads
- Iterator loop tuple — LLVM O2 eliminates insertvalue/extractvalue round-trip, sinks element load into body
- ./test-all.sh green (no codegen changes needed; both deferred to LLVM passes)
- Journey 4 and Journey 10 produce correct results at O2
Exit Criteria: LLVM O2 produces ideal IR for both patterns — confirmed via opt-21 -O2 -S analysis.