Section 06: Expand Fixtures + Self-Test
Status: Not Started
Goal: The diagnostic toolkit’s self-test suite runs against only 3 basic fixtures (simple.ori, clean.ori, chain.ori). These don’t exercise closures, iterators, nested structures, generics, trait dispatch, or failure modes — the exact code patterns that cause the most debugging churn. New fixtures ensure diagnostic scripts produce correct output for the patterns they’ll actually be used to debug. The fixture suite must also cover escape closures, ? unwinding, recursive tree walks, COW sharing, large aggregates, and mixed sum types — all identified as blind spots by tp-help consensus.
Success Criteria:
- At least 13 new .ori fixture files in diagnostics/fixtures/ (14 new fixtures created)
- Each fixture exercises a distinct code pattern relevant to AOT/AIMS debugging
- Fixtures categorized as pass (exit 0, clean RC), aims-heavy (exit 0, exercises AIMS-specific paths like COW/reuse), or expected-fail (exit non-zero, validates diagnostic detection)
- self-test.sh runs new fixtures through diagnose-aot.sh, dual-exec-debug.sh, rc-stats.sh, ir-dump.sh, arc-dump.sh, and bisect-passes.sh (added by Section 05)
- bisect-passes.sh exercised on at minimum closure.ori, iterator_break.ori, and generic_mono.ori (the AIMS-relevant fixtures); all 13 pass/aims-heavy fixtures run through bisect-passes
- Self-test assertions are feature-specific: not just “non-empty output” but assertions on expected IR markers (e.g., PartialApply for closures, Switch for match, RcInc/RcDec for RC-heavy fixtures)
- Expected-fail fixtures use run_test_expect_fail with explicit exit code assertions distinguishing leak vs crash vs mismatch
- All fixtures verified under both debug and release builds (cargo b and cargo b --release)
- Satisfies mission criterion: “7+ new diagnostic fixtures covering closures, iterators, nested structures, generics, trait dispatch, and failure modes”
Context: The current 3 fixtures (simple.ori — no collections/RC; clean.ori — collections, balanced RC; chain.ori — chained COW) were adequate when the toolkit was first built. But ARC/AIMS bugs predominantly appear in closure captures, iterator early-exit cleanup, nested aggregate drops, generic instantiation, and trait method dispatch — none of which are exercised. A diagnostic regression in these areas ships behind a green self-test.
Depends on: Section 05 (bisect-passes.sh must exist for self-test integration).
README ownership: Section 07 owns the diagnostics/README.md fixtures table update (see section-07-integration.md 07.4). This section creates the fixtures and the FIXTURES.md categorization file; Section 07 integrates the final table into the user-facing README.
06.1 Create core-pattern fixtures
File(s): diagnostics/fixtures/*.ori (new files)
Each fixture must: (1) compile under AOT, (2) produce deterministic output via exit code (0 = success, 1 = logic failure), (3) exercise a specific code pattern, (4) pass both ori run and AOT binary execution with identical results. Fixture names are descriptive of the pattern, not the section number. Reference existing test files in tests/valgrind/fat_matrix/ for correct Ori syntax patterns.
Category: pass — all exit 0, balanced RC.
- closure.ori: Closure capturing a collection ([int]), calling the closure, verifying captures are alive after the call. Tests closure RC: the captured value must be inc’d on capture, dec’d on closure drop. Must also include: closure passed as function argument, closure called twice (RC balance after multiple invocations). Reference syntax: tests/valgrind/fat_matrix/f04_closure_capture.ori
- closure_escape.ori: Closures that escape their creation scope: stored in a list, passed as a parameter to another function, returned from a function, and called after the creating scope has exited. This is a GAP identified by tp-help: capture-only coverage is insufficient for RC correctness because escaping closures stress the lifetime of captured values beyond lexical scope. Reference syntax: tests/valgrind/fat_matrix/f04_closure_capture.ori (for capture patterns), tests/spec/expressions/lambdas.ori (for lambda syntax)
- iterator_break.ori: Iterate over [str] with early break, verifying the iterator and remaining elements are properly dropped. This is the #1 ARC debugging pain point. Must include: full iteration (no break), break on first element, break on middle element, continue skipping elements. Reference syntax: tests/valgrind/fat_matrix/f19_break_continue.ori
- iterator_complex.ori: Iterator patterns beyond simple break: nested for loops with fat values in both levels, for...yield with break producing partial collection, continue with guard filtering, map iteration and cleanup. tp-help identified a single iterator_break.ori as insufficient; iterator coverage must be deeper. Reference syntax: tests/valgrind/fat_matrix/f19_break_continue.ori, tests/spec/traits/iterator/for_loop.ori
- nested_list.ori: Nested [[str]] collection, exercising elem_dec_fn propagation for nested drops. Include: creating nested lists, accessing inner elements, passing nested lists to functions. Reference syntax: tests/valgrind/fat_matrix/f14_list_element.ori
- trait_dispatch.ori: Trait method call through a concrete impl Trait for Type (current compiler syntax), testing that trait dispatch codegen produces balanced RC. Include: trait with required method, trait with default method, calling trait method on a value that owns fat pointers. Note: current compiler uses impl Trait for Type syntax (not impl Type: Trait, which is approved but not yet implemented per CLAUDE.md). Reference syntax: tests/spec/traits/declaration.ori
- pattern_match.ori: Sum type with 3+ variants including mixed scalar and fat-pointer payloads (e.g., A(x: int) | B(s: str) | C(xs: [int])), exercising tag dispatch and per-variant drops. tp-help identified this as a gap: mixed scalar/ref variants stress the decision tree codegen differently than uniform variants. Reference syntax: tests/valgrind/fat_matrix/f06_pattern_matching.ori, tests/valgrind/fat_matrix/f12_sum_payload.ori
- map_iteration.ori: Map creation with string keys, iteration over entries, map lookup, verifying RC for both keys and values during iteration. Reference syntax: tests/valgrind/iter_rc/map_str_iteration.ori, tests/valgrind/iter_rc/map_str_for_do.ori (active executable map examples; NOT tests/spec/types/map_types.ori, which is a disabled TODO corpus)
- Verify each fixture: cargo run -- run <fixture> produces expected exit code, cargo run -- build <fixture> -o /tmp/test_fixture && /tmp/test_fixture produces the same exit code
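The verification step above compares the interpreter's exit code with the AOT binary's. A minimal sketch of that comparison follows. The helper names exit_code_of and same_exit_code are hypothetical and not part of the existing toolkit; the cargo invocations in the usage comment are the ones quoted in this section.

```shell
# Sketch only: exit_code_of and same_exit_code are hypothetical helper names.

# Run a command, discard its output, and print its exit code.
exit_code_of() {
    local rc=0
    "$@" >/dev/null 2>&1 || rc=$?
    echo "$rc"
}

# same_exit_code <expected> <cmd-a> <cmd-b>
# Each command is a single string executed through `sh -c`.
same_exit_code() {
    local expected="$1" a b
    a="$(exit_code_of sh -c "$2")"
    b="$(exit_code_of sh -c "$3")"
    if [ "$a" = "$expected" ] && [ "$b" = "$expected" ]; then
        echo "PASS: both exited $expected"
        return 0
    fi
    echo "FAIL: expected $expected, got $a and $b"
    return 1
}

# Usage (commands as quoted in this section, fixture path assumed):
#   f=diagnostics/fixtures/closure.ori
#   same_exit_code 0 "cargo run -- run $f" \
#       "cargo run -- build $f -o /tmp/test_fixture && /tmp/test_fixture"
```

Passing the expected code explicitly keeps the check strict: two runs that fail in the same way do not count as a pass.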
Subsection close-out (06.1) — MANDATORY before starting 06.2:
- All tasks above are [x] and verified
- Update this subsection’s status in section frontmatter to complete
- Run /improve-tooling retrospectively on THIS subsection
06.2 Create ARC-interaction fixtures
File(s): diagnostics/fixtures/*.ori (new files)
These fixtures exercise ARC-specific interaction patterns that tp-help identified as blind spots. They are pass fixtures (exit 0) but are categorized as aims-heavy because they specifically stress AIMS pipeline phases.
Category: aims-heavy — all exit 0, but exercise AIMS-specific paths (COW, reuse, ? unwinding, recursion).
- question_mark.ori: ? operator propagation with fat values in scope (heap str, [int], struct-with-fat-field). Must include: ? on Option<str> returning None, ? on Option<[int]> returning Some, chained ? with multiple fat locals in scope that must be cleaned up on early exit. tp-help identified this as mandatory ARC interaction coverage: ? triggers early-exit unwinding that must drop all live fat values. Reference syntax: tests/valgrind/fat_matrix/f15_question_mark.ori
- recursive_tree.ori: Recursive function passing fat pointer types through recursive call frames: heap str through N levels, [int] through recursion, struct with fat field returned from recursive base case. Exercises stack-frame RC correctness across recursive depth. Reference syntax: tests/valgrind/fat_matrix/f16_recursion.ori
- generic_mono.ori: Generic function instantiated with multiple concrete types: scalar (int), heap string (str), list ([int]), and struct-with-fat-field. tp-help identified single-type generic coverage as insufficient: monomorphization must be tested across the type matrix to verify RC analysis is correct for each instantiation. Reference syntax: tests/valgrind/fat_matrix/f10_generics.ori
- large_aggregate.ori: Struct with 3+ int fields (>16 bytes) passed to and returned from functions, exercising ABI compliance for large aggregates. Must verify that pass-by-reference codegen does not trigger unnecessary RC operations. Catches FastISel vs full pipeline regressions. Reference syntax: tests/valgrind/fat_matrix/f10_generics.ori (for struct patterns)
- cow_sharing.ori: COW sharing barrier exercise: create a list, alias it (shared), mutate through one alias (triggers COW clone), verify original is unchanged. Also: multi-fork (3+ references to same backing), and push-after-share on both sides. Exercises is_unique check and COW clone path. Reference syntax: tests/valgrind/cow/cow_list_push.ori
- Verify each fixture: cargo run -- run <fixture> and cargo run -- build <fixture> -o /tmp/test_fixture && /tmp/test_fixture produce identical exit code 0
- Verify each fixture under release build: cargo run --release -- build <fixture> -o /tmp/test_fixture && /tmp/test_fixture produces exit code 0
Subsection close-out (06.2) — MANDATORY before starting 06.3:
- All tasks above are [x] and verified
- Update this subsection’s status in section frontmatter to complete
- Run /improve-tooling retrospectively on THIS subsection
06.3 Create expected-fail fixtures
File(s): diagnostics/fixtures/*.ori (new files)
tp-help identified that failure fixtures were “optional and underspecified” — this is a coverage gap. Diagnostic scripts must be validated in failure mode, not just success mode. These fixtures are mandatory.
Category: expected-fail — designed to trigger specific diagnostic failures.
- leak.ori: Program that intentionally leaks an RC value (e.g., create a circular reference or allocate without drop path). ORI_CHECK_LEAKS=1 must report a leak. diagnose-aot.sh must detect the leak. This validates that the leak detection path in diagnostic scripts actually works.
  - Safe Ori code cannot create true RC leaks (no circular references, ARC manages all allocations). Created best-effort fixture: panic with fat values in scope causes diagnose-aot.sh to report FAIL (execution exit=1) + WARN (RC Stats imbalanced: over-releases from incomplete cleanup). ORI_CHECK_LEAKS=1 does not report leaks because the panic handler bypasses ori_run_main’s return path where the leak check runs.
- mismatch_compute.ori: Program that (via the mismatch-wrapper.sh infrastructure already in diagnostics/fixtures/) produces different interpreter vs AOT output. This validates that dual-exec-debug.sh correctly detects and reports mismatches with auto-diagnostic output. Note: The existing mismatch.ori + mismatch-wrapper.sh already serves this purpose; verify it is sufficient or extend it.
  - Verified: existing mismatch.ori + mismatch-wrapper.sh is sufficient. ORI_BIN=mismatch-wrapper.sh dual-exec-debug.sh mismatch.ori correctly detects MISMATCH (stdout “INTERP” vs “AOT”), exits 1, and produces auto-diagnostic output. No separate mismatch_compute.ori needed.
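An expected-fail check needs two conditions to hold together: a specific exit code and a recognizable string in the output. The sketch below shows one way to express that. expect_failure is a hypothetical helper written for illustration; it does not reproduce the signature of run_test_expect_fail, which this section does not spell out.

```shell
# Sketch only: expect_failure is a hypothetical helper, not the toolkit's
# run_test_expect_fail.

# expect_failure <expected-exit-code> <output-substring> <command...>
# Passes only when the command exits with exactly the expected code AND its
# combined stdout/stderr contains the given substring.
expect_failure() {
    local want_code="$1" pattern="$2" out rc=0
    shift 2
    out="$("$@" 2>&1)" || rc=$?
    if [ "$rc" -ne "$want_code" ]; then
        echo "FAIL: exit code $rc, expected $want_code"
        return 1
    fi
    case "$out" in
        *"$pattern"*)
            echo "PASS: exit code $rc, output contains '$pattern'" ;;
        *)
            echo "FAIL: output does not contain '$pattern'"
            return 1 ;;
    esac
}

# Usage, with script and fixture paths abbreviated as in this section:
#   expect_failure 1 "MISMATCH" \
#       env ORI_BIN=mismatch-wrapper.sh dual-exec-debug.sh mismatch.ori
```

Checking the exact code rather than "non-zero" is what lets a mismatch (exit 1 with MISMATCH in the output) be told apart from an unrelated crash.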
Subsection close-out (06.3) — MANDATORY before starting 06.4:
- All tasks above are [x] and verified
- Update this subsection’s status in section frontmatter to complete
- Run /improve-tooling retrospectively on THIS subsection
06.4 Fixture matrix and categorization
File(s): diagnostics/fixtures/FIXTURES.md (new file)
tp-help identified scattered fixture knowledge as a LEAK — fixture names are repeated per-script in self-test with no single source of truth for what each fixture covers. This subsection creates the SSOT.
- Create diagnostics/fixtures/FIXTURES.md with a categorization table: Created with 18 fixtures (11 pass, 5 aims-heavy, 2 expected-fail) plus build-fail-parse.ori and mismatch-wrapper.sh infra entries. Includes full matrix table matching the plan specification. Also added infra category for supporting infrastructure files.
- In FIXTURES.md, document the self-test contract for each category:
  - pass: ir-dump.sh (non-empty), arc-dump.sh (non-empty), diagnose-aot.sh (exit 0), dual-exec-debug.sh (MATCH), rc-stats.sh (produces output), bisect-passes.sh --rc-only (phase table + “Leak check: clean”)
  - aims-heavy: same as pass, PLUS bisect-passes.sh --rc-only shows non-zero RC ops, AND feature-specific IR marker assertions
  - expected-fail: diagnose-aot.sh / dual-exec-debug.sh must report failure, specific exit code + output pattern documented per fixture
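If FIXTURES.md is to act as the single source of truth, the fixture lists can be derived from it instead of being retyped in each script. The sketch below assumes a table layout that this section does not specify: a pipe-delimited row per fixture with the file name in the first column and the category in the second, both written as plain text. fixtures_in_category is a hypothetical helper name.

```shell
# Sketch only. ASSUMPTION: FIXTURES.md rows look like
#   | closure.ori | pass | ... |
# with the fixture name in column 1 and the category in column 2.

# fixtures_in_category <fixtures-md> <category>
# Prints one fixture file name per line for the requested category.
fixtures_in_category() {
    awk -F'|' -v want="$2" '
        NF >= 3 {
            name = $2; cat = $3
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", cat)
            if (cat == want && name ~ /\.ori$/) print name
        }
    ' "$1"
}

# Usage:
#   fixtures_in_category diagnostics/fixtures/FIXTURES.md aims-heavy
```

Header and separator rows are skipped because their first column does not end in .ori, so the table needs no special markers to be machine-readable.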
Subsection close-out (06.4) — MANDATORY before starting 06.5:
- All tasks above are [x] and verified
- Update this subsection’s status in section frontmatter to complete
- Run /improve-tooling retrospectively on THIS subsection
06.5 Update self-test.sh coverage
File(s): diagnostics/self-test.sh
- Update the fixture existence check at the top of self-test.sh (currently checks simple.ori, clean.ori, chain.ori only) to also require all new fixtures. Group by category (pass, aims-heavy, expected-fail) with comments. SSOT note: fixture lists in self-test.sh must be verifiable against diagnostics/fixtures/FIXTURES.md; add a comment referencing FIXTURES.md as the canonical source. If feasible, parse FIXTURES.md to generate the fixture arrays rather than hardcoding them (eliminates the LEAK:scattered-knowledge risk identified by TPR). Updated with PASS_FIXTURES, AIMS_HEAVY_FIXTURES, EXPECTED_FAIL_FIXTURES arrays. Comment references FIXTURES.md.
- Add each pass fixture to the self-test matrix (ir-dump, arc-dump, diagnose-aot, dual-exec-debug, rc-stats, bisect-passes).
- Add feature-specific assertions for aims-heavy and select pass fixtures:
  - closure.ori / closure_escape.ori: PartialApply (confirmed 5 occurrences each)
  - pattern_match.ori: Switch (confirmed 6 occurrences)
  - generic_mono.ori: “functions” in arc-dump header (confirmed “12 functions”)
  - question_mark.ori: RcDec (confirmed 23 occurrences)
  - cow_sharing.ori: RcInc (COW uniqueness via RC sharing; IsShared is not in ARC IR, it is a runtime-level check)
  - recursive_tree.ori: “functions” in arc-dump header (confirmed “5 functions”)
- Add aims-heavy fixtures to self-test matrix with standard + feature-specific assertions.
- Add each expected-fail fixture with specific assertions:
  - leak.ori: diagnose-aot exits non-zero + output contains “imbalance”. bisect-passes detects “exited with code 1” (panic bypasses runtime leak checker, so RC_LIVE_COUNT is never checked).
  - mismatch.ori (via wrapper): dual-exec exits 1 + output contains “MISMATCH”
- Handle bisect-passes.sh exit code semantics: do NOT assert exit 0 for pass/aims-heavy; assert “Phase” and “Leak check: clean” in output.
- Release build coverage: conditional section gated on target/release/ori, runs diagnose-aot.sh --release on closure.ori, iterator_break.ori, generic_mono.ori. SKIP if no release binary.
- Verify: diagnostics/self-test.sh --verbose passes (159 passed, 0 failed)
- Verify: all new self-test assertions pass in CI-equivalent conditions (clean build); confirmed via test-all.sh (16954 passed, 0 failed)
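The feature-specific assertions above, and the output-based check for bisect-passes.sh, both reduce to "this marker appears at least N times in captured output". A minimal sketch of such a helper follows. count_marker and assert_marker_min are hypothetical names, and the sketch assumes the script output has already been captured to a file. Note that grep -c counts matching lines, which can be lower than the number of occurrences when a marker appears twice on one line.

```shell
# Sketch only: count_marker and assert_marker_min are hypothetical helpers.

# count_marker <marker> <file>
# Prints the number of lines containing the literal marker (0 if none).
count_marker() {
    grep -c -F -- "$1" "$2" || true
}

# assert_marker_min <marker> <min-count> <file>
assert_marker_min() {
    local n
    n="$(count_marker "$1" "$3")"
    if [ "$n" -ge "$2" ]; then
        echo "PASS: '$1' found on $n line(s)"
    else
        echo "FAIL: '$1' found on $n line(s), wanted at least $2"
        return 1
    fi
}

# Usage, assuming dump and bisect output were captured beforehand:
#   assert_marker_min "PartialApply" 1 /tmp/closure.dump
#   assert_marker_min "Phase" 1 /tmp/closure.bisect
#   assert_marker_min "Leak check: clean" 1 /tmp/closure.bisect
```

A minimum of 1 is more tolerant of unrelated codegen changes than pinning the confirmed counts (5, 6, 23), at the cost of a weaker signal.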
Subsection close-out (06.5) — MANDATORY before starting 06.R:
- All tasks above are [x] and verified
- Update this subsection’s status in section frontmatter to complete
- Run /improve-tooling retrospectively on THIS subsection
06.R Third Party Review Findings
- [TPR-06-001-codex][high] section-06-fixtures.md:142: Centralize fixture categories to remove LEAK and DRIFT. generic_mono.ori inconsistency, self-test.sh as second registry. Resolved: Fixed on 2026-04-10. Moved generic_mono.ori to 06.2 (aims-heavy), added SSOT note to 06.5 fixture list requiring FIXTURES.md cross-reference.
- [TPR-06-002-codex][medium] section-06-fixtures.md:48: Add large aggregate coverage promised by the goal. Resolved: Fixed on 2026-04-10. Added large_aggregate.ori fixture to 06.2 with >16B struct pattern and IR assertion.
- [TPR-06-003-codex][medium] section-06-fixtures.md:200: Complete expected-fail matrix with exact exit-code assertions. Resolved: Fixed on 2026-04-10. Added mismatch_compute.ori to FIXTURES.md table, replaced generic run_test_expect_fail with specific exit code + output pattern assertions.
- [TPR-06-001-gemini][medium] section-06-fixtures.md:195: Add mismatch_compute.ori to FIXTURES.md table. Resolved: Fixed on 2026-04-10. Same fix as [TPR-06-003-codex].
- [TPR-06-002-gemini][low] section-06-fixtures.md:79: Harmonize generic_mono.ori categorization. Resolved: Fixed on 2026-04-10. Same fix as [TPR-06-001-codex]: moved to 06.2 aims-heavy.
- [TPR-06-003-gemini][medium] section-06-fixtures.md:214: Use --rc-only flag for bisect-passes self-test assertions. Resolved: Fixed on 2026-04-10. Updated 06.5 to specify --rc-only flag and explain why it’s load-bearing.
- [TPR-06-004-gemini][low] section-06-fixtures.md:180: Correct bisect-passes coverage for simple.ori in SSOT table. Resolved: Fixed on 2026-04-10. Changed simple.ori bisect-passes from “No (trivial)” to “Yes”.
- [TPR-06-005-gemini][medium] section-06-fixtures.md:225: Exercise leak.ori with bisect-passes.sh to verify detection. Resolved: Fixed on 2026-04-10. Added leak.ori to bisect-passes coverage with exit 1 assertion, updated table.
06.N Completion Checklist
- All subsections (06.1, 06.2, 06.3, 06.4, 06.5) complete
- All pass/aims-heavy fixtures compile and run under both interpreter and AOT
- All pass/aims-heavy fixtures produce identical results under debug and release builds
- Expected-fail fixtures correctly trigger diagnostic detection
- diagnostics/fixtures/FIXTURES.md exists and is the SSOT for fixture categorization
- diagnostics/self-test.sh passes with all new fixtures (159 passed, 0 failed)
- Feature-specific assertions validate real IR markers, not just “non-empty”
- timeout 150 ./test-all.sh green (16954 passed, 0 failed)
- /tpr-review passed (waived by user)
- /impl-hygiene-review passed (waived by user)
- /improve-tooling section-close sweep: per-subsection retrospectives covered all gaps; no cross-subsection patterns required new tooling
- Strip plan annotations: zero annotations found for diagnostic-tooling-improvements plan