Section 01: Remove aims-compare + Create debug-release-compare

Status: Complete

Goal: Replace the dead AIMS comparison scripts with a new debug-vs-release comparison tool that catches FastISel-only bugs and optimization-dependent behavioral divergences.

Success Criteria:

aims-compare.sh, aims-baseline.sh, aims-measure.sh deleted
New debug-release-compare.sh compiles and runs a program through both target/debug/ori and target/release/ori, then compares exit codes and stdout
On mismatch, auto-dumps LLVM IR from both builds for diffing
self-test.sh passes with new debug-release-compare test entries (28/28 passed)
Satisfies mission criterion: “aims-compare.sh removed; new debug-release-compare.sh functional”

Context: aims-compare.sh passes --features aims (line 177), a feature that no longer exists: it was removed when AIMS became the default pipeline. The script therefore fails immediately on every invocation. Codex verified that cargo build -p oric --features aims fails with “package does not contain this feature: aims”. Keeping the aims-compare name now that AIMS is the default counts as DRIFT under impl-hygiene.md. The debug-vs-release capability remains genuinely useful, because LLVM FastISel (debug) behaves differently from the full optimization pipeline (release).

Reference implementations:

Swift verifier: runs same input through debug and release SIL pipelines to catch optimization-dependent bugs

Depends on: None.

01.1 Remove dead AIMS comparison scripts and stale references

File(s): diagnostics/aims-compare.sh, diagnostics/aims-baseline.sh, diagnostics/aims-measure.sh, CLAUDE.md, .claude/rules/arc.md, queued plan files

These three scripts (~900 lines total) are dead code. aims-compare.sh (347 lines) fails at line 177 because the --features aims flag it passes no longer exists. aims-baseline.sh (244 lines) and aims-measure.sh (292 lines) are support scripts called only by aims-compare.sh, so they are orphaned along with it.

IMPORTANT — Semantic mismatch: The old aims-compare.sh compared output + RC counts across AIMS pipeline variants (behavioral + RC parity). The new debug-release-compare.sh compares debug vs release builds (exit codes + stdout + LLVM IR on mismatch). These are fundamentally different tools answering different questions. References to aims-compare.sh must NOT be blindly renamed — each consumer must be audited for whether debug-release-compare.sh is the correct replacement or whether the reference should simply be removed.
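To support that audit, it helps to list every remaining reference with its file and line so each consumer can be reviewed one at a time. The following is a minimal sketch, assuming the references are plain-text mentions of the three script names; the function name list_stale_aims_refs is hypothetical and not part of the existing diagnostics.

```shell
# Hypothetical helper: print file:line:text for every mention of the three
# removed script names under the given files or directories.
list_stale_aims_refs() {
    local rc=0
    if [ "$#" -eq 0 ]; then
        echo "usage: list_stale_aims_refs PATH..." >&2
        return 2
    fi
    grep -rnE 'aims-(compare|baseline|measure)' "$@" || rc=$?
    # grep exits 0 when it found matches and 1 when it found none; both are
    # fine for a listing. Anything higher is a real error (e.g. a bad path).
    [ "$rc" -le 1 ]
}
```

For example, list_stale_aims_refs CLAUDE.md .claude/rules diagnostics prints each hit. Every hit still needs a human decision: repoint it to debug-release-compare.sh only where a debug-vs-release comparison is what the consumer actually wants, and otherwise delete the reference.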

01.2 Create debug-release-compare.sh

File(s): diagnostics/_common.sh (extend), diagnostics/debug-release-compare.sh (new), diagnostics/self-test.sh, diagnostics/README.md, CLAUDE.md, .claude/rules/arc.md

Create a new script that compiles and runs a program through both debug and release builds, comparing behavioral output. This catches FastISel-only bugs (the >16B aggregate load issue) and optimization-dependent codegen divergences.
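The core comparison step can be sketched as follows. This is a minimal sketch, not the script's actual contents: the function name compare_binaries is hypothetical, and because this section does not show how ori is invoked to compile a program, the sketch starts from two already-built executables, one per build. The return codes follow the convention attributed to the script header in the review findings (1 = mismatch, 2 = infrastructure error).

```shell
# Hypothetical helper: run two executables and compare exit code and stdout.
# Returns 0 on match, 1 on behavioral mismatch, 2 on infrastructure error.
compare_binaries() {
    local debug_bin="$1" release_bin="$2"
    local debug_out release_out debug_rc=0 release_rc=0

    # A missing or non-executable binary is an infrastructure problem,
    # not a behavioral mismatch.
    if [ ! -x "$debug_bin" ] || [ ! -x "$release_bin" ]; then
        echo "error: both binaries must exist and be executable" >&2
        return 2
    fi

    # Capture stdout and the exit code of each run; stderr is not compared.
    debug_out="$("$debug_bin")" || debug_rc=$?
    release_out="$("$release_bin")" || release_rc=$?

    if [ "$debug_rc" -ne "$release_rc" ]; then
        echo "MISMATCH: exit code debug=$debug_rc release=$release_rc"
        return 1
    fi
    if [ "$debug_out" != "$release_out" ]; then
        echo "MISMATCH: stdout differs"
        return 1
    fi
    echo "MATCH: exit code $debug_rc, identical stdout"
    return 0
}
```

One design consequence of this shape: command substitution strips trailing newlines, so two runs that differ only in trailing blank lines are reported as a match. A real tool that needs byte-exact comparison would write each run's stdout to a file and compare the files instead.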

01.R Third Party Review Findings

[TPR-01-001-codex][medium] diagnostics/debug-release-compare.sh:117 - Return exit code 2 when either build fails. Evidence: DRIFT - script header defines exit 1=mismatch, 2=infra error, but compile-failure branches used exit 1. Fresh verification confirmed. Resolved: Fixed on 2026-04-09 in commit 337411e7. Both compile-failure branches now exit 2.
[TPR-01-002-codex][low] diagnostics/self-test.sh:255 - Exercise the infrastructure-error paths in self-test. Evidence: GAP - self-test only checked matching fixtures, --help, and no-args. No compile-fail test existed. Resolved: Fixed on 2026-04-09 in commit 337411e7. Added run_test_exit_code helper and compile-failure test.
[TPR-01-003-codex][low] plans/diagnostic-tooling-improvements/section-01-aims-compare.md:35 - Synchronize plan status surfaces for Section 01 and Section 02. Evidence: DRIFT - Section 01 body said “Not Started” while frontmatter said complete. Overview and index were also stale. Resolved: Fixed on 2026-04-09 in commit 337411e7. All plan surfaces synced.
[TPR-01-001-gemini][informational] diagnostics/debug-release-compare.sh:130 - Document omission of stderr comparison. Non-actionable observation. Stderr comparison is intentionally omitted (the exit code catches panics).
[TPR-01-002-gemini][informational] diagnostics/_common.sh:65 - Address reliance on SCRIPT_DIR convention. Non-actionable observation. The SCRIPT_DIR convention is the established pattern for all diagnostic scripts.
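
Finding TPR-01-002-codex depends on asserting a specific exit code rather than plain success or failure. The run_test_exit_code helper itself is not shown in this section, so the following only sketches the general shape such a helper might take; the name expect_exit_code, its argument order, and the PASS/FAIL counters are assumptions made for illustration.

```shell
# Hypothetical counters; the real self-test may track results differently.
PASS=0
FAIL=0

# Hypothetical helper: run a command and check that it exits with the
# expected code. Usage: expect_exit_code NAME EXPECTED_CODE COMMAND [ARGS...]
expect_exit_code() {
    local name="$1" expected="$2" actual=0
    shift 2
    # Output of the command under test is discarded; only its status matters.
    "$@" >/dev/null 2>&1 || actual=$?
    if [ "$actual" -eq "$expected" ]; then
        PASS=$((PASS + 1))
        echo "PASS: $name"
    else
        FAIL=$((FAIL + 1))
        echo "FAIL: $name (expected exit $expected, got $actual)"
    fi
}
```

A compile-failure case would then pass 2 as the expected code, followed by whatever command runs the compare script on a fixture that does not compile. Checking for exactly 2 is what separates an infrastructure error from the exit 1 reserved for behavioral mismatches.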

01.N Completion Checklist

All subsections (01.1, 01.2) complete
diagnostics/self-test.sh passes (29/29)
timeout 150 ./test-all.sh green — no regressions (16,927 passed)
No references to aims-compare remain in active codebase surfaces (CLAUDE.md, .claude/rules/, diagnostics/ all clean)
/tpr-review passed — independent third-party review clean
/impl-hygiene-review passed — after TPR is clean
/improve-tooling section-close sweep — verify both subsection retrospectives ran; add any cross-subsection patterns