Section 01: JSON Output Protocol
Status: Not Started
Goal: Establish the structured communication protocol between the orchestrator (parent) and worker (child) processes. Add a --json flag to ori test that emits FileSummary as JSON to stdout. Add a BackendCrash variant to TestOutcome for worker signal deaths.
Success Criteria:
- ori test --backend=llvm --json <file> emits sentinel-framed JSON for any input state (pass, fail, compile error); the framing ensures print() output doesn’t corrupt the JSON
- BackendCrash(String) variant in TestOutcome: is_backend_crash() returns true, has_failures() returns true
- Serde round-trip: serialize(summary) |> deserialize == summary for all outcome variants
- Satisfies mission criterion: structured JSON output for orchestrator consumption
Context: The orchestrator needs to parse per-file test results from the worker subprocess. The current output is human-readable text (print_test_summary in commands/test.rs) with format variations depending on --verbose. Parsing this is fragile — pass lines are omitted unless verbose, LLVM compile errors are suppressed unless verbose, and the summary line is aggregate-only. A --json flag provides the structured protocol the orchestrator needs.
IMPORTANT: stdout is not clean. The LLVM backend’s ori_print (in ori_rt/src/io/mod.rs) uses println!() which writes to stdout. If any Ori test calls print(), the output goes to the same stdout as the JSON. The JSON must be sentinel-framed (---ORI_JSON_BEGIN--- / ---ORI_JSON_END---) so the orchestrator can extract it reliably despite any print() pollution.
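The extraction side of the framing can be sketched as follows. This is a minimal illustration, assuming the sentinels are written on lines of their own; the helper name extract_framed_json is hypothetical, and the real orchestrator logic belongs to Section 02.

```rust
/// Sentinel strings as named in this plan.
const JSON_BEGIN_SENTINEL: &str = "---ORI_JSON_BEGIN---";
const JSON_END_SENTINEL: &str = "---ORI_JSON_END---";

/// Returns the text between the last BEGIN sentinel line and the first END
/// sentinel line after it, or `None` if the frame is missing or incomplete.
/// (Hypothetical helper, not part of the existing code base.)
fn extract_framed_json(stdout: &str) -> Option<String> {
    let lines: Vec<&str> = stdout.lines().collect();
    // Search from the end: the worker emits the frame after all tests ran,
    // so earlier print() output cannot shadow it.
    let begin = lines
        .iter()
        .rposition(|l| l.trim_end() == JSON_BEGIN_SENTINEL)?;
    let body = &lines[begin + 1..];
    let end = body.iter().position(|l| l.trim_end() == JSON_END_SENTINEL)?;
    Some(body[..end].join("\n"))
}

fn main() {
    // Simulated worker stdout: print() noise followed by the framed payload.
    let stdout = "hello from print()\n---ORI_JSON_BEGIN---\n[{\"path\":\"a.ori\"}]\n---ORI_JSON_END---\n";
    match extract_framed_json(stdout) {
        Some(json) => println!("{json}"),
        None => eprintln!("no JSON frame found"),
    }
}
```

A missing or unterminated frame yields None rather than a partial payload, which lets the caller distinguish "worker died before emitting" from "worker emitted results".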
Reference implementations:
- Rust cargo test --format=json: emits per-event JSON lines (one per test start/complete). Ori can use a simpler model: one JSON blob per file.
- Zig Compilation.zig:6338-6343: sidecar diagnostic file with structured format. Ori uses stdout instead (simpler, no temp file cleanup).
Depends on: Nothing.
01.1 BackendCrash Outcome Variant
File(s): compiler/oric/src/test/result/mod.rs
Add a new TestOutcome::BackendCrash(String) variant for tests whose LLVM worker process died by signal. This is distinct from LlvmCompileFail — compile failures are expected (codegen issues), but crashes are real failures that block the test gate.
TDD ordering: write tests FIRST (01.1.T below), verify they fail, then implement.
- 01.1.T: Tests first (in compiler/oric/src/test/result/tests.rs):
  - test_backend_crash_is_backend_crash: BackendCrash("msg".into()).is_backend_crash() returns true
  - test_backend_crash_is_not_failed: BackendCrash("msg".into()).is_failed() returns false (distinct from Failed)
  - test_backend_crash_counted_in_file_summary: add_result(BackendCrash) increments backend_crash counter, NOT failed
  - test_backend_crash_file_has_failures: FileSummary with only BackendCrash results returns has_failures() == true
  - test_backend_crash_summary_has_failures: TestSummary with backend_crash > 0 returns has_failures() == true
  - test_backend_crash_exit_code_1: TestSummary with BackendCrash produces exit_code() == 1
  - Verify all 6 tests FAIL before implementing (they reference types/methods that don’t exist yet)
- Add BackendCrash(String) variant to TestOutcome enum (result/mod.rs line 9-21):

      /// LLVM worker process crashed (SIGSEGV, SIGABRT, etc.).
      /// Distinct from LlvmCompileFail; crashes are real failures.
      BackendCrash(String),

- Add is_backend_crash() predicate method alongside existing is_passed(), is_failed(), etc. (line 23-43)
- Add backend_crash: usize field to FileSummary struct (line 107-128, after llvm_compile_fail)
- Add backend_crash: usize field to TestSummary struct (line 165-185, after llvm_compile_fail)
- Update add_result() in FileSummary (line 138, the match on result.outcome) to add BackendCrash(_) => self.backend_crash += 1
- Update has_failures() in FileSummary (line 159: self.failed > 0 || (!self.errors.is_empty() && !self.llvm_compile_error)): add || self.backend_crash > 0
- Update has_failures() in TestSummary (line 216: self.failed > 0 || self.error_files > 0): add || self.backend_crash > 0
- Update add_file() in TestSummary (line 192): add self.backend_crash += summary.backend_crash
- Update exit_code() in TestSummary: BackendCrash counts flow through has_failures(), producing exit code 1 (no change needed if has_failures() is correctly updated above)
- Verify all 6 tests from 01.1.T now PASS
- /tpr-review passed: independent review found no critical or major issues (or all findings triaged)
- /impl-hygiene-review passed: hygiene review clean. MUST run AFTER /tpr-review is clean.
- Subsection close-out (01.1): MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-01.1 retrospective; build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 01.1: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
- /sync-claude section-close doc sync: verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
- Repo hygiene check: run diagnostics/repo-hygiene.sh --check and clean any detected temp files.
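The counter and predicate changes listed in 01.1 can be sketched with simplified stand-in types. This is an illustration only: the real TestOutcome, FileSummary and TestSummary carry more variants and fields (for example errors, llvm_compile_error and error_files, which also feed has_failures()), and those are omitted here.

```rust
/// Simplified stand-in for the real `TestOutcome`; variant names follow the plan.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum TestOutcome {
    Passed,
    Failed(String),
    LlvmCompileFail(String),
    /// LLVM worker process crashed (SIGSEGV, SIGABRT, etc.).
    BackendCrash(String),
}

impl TestOutcome {
    fn is_failed(&self) -> bool {
        matches!(self, TestOutcome::Failed(_))
    }
    fn is_backend_crash(&self) -> bool {
        matches!(self, TestOutcome::BackendCrash(_))
    }
}

/// Per-file counters (subset of the real struct).
#[derive(Debug, Default)]
struct FileSummary {
    passed: usize,
    failed: usize,
    llvm_compile_fail: usize,
    backend_crash: usize,
}

impl FileSummary {
    fn add_result(&mut self, outcome: &TestOutcome) {
        match outcome {
            TestOutcome::Passed => self.passed += 1,
            TestOutcome::Failed(_) => self.failed += 1,
            TestOutcome::LlvmCompileFail(_) => self.llvm_compile_fail += 1,
            // Crashes get their own counter instead of inflating `failed`.
            TestOutcome::BackendCrash(_) => self.backend_crash += 1,
        }
    }
    /// The `errors` / `llvm_compile_error` clause of the real method is omitted.
    fn has_failures(&self) -> bool {
        self.failed > 0 || self.backend_crash > 0
    }
}

/// Aggregate counters (subset; the real struct also tracks `error_files`).
#[derive(Debug, Default)]
struct TestSummary {
    failed: usize,
    backend_crash: usize,
}

impl TestSummary {
    fn add_file(&mut self, file: &FileSummary) {
        self.failed += file.failed;
        self.backend_crash += file.backend_crash;
    }
    fn has_failures(&self) -> bool {
        self.failed > 0 || self.backend_crash > 0
    }
}

fn main() {
    let crash = TestOutcome::BackendCrash("SIGSEGV".into());
    let mut file = FileSummary::default();
    file.add_result(&TestOutcome::Passed);
    file.add_result(&crash);

    let mut total = TestSummary::default();
    total.add_file(&file);

    println!("crash? {} failed? {}", crash.is_backend_crash(), crash.is_failed());
    println!("{file:?} has_failures={}", file.has_failures());
    println!("{total:?} has_failures={}", total.has_failures());
}
```

Keeping backend_crash separate from failed preserves the distinction the tests in 01.1.T check for, while both has_failures() methods still report a crash as a gate-blocking failure.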
01.2 Serde Derives on Result Types
File(s): compiler/oric/src/test/result/mod.rs (new submodule json_protocol.rs), compiler/oric/Cargo.toml
Add Serialize/Deserialize derives to result types so they can be emitted as JSON. The Name type (interned identifier) needs special handling — it can’t be deserialized without an interner. Use string representation for JSON.
File size constraint: result/mod.rs is currently 327 lines. Adding ~80 lines of JSON mirror types would bring it to ~407 — under the 500-line limit but getting close. Extract JSON protocol types to a new compiler/oric/src/test/result/json_protocol.rs submodule for separation of concerns. Declare pub mod json_protocol; in result/mod.rs.
TDD ordering: write round-trip tests FIRST.
- Add serde and serde_json dependencies to compiler/oric/Cargo.toml. serde is in workspace deps (root Cargo.toml line 74: serde = { version = "1", features = ["derive"] }). serde_json is NOT in workspace deps; add it to the root [workspace.dependencies] first, then use workspace = true in oric:

      # Root Cargo.toml [workspace.dependencies]:
      serde_json = "1"

      # compiler/oric/Cargo.toml [dependencies]:
      serde = { workspace = true }
      serde_json = { workspace = true }

- Create compiler/oric/src/test/result/json_protocol.rs with JSON-serializable mirror types:

      //! JSON wire protocol types for worker→orchestrator communication.

      use serde::{Deserialize, Serialize};

      /// JSON-serializable test result for worker→orchestrator protocol.
      #[derive(Serialize, Deserialize, Debug, Clone)]
      pub struct JsonTestResult {
          pub name: String,
          pub targets: Vec<String>,
          pub outcome: JsonTestOutcome,
          pub duration_ms: u64,
      }

      #[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
      #[serde(tag = "type", content = "message")]
      pub enum JsonTestOutcome {
          Passed,
          Failed(String),
          Skipped(String),
          SkippedUnchanged,
          LlvmCompileFail(String),
          BackendCrash(String),
      }

      /// JSON-serializable file summary.
      #[derive(Serialize, Deserialize, Debug, Clone)]
      pub struct JsonFileSummary {
          pub path: String,
          pub results: Vec<JsonTestResult>,
          pub passed: usize,
          pub failed: usize,
          pub skipped: usize,
          pub llvm_compile_fail: usize,
          pub backend_crash: usize,
          pub duration_ms: u64,
          pub errors: Vec<String>,
          pub llvm_compile_error: bool,
      }

      /// Sentinel markers for framing JSON in stdout (robust against Ori print() pollution).
      pub const JSON_BEGIN_SENTINEL: &str = "---ORI_JSON_BEGIN---";
      pub const JSON_END_SENTINEL: &str = "---ORI_JSON_END---";

- Declare pub mod json_protocol; in result/mod.rs
- 01.2.T: Tests first (in compiler/oric/src/test/result/tests.rs):
  - test_json_outcome_round_trip_all_variants: serialize each of the 6 JsonTestOutcome variants to JSON, deserialize back, compare equal
  - test_json_file_summary_round_trip: serialize a JsonFileSummary with mixed outcomes, deserialize back, verify field equality
  - test_file_summary_to_json_correct_fields: create a FileSummary with known values, call to_json(), verify all fields match
  - test_json_file_summary_into_file_summary: round-trip FileSummary → to_json() → into_file_summary() → compare counters and outcome types
  - test_empty_results_json: file with 0 tests produces valid JSON with results: [] and all counters at 0
  - Verify tests fail before implementing conversion methods
- Add FileSummary::to_json(&self, interner: &StringInterner) -> JsonFileSummary conversion in json_protocol.rs (resolves Name → String via interner)
- Add JsonFileSummary::into_file_summary(self, interner: &StringInterner) -> FileSummary reverse conversion (re-interns String → Name). Note: re-interning creates new Name values in the orchestrator’s interner; this is correct since the worker’s Name values are process-local.
- Exit code 2 (no tests) representation: When a file has no LLVM-eligible tests, the worker emits a JsonFileSummary with results: [] and all counters at 0. The orchestrator treats this as a no-op (not a failure, not a crash).
- Verify all tests from 01.2.T now PASS
- /tpr-review passed: independent review found no critical or major issues (or all findings triaged)
- /impl-hygiene-review passed: hygiene review clean. MUST run AFTER /tpr-review is clean.
- Subsection close-out (01.2): MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-01.2 retrospective; build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 01.2: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
- /sync-claude section-close doc sync: verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
- Repo hygiene check: run diagnostics/repo-hygiene.sh --check and clean any detected temp files.
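Why the wire types carry String rather than Name can be shown with a toy interner. Everything below is a stand-in: the real Name and StringInterner have different internals, and the plan's signatures take &StringInterner where this sketch uses &mut for simplicity.

```rust
use std::collections::HashMap;

/// Toy interned identifier: an index into one process-local table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Name(u32);

/// Toy interner (hypothetical; not the compiler's implementation).
#[derive(Default)]
struct StringInterner {
    strings: Vec<String>,
    ids: HashMap<String, u32>,
}

impl StringInterner {
    fn intern(&mut self, s: &str) -> Name {
        if let Some(&id) = self.ids.get(s) {
            return Name(id);
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_owned());
        self.ids.insert(s.to_owned(), id);
        Name(id)
    }
    fn resolve(&self, name: Name) -> &str {
        &self.strings[name.0 as usize]
    }
}

/// In-process result keyed by an interned name.
struct TestResult {
    name: Name,
    duration_ms: u64,
}

/// Wire-side mirror carrying the name as a plain string.
#[derive(Debug, Clone, PartialEq)]
struct JsonTestResult {
    name: String,
    duration_ms: u64,
}

/// Worker side: resolve Name -> String before serializing.
fn to_json(result: &TestResult, interner: &StringInterner) -> JsonTestResult {
    JsonTestResult {
        name: interner.resolve(result.name).to_owned(),
        duration_ms: result.duration_ms,
    }
}

/// Orchestrator side: re-intern String -> Name in the local interner.
fn from_json(json: JsonTestResult, interner: &mut StringInterner) -> TestResult {
    TestResult {
        name: interner.intern(&json.name),
        duration_ms: json.duration_ms,
    }
}

fn main() {
    // Worker process: "test_add" is the second string it interns.
    let mut worker = StringInterner::default();
    worker.intern("some_earlier_identifier");
    let original = TestResult { name: worker.intern("test_add"), duration_ms: 3 };
    let wire = to_json(&original, &worker);

    // Orchestrator process: a fresh interner assigns a different index.
    let mut orchestrator = StringInterner::default();
    let restored = from_json(wire.clone(), &mut orchestrator);

    println!("worker {:?}, orchestrator {:?}", original.name, restored.name);
    println!("wire: {wire:?}, resolved: {}", orchestrator.resolve(restored.name));
}
```

The two Name values differ numerically even though they denote the same identifier, so comparisons across the process boundary should go through the string form (or counters and outcome types, as test_json_file_summary_into_file_summary does).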
01.3 --json Flag and JSON Emission
File(s): compiler/oric/src/main.rs (lines 118-149), compiler/oric/src/commands/test.rs, compiler/oric/src/test/runner/mod.rs (lines 38-56)
Add --json flag to the test command that emits JsonFileSummary to stdout instead of human-readable output. When --json is active, all human-readable output (progress, errors, summaries) goes to stderr or is suppressed. Only the sentinel-framed JSON blob goes to stdout.
Design decision: Sentinel framing (Option C). Worker emits ---ORI_JSON_BEGIN--- / ---ORI_JSON_END--- sentinels around the JSON blob. Orchestrator extracts content between sentinels, ignoring any Ori print() output that pollutes stdout. This is simpler than temp files (Option B) and more robust than raw stdout parsing (Option A). Sentinel constants live in json_protocol.rs (defined in 01.2).
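The worker side of Option C can be sketched as a small writer helper. The function name write_framed_json is hypothetical, and the payload is passed in as an already-serialized string so the sketch needs only the standard library.

```rust
use std::io::{self, Write};

/// Sentinel strings as defined for json_protocol.rs in 01.2.
const JSON_BEGIN_SENTINEL: &str = "---ORI_JSON_BEGIN---";
const JSON_END_SENTINEL: &str = "---ORI_JSON_END---";

/// Writes `json` between the two sentinel lines, then flushes so the frame
/// is not left sitting in a buffer when the process exits.
/// (Hypothetical helper; the plan's sketch uses println! directly.)
fn write_framed_json<W: Write>(out: &mut W, json: &str) -> io::Result<()> {
    writeln!(out, "{JSON_BEGIN_SENTINEL}")?;
    writeln!(out, "{json}")?;
    writeln!(out, "{JSON_END_SENTINEL}")?;
    out.flush()
}

fn main() -> io::Result<()> {
    // Lock stdout once so the three lines are written without interleaving
    // from other threads in this process.
    let stdout = io::stdout();
    let mut handle = stdout.lock();
    write_framed_json(&mut handle, r#"[{"path":"a.ori","passed":1}]"#)
}
```

Taking any Write rather than printing directly makes the framing testable against an in-memory buffer. Compact single-line JSON is assumed for the payload, so a sentinel string embedded in a test message cannot appear on a line of its own inside the frame.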
TDD ordering: write integration tests, verify they fail (no --json flag yet), then implement.
- Add json: bool field to TestRunnerConfig in runner/mod.rs (line 43, inside the struct). Also update Default impl (line 58-68) with json: false. The #[expect(clippy::struct_excessive_bools)] on line 39 remains valid (now 5 bools: verbose, parallel, coverage, incremental, json).
- Parse --json flag in main.rs test command block (lines 118-149, inside the "test" match arm):

      } else if arg == "--json" {
          config.json = true;
      }

- In commands/test.rs:run_tests() (line 10), when config.json is true:
  - Suppress human-readable output (no print_test_summary, no print_summary_stats)
  - After runner.run() completes, serialize per-file FileSummary as JSON to stdout
  - Use sentinel framing constants from json_protocol.rs:

        use oric::test::result::json_protocol::{
            JsonFileSummary, JSON_BEGIN_SENTINEL, JSON_END_SENTINEL,
        };

        if config.json {
            let interner = runner.interner();
            // One JsonFileSummary per file, with Name values resolved to strings.
            let json_summaries: Vec<JsonFileSummary> = summary.files
                .iter()
                .map(|f| f.to_json(interner))
                .collect();
            println!("{JSON_BEGIN_SENTINEL}");
            println!("{}", serde_json::to_string(&json_summaries).unwrap());
            println!("{JSON_END_SENTINEL}");
        } else {
            // existing human-readable output
        }
- stdout pollution note: Ori print() uses println!() in ori_rt which writes to stdout. Tracing already goes to stderr. The sentinel framing handles print() pollution: the orchestrator (Section 02) scans for sentinels and ignores everything else. No runtime changes needed.
- 01.3.T: Integration tests (in compiler/oric/tests/phases/ or compiler/oric/src/test/runner/tests.rs):
  - test_json_flag_emits_sentinel_framed_json: spawn ori test --backend=llvm --json tests/spec/types/primitives.ori via Command::new(current_exe), capture stdout, verify ---ORI_JSON_BEGIN--- and ---ORI_JSON_END--- are present, extract content between them, parse as Vec<JsonFileSummary>, verify pass count > 0
  - test_json_flag_multi_test_file: use a file with many tests (tests/spec/inference/unification.ori), verify per-test results in JSON
  - test_json_flag_stdout_pollution_resilience: create/use a test file that calls print(), verify sentinel-framed JSON is extractable despite Ori print() output on stdout. This is the critical robustness test.
  - test_json_flag_compile_error_file: use a file with type errors, verify JSON contains LlvmCompileFail outcomes
  - test_no_json_flag_unchanged: verify ori test --backend=llvm tests/spec/types/primitives.ori (no --json) produces the same human-readable output as before (regression guard)
- Verify tests fail before implementing, then pass after
- /tpr-review passed: independent review found no critical or major issues (or all findings triaged)
- /impl-hygiene-review passed: hygiene review clean. MUST run AFTER /tpr-review is clean.
- Subsection close-out (01.3): MANDATORY before starting the next subsection. Run /improve-tooling retrospectively on THIS subsection’s debugging journey (per .claude/skills/improve-tooling/SKILL.md “Per-Subsection Workflow”): which diagnostics/ scripts you ran, where you added dbg!/tracing calls, where output was hard to interpret, where test failures gave unhelpful messages, where you ran the same command sequence repeatedly. Forward-look: what tool/log/diagnostic would shorten the next regression in this code path by 10 minutes? Implement improvements NOW (zero deferral) and commit each via SEPARATE /commit-push using a valid conventional-commit type (build(diagnostics): ... — surfaced by section-01.3 retrospective; build/test/chore/ci/docs are valid; tools(...) is rejected by the lefthook commit-msg hook). Mandatory even when nothing felt painful. If genuinely no gaps, document briefly: “Retrospective 01.3: no tooling gaps”. Update this subsection’s status in section frontmatter to complete.
- /sync-claude section-close doc sync: verify Claude artifacts across all section commits. Map changed crates to rules files, check CLAUDE.md, canon.md. Fix drift NOW.
- Repo hygiene check: run diagnostics/repo-hygiene.sh --check and clean any detected temp files.
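The flag routing described in 01.3 can be sketched as follows. The config struct is a reduced, hypothetical subset of TestRunnerConfig (the paths field and the report helper are illustrative names, not existing code), and flags other than --verbose and --json are simply skipped.

```rust
/// Reduced stand-in for `TestRunnerConfig`; `paths` is a hypothetical field.
#[derive(Debug, Default, PartialEq)]
struct TestRunnerConfig {
    verbose: bool,
    json: bool,
    paths: Vec<String>,
}

/// Parses the arguments that follow `ori test`.
fn parse_test_args<I: IntoIterator<Item = String>>(args: I) -> TestRunnerConfig {
    let mut config = TestRunnerConfig::default();
    for arg in args {
        if arg == "--verbose" {
            config.verbose = true;
        } else if arg == "--json" {
            config.json = true;
        } else if !arg.starts_with("--") {
            config.paths.push(arg);
        }
        // Other flags (e.g. --backend=llvm) are ignored in this sketch.
    }
    config
}

/// Human-readable lines go to stdout normally and to stderr under --json,
/// so stdout carries only test print() output and the framed JSON.
fn report(config: &TestRunnerConfig, line: &str) {
    if config.json {
        eprintln!("{line}");
    } else {
        println!("{line}");
    }
}

fn main() {
    let config = parse_test_args(std::env::args().skip(1));
    report(&config, &format!("running with {config:?}"));
}
```

Because json defaults to false, invocations without the flag take the existing stdout path, which is the behavior test_no_json_flag_unchanged guards.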
01.R Third Party Review Findings
- None.
01.N Completion Checklist
- BackendCrash variant added to TestOutcome with is_backend_crash() predicate
- has_failures() returns true for BackendCrash: crashes block the test gate
- backend_crash counter added to FileSummary and TestSummary
- JSON mirror types in compiler/oric/src/test/result/json_protocol.rs (new submodule, under 200 lines)
- FileSummary::to_json() and JsonFileSummary::into_file_summary() conversions work
- --json flag parsed in CLI and routed through TestRunnerConfig
- ori test --backend=llvm --json <file> emits sentinel-framed JSON to stdout (robust against Ori print() output)
- TDD verified: all tests written before implementation, verified failing, then passing
- Unit tests: 6 BackendCrash tests, 5 JSON round-trip/conversion tests (in result/tests.rs)
- Integration tests: 5 tests covering JSON output, pollution resilience, compile errors, regression (in runner/tests.rs or tests/phases/)
- result/mod.rs stays under 500 lines (currently 327 + ~10 lines for new variant/field/arm)
- json_protocol.rs stays under 200 lines
- timeout 150 ./test-all.sh passes: no regressions (JSON flag is opt-in, default behavior unchanged)
- ./clippy-all.sh passes
- Plan annotation cleanup: bash .claude/skills/impl-hygiene-review/plan-annotations.sh --plan 01 returns 0 annotations
- Plan sync: update plan metadata
- /tpr-review passed
- /impl-hygiene-review passed
- /improve-tooling retrospective completed: MANDATORY at section close, after both reviews are clean. Reflect on the section’s debugging journey (which diagnostics/ scripts you ran, which command sequences you repeated, where you added ad-hoc dbg!/tracing calls, where output was hard to interpret) and identify any tool/log/diagnostic improvement that would have made this section materially easier OR that would help the next section touching this area. Implement every accepted improvement NOW (zero deferral) and commit each via SEPARATE /commit-push. The retrospective is mandatory even when nothing felt painful; that is exactly when blind spots accumulate. See .claude/skills/improve-tooling/SKILL.md “Retrospective Mode” for the full protocol.
Exit Criteria: ori test --backend=llvm --json tests/spec/types/primitives.ori emits sentinel-framed JSON to stdout (a JSON array of JsonFileSummary, one entry per file, matching the Vec<JsonFileSummary> that 01.3 serializes and 01.3.T parses) with correct pass/fail/skip counts. The framing is robust against Ori print() output on stdout. BackendCrash variant exists and is counted as a real failure. All existing tests pass unchanged (JSON is opt-in). Serde round-trip test verifies all 6 JsonTestOutcome variants serialize and deserialize correctly.