100%

Section 02: Hash Stability Testing

types produce identical Merkle hashes across independent Pool instances, regardless of interning order, pool size, or Idx assignment.

Why this matters: Merkle hashing is the foundation for all downstream optimizations. If hash stability has even one edge case failure, every consumer (hash-forwarded signatures, hash-first imports, backend integration) silently produces wrong results. These bugs are catastrophic and near-impossible to diagnose without explicit stability tests. This section must be complete and passing before any downstream section begins.

This section depends on Section 01 (Merkle hash implementation). It blocks Sections 03-07.


02.1 Cross-Pool Stability Tests — COMPLETE

hash across two independently constructed pools with different interning histories.

Test Strategy: For each test, create two fresh pools. Intern different “noise” types in Pool B first (to shift Idx assignments), then intern the target type in both pools and verify hash equality.

File: compiler/ori_types/src/pool/tests.rs

Test Matrix:

Test NameTypeDepthWhy
merkle_primitivesint, str, bool, all 120Baseline — fixed indices
merkle_simple_containerList<int>, Option<str>, Set<bool>1Child-in-data path
merkle_simple_container_shiftedList<int> with noise types1Different Idx, same hash
merkle_two_childMap<str, int>, Result<bool, str>1Two-child extra path
merkle_function_nullary() -> int1Function with 0 params
merkle_function_unary(int) -> str1Function with 1 param
merkle_function_multi(int, str, bool) -> float1Function with N params
merkle_tuple_pair(int, str)1Tuple with 2 elements
merkle_tuple_triple(int, str, bool)1Tuple with 3 elements
merkle_nested_listList<List<int>>2Depth 2, same path
merkle_nested_mapMap<str, List<Option<int>>>3Depth 3, mixed containers
merkle_function_complex(Map<str, List<int>>) -> Option<bool>3Complex function sig
merkle_structstruct Foo { x: int, y: str }1Struct with fields
merkle_struct_nestedstruct Bar { inner: List<int> }2Struct referencing container
merkle_enum_unitenum Dir { N, S, E, W }0Enum with unit variants
merkle_enum_fieldsenum Result { Ok(int), Err(str) }1Enum with field variants
merkle_appliedFoo<int, str>1Applied generic type
merkle_schemeforall T. (T) -> T1Quantified type scheme
merkle_depth_4List<Map<str, Option<List<int>>>>4Realistic generic depth
merkle_depth_5Function returning depth-45Max expected depth

Test template:

#[test]
fn merkle_simple_container_shifted() {
    let mut p1 = Pool::new();
    let mut p2 = Pool::new();

    // Shift p2's indices by interning unrelated types
    let _ = p2.list(Idx::FLOAT);
    let _ = p2.map(Idx::STR, Idx::BOOL);
    let _ = p2.option(Idx::CHAR);
    let _ = p2.function(&[Idx::INT], Idx::STR);

    // Target type: List<int>
    let idx1 = p1.list(Idx::INT);
    let idx2 = p2.list(Idx::INT);

    assert_ne!(idx1, idx2, "Indices must differ (p2 shifted)");
    assert_eq!(p1.hash(idx1), p2.hash(idx2),
        "List<int> Merkle hash must be identical across pools");
}

Non-primitive child test (the critical case):

#[test]
fn merkle_container_of_struct_shifted() {
    let mut p1 = Pool::new();
    let mut p2 = Pool::new();

    // Shift p2 heavily
    for i in 0..20 {
        let _ = p2.list(Idx::from_raw(i % 12));  // 20 noise entries
    }

    // Create struct in both pools — will get DIFFERENT Idx values
    let name = Name::from("Point");
    let fields_1 = vec![
        (Name::from("x"), Idx::INT),
        (Name::from("y"), Idx::INT),
    ];
    let fields_2 = fields_1.clone();

    let struct_1 = p1.struct_type(name, &fields_1);
    let struct_2 = p2.struct_type(name, &fields_2);

    // Struct Idx values differ
    assert_ne!(struct_1, struct_2);
    // Struct hashes must be identical (same name, same field structure)
    assert_eq!(p1.hash(struct_1), p2.hash(struct_2));

    // NOW: List<Point> — the critical test
    let list_1 = p1.list(struct_1);
    let list_2 = p2.list(struct_2);

    // List Idx values differ
    assert_ne!(list_1, list_2);
    // List hashes must be identical (List + child hash of Point)
    assert_eq!(p1.hash(list_1), p2.hash(list_2),
        "List<Point> hash must be stable: child struct at different Idx values");
}

This is the test that the CURRENT hash function fails. It’s the litmus test for Merkle correctness: a container of a non-primitive type at different Idx positions.

Exit Criteria:

  • All 20+ test cases in the matrix passing — 22 Merkle tests (2026-02-26)
  • At least one test per tag category (primitive, simple, two-child, complex, named, variable, scheme) (2026-02-26)
  • At least one “shifted” test per category (noise types to move Idx values) (2026-02-26)
  • Non-primitive child test (List<UserStruct>) passing — merkle_container_of_struct_shifted (2026-02-26)
  • Depth 4-5 test passing — merkle_depth_4_stable + merkle_depth_5_function_stable (2026-02-26)

02.2 Collision Detection & Distribution — COMPLETE

and that the hash distribution is uniform enough for FxHashMap performance.

Approach: Generate a large set of distinct types (1,000-10,000) in a single pool, collect all hashes, and verify:

  1. No duplicate hashes for distinct types
  2. Hash distribution is roughly uniform (no clustering)

Test:

#[test]
fn merkle_no_collisions_realistic_types() {
    let mut pool = Pool::new();
    let mut hashes = FxHashSet::default();
    let mut count = 0;

    // Generate ~2000 distinct types
    let primitives = [Idx::INT, Idx::FLOAT, Idx::BOOL, Idx::STR,
                      Idx::CHAR, Idx::BYTE, Idx::UNIT];

    // Level 1: containers of primitives (7 × 7 = 49)
    let mut level1 = Vec::new();
    for &p in &primitives {
        level1.push(pool.list(p));
        level1.push(pool.option(p));
        level1.push(pool.set(p));
        level1.push(pool.iterator(p));
        level1.push(pool.range(p));
        // etc.
    }

    // Level 2: containers of level1 types
    // Level 3: functions, tuples, maps of level2

    // ... (generate ~2000 types)

    // Collect all hashes
    for idx in 0..pool.len() {
        let h = pool.hash(Idx::from_raw(idx as u32));
        let is_new = hashes.insert(h);
        assert!(is_new, "Hash collision at index {idx}: hash 0x{h:016x}");
        count += 1;
    }

    // Verify we generated enough types
    assert!(count >= 500, "Need at least 500 distinct types for meaningful collision test");
}

Distribution test (optional, for performance confidence):

#[test]
fn merkle_hash_distribution_uniform() {
    let mut pool = Pool::new();
    // ... generate types ...

    // Check that top/bottom bits have reasonable entropy
    let mut top_byte_counts = [0u32; 256];
    for idx in 0..pool.len() {
        let h = pool.hash(Idx::from_raw(idx as u32));
        top_byte_counts[(h >> 56) as usize] += 1;
    }

    // Chi-squared test or simpler: no bucket has >5x the average
    let avg = pool.len() as f64 / 256.0;
    for &count in &top_byte_counts {
        assert!((count as f64) < avg * 5.0,
            "Hash distribution skewed: bucket has {count} entries (avg {avg:.1})");
    }
}

Exit Criteria:

  • No collisions in 500+ distinct types — merkle_no_collisions_500_plus_types (578 types) (2026-02-26)
  • Hash distribution test passes (no extreme clustering) — merkle_hash_distribution_uniform (2026-02-26)
  • Test runs in <100ms (not a bottleneck) — runs in <10ms (2026-02-26)

02.3 Structural Equality Verification — COMPLETE

and that structural equality implies hash equality (no false negatives), by cross-checking Merkle hashes against a reference structural comparison.

Approach: Implement a structural_eq(p1: &Pool, idx1: Idx, p2: &Pool, idx2: Idx) -> bool function that recursively compares types by structure (tag + children), ignoring Idx values. Use it to verify that hash(idx1) == hash(idx2)structural_eq(idx1, idx2) for a large type set.

/// Reference implementation: structural equality across pools.
/// Used ONLY for testing — not for production (O(tree_depth) per comparison).
#[cfg(test)]
fn structural_eq(p1: &Pool, idx1: Idx, p2: &Pool, idx2: Idx) -> bool {
    let tag1 = p1.tag(idx1);
    let tag2 = p2.tag(idx2);
    if tag1 != tag2 {
        return false;
    }

    match tag1 {
        // Primitives: tag equality is sufficient
        tag if tag.is_merkle_leaf() && tag.is_primitive() => true,

        // Simple containers: compare child
        tag if tag.has_child_in_data() => {
            let child1 = Idx::from_raw(p1.data(idx1));
            let child2 = Idx::from_raw(p2.data(idx2));
            structural_eq(p1, child1, p2, child2)
        }

        // ... (recursive cases for all tags)
        _ => todo!()
    }
}

Test:

#[test]
fn merkle_hash_matches_structural_equality() {
    let mut p1 = Pool::new();
    let mut p2 = Pool::new();

    // Create shifted pools with ~100 types each
    // For each type in p1, create the same type in p2
    // Verify: hash equality ↔ structural equality

    let types_1: Vec<Idx> = /* ... generate types in p1 ... */;
    let types_2: Vec<Idx> = /* ... same types in p2, different order ... */;

    for (&idx1, &idx2) in types_1.iter().zip(&types_2) {
        let hash_eq = p1.hash(idx1) == p2.hash(idx2);
        let struct_eq = structural_eq(&p1, idx1, &p2, idx2);
        assert_eq!(hash_eq, struct_eq,
            "Hash/structural mismatch for {:?} vs {:?}",
            p1.format(idx1), p2.format(idx2));
    }
}

Exit Criteria:

  • structural_eq reference implementation covering all tag categories — exhaustive match on all 37 Tag variants (2026-02-26)
  • 100+ types verified: hash equality ↔ structural equality — merkle_hash_matches_structural_equality (170+ pairs) (2026-02-26)
  • No false positives (different types with same hash) — merkle_structural_neq_implies_hash_neq (2026-02-26)
  • No false negatives (same type with different hash) — verified in merkle_hash_matches_structural_equality (2026-02-26)

02.4 Debug Tooling — COMPLETE

Tools:

  1. Pool::format_hash(idx: Idx) -> String — Returns a human-readable hash breakdown:

    List<int> @ Idx(15): hash=0x1a2b3c4d5e6f7890
      tag=List, child_hash=0x0000000000000001 (INT)
  2. Pool::debug_hash_tree(idx: Idx) -> String — Recursive hash tree:

    (int, str) -> bool @ hash=0xABCD...
      Function(2 params, 1 ret)
        param[0]: int @ hash=0x0001 (primitive)
        param[1]: str @ hash=0x0003 (primitive)
        return:   bool @ hash=0x0002 (primitive)
  3. debug_assert! in hash-forwarded operations — When a hash lookup succeeds, optionally verify the found type structurally matches (cfg(debug_assertions) only).

Files:

  • compiler/ori_types/src/pool/format/mod.rs — add hash formatting methods
  • compiler/ori_types/src/pool/mod.rs — add debug assertions

Exit Criteria:

  • format_hash prints tag + hash + child hashes for any type (2026-02-26)
  • debug_hash_tree prints recursive breakdown (2026-02-26)
  • Debug assertions in hash lookups (debug builds only) — debug_assert_eq! in intern() and intern_complex() (2026-02-26)
  • Tooling used in at least one failing test to demonstrate diagnostic value — merkle_container_of_struct_shifted uses format_hash in assertions (2026-02-26)

Section 02 Completion Checklist

  • 20+ cross-pool stability tests covering all tag categories (02.1) — 22 Merkle stability tests (2026-02-26)
  • Non-primitive child test (container of user type) passing (02.1) — merkle_container_of_struct_shifted (2026-02-26)
  • Collision detection with 500+ types, zero collisions (02.2) — 578 types, 0 collisions (2026-02-26)
  • Structural equality cross-check for 100+ types (02.3) — 170+ pairs verified (2026-02-26)
  • Debug tooling (format_hash, debug_hash_tree) implemented (02.4) (2026-02-26)
  • All tests in pool/tests.rs passing — 29 pool tests, 15 format tests, 10,614 total (2026-02-26)
  • Section 02 blocking gate: ALL tests must pass before Sections 03-07 (2026-02-26)