Skip to content

Conversation

@insipx
Copy link
Contributor

@insipx insipx commented Dec 8, 2025

Deduplicate and recursively resolve orphaned envelopes in xmtp_api_d14n::protocol::InMemoryCursorStore to test icebox ordering

Convert the icebox to a HashSet with custom Eq/Hash on OrphanedEnvelope that ignore payload bytes, add recursive child resolution without removal, and adjust ordering to sort by timestamp once before the loop. Tests add regression cases, new formatting helpers, and builder utilities for orphans and cursors.

📍Where to Start

Start with the ordering flow in Ordered<T,R,S>::order in xmtp_api_d14n/src/protocol/order.rs, then review icebox behavior in InMemoryCursorStore and resolve_children_inner in xmtp_api_d14n/src/protocol/in_memory_cursor_store.rs, and the OrphanedEnvelope equality/hash changes in xmtp_proto/src/types/orphaned_envelope.rs.


Macroscope summarized 92a59ae.

@insipx
Copy link
Contributor Author

insipx commented Dec 8, 2025

@claude
Copy link

claude bot commented Dec 8, 2025

Claude finished @insipx's task —— View job


Code Review Complete

Review Date: December 15, 2025, 19:30 UTC

I've reviewed this PR and have identified several issues that haven't been covered by the existing Macroscope comments. Below are my findings:


🔴 Critical Issues

1. Potential False Positive in is_child_of with sequence_id==0 (xmtp_proto/src/types/orphaned_envelope.rs:57-60)

pub fn is_child_of(&self, cursor: &Cursor) -> bool {
    self.depends_on.contains_key(&cursor.originator_id)
        && self.depends_on.get(&cursor.originator_id) == cursor.sequence_id
}

Issue: While the contains_key check prevents incorrect matches for missing originators, there's a subtle edge case:

  • If cursor.sequence_id == 0 and depends_on DOES contain the originator with value 0, this returns true
  • However, sequence_id == 0 typically means "no message" or "initial state" in vector clocks
  • This could lead to false positives when matching dependencies

Context: Macroscope flagged similar concerns. While the implementation is technically correct, the semantics of sequence_id == 0 should be clarified.

Recommendation: Either document that sequence_id == 0 is a valid dependency, or add an explicit check to reject it if it's not.


2. Ordering Change May Break Causal Sort (xmtp_api_d14n/src/protocol/order.rs:134-135)

The refactored order method changes the ordering from prepending to appending:

New behavior:

self.envelopes.append(&mut resolved);
self.envelopes.append(&mut missing);

Problem: The old implementation prepended resolved envelopes and re-ran timestamp sort inside the loop. The new implementation:

  1. Appends resolved envelopes to the end
  2. Does NOT re-run timestamp sort inside the loop (only at line 125)

Concern: Causal sort algorithms typically require dependencies to appear before dependents in the input list. If causal sort expects this ordering, appending resolved dependencies to the end could break correctness.

Recommendation:

  • Add a comment explaining why appending is correct for this causal sort implementation
  • Verify with tests that the causal sort doesn't depend on resolved dependencies appearing before dependent envelopes

⚠️ Medium Issues

3. Hash Comment Needs Update (xmtp_proto/src/types/orphaned_envelope.rs:23-32)

The comment says "ignore the payload field" but the implementation now hashes depends_on:

// comment: "ignore the payload field"
impl Hash for OrphanedEnvelope {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        self.cursor.hash(state);
        self.group_id.hash(state);
        self.depends_on.hash(state);  // This is now included!
    }
}

Issue: The comment is outdated. Based on @insipx's response to the review, including depends_on in the hash is intentional.

Recommendation: Update the comment to clarify that both payload is ignored AND depends_on is included for deduplication purposes.


4. Variable Shadowing in recover_lost_children (xmtp_api_d14n/src/protocol/order.rs:68, 83)

let cursors: Vec<_> = ... // line 68
// ...
let cursors: HashSet<Cursor> = HashSet::from_iter(cursors); // line 83

Issue: Variable shadowing makes the code harder to maintain and could lead to errors during future modifications.

Recommendation: Use distinct variable names (e.g., cursor_vec and cursor_set or cursors_set).


5. Test Helper Excludes Unavailable from all_envelopes (xmtp_api_d14n/src/protocol/order.rs:295-300)

Macroscope correctly identified this issue. all_envelopes() doesn't include unavailable():

fn all_envelopes(&self) -> Vec<TestEnvelope> {
    self.available()
        .into_iter()
        .chain(self.missing.envelopes.clone())
        .chain(self.returned())
        .collect()
}

Impact: has_unavailable_in_dependency_chain won't find transitive dependencies that are unavailable, potentially causing incomplete validation in tests.

Recommendation: Add .chain(self.unavailable()) to include all envelopes for complete transitive checking.


ℹ️ Minor Issues / Code Quality

6. Unwrap Chains in Test Assertions (xmtp_api_d14n/src/protocol/order.rs:209, 219)

let clock = cursor.get(&env.topic().unwrap()).unwrap();

Issue: Macroscope correctly flagged this - panics before reaching prop_assert! won't include the caller's message, and #[track_caller] won't help.

Recommendation: Use .expect() with descriptive messages:

let clock = cursor
    .get(&env.topic().expect(message))
    .expect(message);

7. Test Timing Issue in orders_with_recovered_children (xmtp_api_d14n/src/protocol/order.rs:474-485)

Macroscope identified that resolve_children is called AFTER order() completes:

ordered.order().now_or_never()...  // line 487
let (second_ordering_pass, ..) = ordered.into_parts();  // line 491

// Then resolve_children is called on line 479 (in variable capture above)

Issue: If order() already recovered and removed children from the icebox, resolve_children will return empty results, making had_children false and skipping important assertions (lines 502-529).

Recommendation: Capture resolved children BEFORE calling order() for accurate test validation.


✅ Positive Observations

  1. Excellent test coverage: The proptest property-based tests are comprehensive and test complex dependency resolution scenarios
  2. Helpful debug utilities: The new DebugDisplay trait and format_enumerated/format_list methods will make debugging test failures much easier
  3. Good deduplication: Switching from Vec to HashSet for the icebox prevents duplicate orphans
  4. Recursive child resolution: The recursive approach in resolve_children_inner correctly handles transitive dependencies

Summary

Critical: Issue #2 (ordering change) should be verified to ensure it doesn't break causal sort correctness.

Medium: Issues #3-5 should be addressed for code correctness and maintainability.

Minor: Issues #6-7 are test improvements that would make failures easier to debug.

Overall, this is solid work on a complex dependency resolution system. The main concerns are around the ordering semantics and ensuring the tests accurately validate the behavior.


@insipx insipx marked this pull request as draft December 8, 2025 23:32
@insipx insipx changed the title more test cleanup DRY up ordering tests Dec 8, 2025
@insipx insipx force-pushed the insipx/icebox-in-ordering branch from 798f6ab to c7cbc5c Compare December 9, 2025 20:23
@insipx insipx force-pushed the insipx/test-ordering branch from 954055a to d39a146 Compare December 9, 2025 20:23
@insipx insipx force-pushed the insipx/icebox-in-ordering branch from c7cbc5c to f5c5824 Compare December 9, 2025 20:30
@insipx insipx force-pushed the insipx/test-ordering branch from d39a146 to 69942d5 Compare December 9, 2025 20:30
@codecov
Copy link

codecov bot commented Dec 9, 2025

Codecov Report

❌ Patch coverage is 94.05405% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.71%. Comparing base (7f5684f) to head (92a59ae).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
..._api_d14n/src/protocol/utils/test/test_envelope.rs 76.47% 8 Missing ⚠️
xmtp_api_d14n/src/protocol/order.rs 96.96% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2916      +/-   ##
==========================================
+ Coverage   73.59%   73.71%   +0.11%     
==========================================
  Files         405      405              
  Lines       51710    51883     +173     
==========================================
+ Hits        38057    38245     +188     
+ Misses      13653    13638      -15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@insipx insipx force-pushed the insipx/test-ordering branch from 69942d5 to 461cc93 Compare December 9, 2025 20:36
@insipx insipx force-pushed the insipx/icebox-in-ordering branch from f5c5824 to 2cfa1e1 Compare December 9, 2025 20:36
@insipx insipx force-pushed the insipx/test-ordering branch from 461cc93 to c3c2c48 Compare December 9, 2025 20:39
@insipx insipx force-pushed the insipx/icebox-in-ordering branch from 2cfa1e1 to 8fcb993 Compare December 9, 2025 20:39
@insipx insipx marked this pull request as ready for review December 9, 2025 20:39
@insipx insipx changed the title DRY up ordering tests test icebox ordering Dec 9, 2025
@insipx insipx force-pushed the insipx/icebox-in-ordering branch from 8fcb993 to 949d056 Compare December 10, 2025 16:20
@insipx insipx force-pushed the insipx/test-ordering branch from c3c2c48 to cb3d961 Compare December 10, 2025 16:20
@insipx insipx mentioned this pull request Dec 10, 2025
@insipx insipx force-pushed the insipx/test-ordering branch from 241233c to f2e74fa Compare December 11, 2025 19:43
@insipx insipx force-pushed the insipx/icebox-in-ordering branch 2 times, most recently from 3b21dc8 to 9a5370b Compare December 11, 2025 20:42
@insipx insipx force-pushed the insipx/test-ordering branch from f2e74fa to 13e19c2 Compare December 11, 2025 20:42
@insipx insipx force-pushed the insipx/icebox-in-ordering branch from 9a5370b to debc116 Compare December 11, 2025 21:02
@insipx insipx force-pushed the insipx/test-ordering branch from 13e19c2 to 7eec7fa Compare December 11, 2025 21:02
@insipx insipx force-pushed the insipx/test-ordering branch from 7eec7fa to 93c4fb2 Compare December 15, 2025 17:18
@insipx insipx force-pushed the insipx/icebox-in-ordering branch from debc116 to 5ab6d8e Compare December 15, 2025 17:18
@insipx insipx force-pushed the insipx/icebox-in-ordering branch from 5ab6d8e to 7840c4a Compare December 15, 2025 17:53
@insipx insipx force-pushed the insipx/test-ordering branch 2 times, most recently from 00361b4 to ecdd510 Compare December 15, 2025 19:08
@insipx insipx force-pushed the insipx/icebox-in-ordering branch from 7840c4a to 38c66a7 Compare December 15, 2025 19:08
Base automatically changed from insipx/icebox-in-ordering to main December 15, 2025 19:21
@insipx insipx force-pushed the insipx/test-ordering branch 3 times, most recently from 7d5e9ae to d9096ee Compare December 15, 2025 19:27
@insipx insipx enabled auto-merge (squash) December 15, 2025 19:27
@insipx insipx disabled auto-merge December 15, 2025 19:39
@insipx insipx force-pushed the insipx/test-ordering branch from d9096ee to 92a59ae Compare December 15, 2025 20:15
@insipx insipx enabled auto-merge (squash) December 15, 2025 20:15
@insipx insipx merged commit 6308da5 into main Dec 15, 2025
17 checks passed
@insipx insipx deleted the insipx/test-ordering branch December 15, 2025 20:27
rygine pushed a commit that referenced this pull request Dec 28, 2025
<!-- Macroscope's pull request summary starts here -->
<!-- Macroscope will only edit the content between these invisible
markers, and the markers themselves will not be visible in the GitHub
rendered markdown. -->
<!-- If you delete either of the start / end markers from your PR's
description, Macroscope will post its summary as a comment. -->
### Refactor icebox ordering by deduplicating `OrphanedEnvelope` in
`protocol.InMemoryCursorStore` and adjusting ordering tests for icebox
behavior
Replace the icebox `Vec` with a `HashSet` in `InMemoryCursorStore`,
adjust recursive child resolution, and update ordering and tests to
remove per-iteration timestamp sorting and to assert dependency
resolution. Add custom equality for `OrphanedEnvelope` that ignores
payload and introduce small test utilities for formatting and cursor
iteration.

#### 📍Where to Start
Start with `Ordered<T,R,S>::order` in
[file:xmtp_api_d14n/src/protocol/order.rs], then review
`InMemoryCursorStore` and `resolve_children_inner` in
[file:xmtp_api_d14n/src/protocol/in_memory_cursor_store.rs].

----
<!-- Macroscope's review summary starts here -->

<details>
<summary>📊 <a href="https://app.macroscope.com">Macroscope</a>
summarized 74be9b4. 17 files reviewed, 14 issues evaluated, 12 issues
filtered, 0 comments posted</summary>

### 🗂️ Filtered Issues
<details>
<summary>xmtp_api_d14n/src/protocol/in_memory_cursor_store.rs — 0
comments posted, 3 evaluated, 3 filtered</summary>

- [line
12](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/in_memory_cursor_store.rs#L12):
Changing `icebox` from `Vec<OrphanedEnvelope>` to
`HashSet<OrphanedEnvelope>` can cause data loss. The `Hash` and
`PartialEq` implementations for `OrphanedEnvelope` intentionally ignore
the `payload` field. If two orphaned envelopes have identical `cursor`,
`depends_on`, and `group_id` but different `payload` values, only one
will be stored in the `HashSet`, silently losing the other envelope's
payload data. <b>[ Low confidence ]</b>
- [line
129](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/in_memory_cursor_store.rs#L129):
The `resolve_children` method no longer removes resolved orphans from
the icebox. The old implementation used `extract_if` to remove matching
orphans, but the new implementation only filters and clones without
modifying the icebox. This causes orphaned envelopes to accumulate
indefinitely, leading to memory growth and returning the same
already-resolved orphans on subsequent calls. <b>[ Already posted ]</b>
- [line
155](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/in_memory_cursor_store.rs#L155):
The recursive `resolve_children_inner` function includes all children
whose `depends_on` is not empty (line 155), but orphaned envelopes by
definition always have non-empty `depends_on`. The condition `if
!c.depends_on.is_empty()` at line 155 should likely check something more
specific, such as whether the child's dependencies were actually
satisfied by the input cursors, to avoid unnecessary recursive calls.
<b>[ Low confidence ]</b>
</details>

<details>
<summary>xmtp_api_d14n/src/protocol/order.rs — 0 comments posted, 10
evaluated, 9 filtered</summary>

- [line
83](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/order.rs#L83):
In `recover_lost_children`, the variable `cursors` is shadowed - first
declared as `Vec<Cursor>` on line 68, then redeclared as
`HashSet<Cursor>` on line 83. While not a runtime bug, this could lead
to confusion if the `Vec` version is accidentally used after the
shadowing (currently it's not, but future modifications could introduce
errors). <b>[ Low confidence ]</b>
- [line
134](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/order.rs#L134):
The new `order` implementation changes the ordering from prepending
(`splice(0..0, ...)`) to appending (`append`). The old code placed
resolved envelopes at the front followed by missing envelopes, then
re-ran timestamp sort. The new code appends them at the end without
re-running timestamp sort inside the loop. This ordering change could
affect causal sort correctness if the sort algorithm depends on resolved
dependencies appearing before dependent envelopes in the input list.
<b>[ Low confidence ]</b>
- [line
209](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/order.rs#L209):
In `assert_topic_cursor_seen`, the chained `.unwrap()` calls on line 209
(`cursor.get(&env.topic().unwrap()).unwrap()`) will panic with an
unclear message if either `env.topic()` returns `None` or if the topic
is not present in the cursor. This could produce confusing test failures
since the panic message won't indicate which unwrap failed. Consider
using `expect()` with descriptive messages or propagating the error via
`prop_assert!`. <b>[ Already posted ]</b>
- [line
219](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/order.rs#L219):
In `assert_dependencies_satisfied`, the call `env.topic().unwrap()` on
line 219 will panic if `env.topic()` returns `None`. While similar code
exists in `assert_topic_cursor_seen`, this function uses
`get_or_default` for the cursor lookup (which handles missing topics
gracefully), creating an inconsistency where topic existence is handled
differently than in the sibling helper function. <b>[ Already posted
]</b>
- [line
283](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/order.rs#L283):
Potential deadlock risk in `make_available`: locks are acquired in order
`unavailable` then `available`. If any other code path acquires these
locks in the opposite order (`available` then `unavailable`), a deadlock
could occur. <b>[ Low confidence ]</b>
- [line
285](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/order.rs#L285):
In `make_available`, calling `v.remove(idx)` will panic if `idx >=
v.len()`. There is no bounds check before attempting to remove the
element at the given index from `unavailable`. <b>[ Low confidence ]</b>
- [line
295](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/order.rs#L295):
In `all_envelopes()`, the `unavailable()` envelopes are excluded from
the result. This causes `find_envelope()` to return `None` for any
envelope that is in the unavailable list. In
`has_unavailable_in_dependency_chain()`, if a transitive dependency
itself is in the unavailable list, `find_envelope(&cursor)` won't find
it, so its own dependencies won't be checked, potentially missing
transitive unavailable dependencies in the chain. <b>[ Already posted
]</b>
- [line
392](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/order.rs#L392):
In `orders_with_unresolvable_dependencies`,
`assert_dependencies_satisfied(envelope, &mut topic_cursor)` at line 392
takes a mutable reference to `topic_cursor` inside a loop. If this
function modifies `topic_cursor`, each subsequent iteration operates on
modified state, which could cause inconsistent verification behavior or
false positives/negatives in the test. <b>[ Low confidence ]</b>
- [line
485](https://github.com/xmtp/libxmtp/blob/74be9b416be85024e9af0acee30e53d59a90239b/xmtp_api_d14n/src/protocol/order.rs#L485):
In `orders_with_recovered_children`,
`store.resolve_children(&[newly_available.cursor()])` at line 485 is
called AFTER `ordered.order()` completes at line 476-478. If `order()`
already processed and removed children from the store's orphan/icebox
during ordering, this call may return an empty result even when children
were actually recovered. This would cause `had_children` to be `false`,
skipping the assertions at lines 501-527 and potentially masking bugs in
the ordering logic where children aren't properly recovered. <b>[
Already posted ]</b>
</details>


</details><!-- Macroscope's review summary ends here -->

<!-- Macroscope's pull request summary ends here -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants