Skip to content

Conversation

@insipx
Copy link
Contributor

@insipx insipx commented Dec 9, 2025

if there's an envelope sequence like:

D -> C -> B -> A

where -> indicates a dependency relationship, i.e A -> B means "A depends on B"

a list like:

C A B D

Would return C and D as missing without this fix, since the topic cursor did not yet contain necessary dependencies A and B before processing C, which D then depends on. I thought this would be a non-issue, but in reality it made ordering deeply nested chains of dependencies much more difficult. in this PR, on every dependency we can resolve, a new functions checks if any of the dependencies in Missed can now be processed

@insipx insipx requested a review from a team as a code owner December 9, 2025 20:23
@insipx
Copy link
Contributor Author

insipx commented Dec 9, 2025

@claude
Copy link

claude bot commented Dec 9, 2025

Claude finished @insipx's task —— View job


Code Review - December 10, 2025, 16:21 UTC

Overview

This PR successfully addresses an important limitation in the causal sort algorithm where deeply nested dependency chains within the same batch couldn't be resolved internally. The implementation is sound and well-tested.

✅ Strengths

  1. Elegant solution: The recover_newly_valid approach cleanly solves the within-batch dependency resolution problem
  2. Good test coverage: The new reapplies_within_array property test specifically validates the core fix by shuffling a sorted dependency chain
  3. Performance-conscious: Using extract_if is efficient for filtering and collecting in one pass
  4. Clear naming: Missed<E> struct and recover_newly_valid method names communicate intent well

🔍 Potential Issues

1. Performance consideration (causal.rs:60-62)

The current implementation calls recover_newly_valid after every successfully applied envelope, and immediately splices the results back into the vector being processed. This has quadratic behavior characteristics:

if vector_clock.dominates(&last_seen) {
    self.topic_cursor.apply(env)?;
    let newly_valid = self.recover_newly_valid(&mut missed);  // O(|missed|)
    i += 1;
    self.envelopes.splice(i..i, newly_valid.into_iter());     // O(k) shift
}

For deep dependency chains (e.g., 100 envelopes each depending on the previous), this could result in:

  • recover_newly_valid called 100 times, each scanning the missed list
  • Multiple splice operations shifting elements

Suggested optimization: Consider batching the recovery process. After processing all immediately-valid envelopes in one pass, check for newly-valid ones, then repeat. This would reduce the number of iterations through missed.

2. Unstable API usage (causal.rs:40)

.extract_if(.., |m| {

extract_if is currently unstable (nightly-only feature). This might indicate the codebase is on nightly, but worth noting if stability is a concern.

3. Edge case: Iterator invalidation during splice (causal.rs:62)

The code splices newly recovered envelopes into the current position while iterating:

self.envelopes.splice(i..i, newly_valid.into_iter());

While this should work correctly since you increment i before splicing, it's a subtle pattern. The newly inserted envelopes will be processed in subsequent loop iterations, which is the intended behavior. However, this could benefit from a comment explaining the intentional re-processing of newly inserted envelopes.

📊 Test Coverage Assessment

✅ Both property tests are comprehensive:

  • causal_sort: Tests with randomly removed dependencies (missing envelopes)
  • reapplies_within_array: New test specifically validates the fix by shuffling a fully-resolvable dependency chain

The test assertions properly verify that:

  • Sorted envelopes don't depend on missing/removed ones
  • Missing envelopes actually have unmet dependencies

🔒 Security & Correctness

No security concerns identified. The algorithm maintains causal ordering invariants correctly.

💡 Minor Suggestions

  1. Consider adding a code comment at line 60 explaining why recover_newly_valid is called after each envelope application
  2. The Missed::into_envelope method (line 28-30) could be simplified to just destructure and return, but current form is fine
  3. Consider documenting the time complexity in the function docs for recover_newly_valid

Summary

This is a solid fix that correctly addresses the reported issue. The main consideration is whether the O(n²) behavior in worst-case scenarios (deep dependency chains) is acceptable for your use case. If typical batches are small or dependencies are shallow, the current implementation is perfectly fine. Otherwise, consider the batching optimization mentioned above.

Recommendation: ✅ Approve with the performance consideration noted for future optimization if needed.


@insipx insipx changed the title reapply within causal sort internally resolve envelopes in causal sort Dec 9, 2025
@insipx insipx force-pushed the insipx/causal-fix branch from 417931a to 302503e Compare December 9, 2025 20:29
@macroscopeapp
Copy link
Contributor

macroscopeapp bot commented Dec 9, 2025

Resolve within-batch causal dependencies inside protocol::sort::CausalSort::sort to internally reapply envelopes during sorting in causal.rs

Add Missed<E> to track unmet dependencies, implement CausalSort::recover_newly_valid to extract newly satisfied envelopes, and modify CausalSort::sort to reinsert them during a single pass; update tests to validate within-array reapplication.

📍Where to Start

Start with CausalSort::sort in causal.rs, then review CausalSort::recover_newly_valid and the Missed<E> struct.


Macroscope summarized 302503e.

@codecov
Copy link

codecov bot commented Dec 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.86%. Comparing base (201e20c) to head (3a8a1e7).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2921      +/-   ##
==========================================
- Coverage   73.86%   73.86%   -0.01%     
==========================================
  Files         397      397              
  Lines       50612    50634      +22     
==========================================
+ Hits        37387    37403      +16     
- Misses      13225    13231       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@insipx insipx merged commit 08ec6fd into main Dec 10, 2025
17 checks passed
@insipx insipx deleted the insipx/causal-fix branch December 10, 2025 19:41
rygine pushed a commit that referenced this pull request Dec 28, 2025
if there's an envelope sequence like:

```
D -> C -> B -> A
```
where `->` indicates a dependency relationship, i.e `A -> B` means "A
depends on B"

a list like:
```
C A B D
```

Would return `C` and `D` as missing without this fix, since the topic
cursor did not yet contain necessary dependencies `A` and `B` before
processing `C`, which `D` then depends on. I thought this would be a
non-issue, but in reality it made ordering deeply nested chains of
dependencies much more difficult. in this PR, on every dependency we can
resolve, a new functions checks if any of the dependencies in `Missed`
can now be processed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants