-
Notifications
You must be signed in to change notification settings - Fork 74
Readonly Mode #2835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Readonly Mode #2835
Conversation
How to use the Graphite Merge QueueAdd the label mergequeue to this PR to add it to the merge queue. You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. |
|
I'll analyze this and get back to you. |
8677ed5 to
048f6ba
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2835 +/- ##
==========================================
- Coverage 74.56% 74.44% -0.13%
==========================================
Files 387 387
Lines 49959 49556 -403
==========================================
- Hits 37254 36891 -363
+ Misses 12705 12665 -40 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
70babca to
d3eedac
Compare
| ) -> Result<SyncSummary, GroupError> { | ||
| let mut summary = SyncSummary::default(); | ||
|
|
||
| if self.context.readonly_mode() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love to understand a little more about what the callers should do in this situation. We don't really know if the the intent has succeeded or failed. I'd imagine in read only mode we aren't expecting a lot of calls to this (since it's typically triggered by writes), but I want to make sure misuse doesn't lead to someone thinking a thing worked when it didn't or leading to an infinite loop.

Add readonly client mode and route it through builders, context, and sync paths to make registration and rotations no-ops in xmtp_mls
Introduce
ClientMode::Readonly, thread it viaClientBuilder.with_client_mode(...)intoXmtpMlsLocalContext, and gate registration, key rotation, intents, and post-commit syncing behindreadonly_mode(). Update FFI/Node/WASMcreate_clientto accept optional mode and map to coreClientMode. Replace test API selection withTesterApiModeand add targeted readonly tests.📍Where to Start
Start with
ClientModeand context wiring in context.rs, then follow builder propagation in builder.rs and readonly guards in client and group sync paths in client.rs and groups/mls_sync.rs.Changes since #2835 opened
📊 Macroscope summarized b31d94f. 15 files reviewed, 26 issues evaluated, 26 issues filtered, 0 comments posted
🗂️ Filtered Issues
bindings_ffi/benches/create_client.rs — 0 comments posted, 7 evaluated, 7 filtered
unwrap()onident.inbox_id(nonce)can panic at runtime ifIdentifier::inbox_idreturns anErr(e.g., if the generated Ethereum address is deemed invalid byis_valid_address). Replace with proper error handling and surfacing of the failure. [ Out of scope ]tmp_path()(used here) can panic due tostd::env::temp_dir().to_str().unwrap()if the temp directory path is not valid UTF-8 on the host system. Use a fallible conversion or lossless path handling (e.g.,PathBuforto_string_lossy). [ Out of scope ]unwrap()on the result ofxmtpv3::mls::create_client(...).awaitcan panic if client creation fails (e.g., network/backend issues). Prefer handling the error and reporting it within the benchmark. [ Out of scope ]unwrap()onident.inbox_id(nonce)can panic at runtime if identifier validation fails. Replace with fallible handling to keep the benchmark from aborting. [ Out of scope ]tmp_path()here can panic because it usesto_str().unwrap()on the system temp directory path, which may not be UTF-8. Use a fallible or lossy conversion, or accept aPathBuf. [ Out of scope ]unwrap()onxmtpv3::mls::create_client(...).awaitinside the warm-upblock_oncan panic if client creation fails (network/backends). Handle the error instead of aborting the benchmark process. [ Out of scope ]unwrap()onxmtpv3::mls::create_client(...).awaitin the benchmark iteration can panic on backend/client creation failures. Prefer propagating or recording the error instead of panicking. [ Out of scope ]bindings_wasm/src/client.rs — 0 comments posted, 2 evaluated, 2 filtered
Notificationvariant from the#[wasm_bindgen]-exposed enumClientModeis a breaking change for existing JavaScript callers. Any previously valid JS code passingClientMode.Notification(or the corresponding discriminant value) will now be rejected at the wasm boundary, causing a runtime error.wasm_bindgenenforces enum value validity and will throw if an unknown discriminant is provided, so the removal can lead to immediate failures whencreateClientis invoked withclient_modeset to the formerNotificationvalue. This violates contract parity for the public API and can produce runtime exceptions in production. [ Low confidence ]create_client, when anencryption_keyis provided, the code validates it (let _key: EncryptionKey = key.try_into()?) but then discards it and constructs the database withWasmDb::new(&storage_option)without passing or applying the encryption key. As a result, the encryption key is effectively ignored and the store may be created unencrypted even though the caller supplied a key. The error messages suggest an encrypted store is intended, but no encryption configuration is applied to the DB. The key should be used to initialize the database or theEncryptedMessageStoreso that data is actually encrypted. [ Out of scope ]xmtp_mls/src/builder.rs — 0 comments posted, 8 evaluated, 8 filtered
identity_strategyis forced toIdentityStrategy::CachedOnlyrather than preserving any prior strategy. If the original setup expectedCreateIfNotFound, rebuilding may fail to create missing identities. [ Out of scope ]allow_offlineis hardcoded tofalserather than reflecting the existingClientconfiguration. Rebuilt clients may incorrectly disallow offline mode. [ Out of scope ]disable_eventsis set based oncfg(test)rather than preserved from the sourceClient. This can flip event behavior unexpectedly when rebuilding. [ Out of scope ]disable_commit_log_workeris hardcoded tofalserather than reflecting the original client's configuration, potentially enabling a worker unexpectedly. [ Out of scope ]cursor_storeis set toNoneinstead of preserving any existing cursor store from the sourceClient. This can disable or reset cursor-based behavior after rebuild. [ Out of scope ]client_modeis always set toNonerather than preserving theClient's existing mode. This can change runtime behavior when rebuilding from an existing client. [ Low confidence ]EVENTS_ENABLEDis only ever set totrueduring build whendisable_eventsisfalse, but it is never set back tofalsewhendisable_eventsistrue. This leaks global state across client instances: once a prior client enables events, constructing a later client withdisable_events = truewill not actually disable event tracking, violating the caller’s expectation and the macro’s contract for being configurable per client. This is a persistent mutation of global state without scoped restoration, which can cause unexpected event writes and logs. Consider explicitly settingEVENTS_ENABLED.store(false, Ordering::SeqCst)whendisable_eventsistrue, or scoping the flag per-client instead of global. [ Out of scope ]enable_api_statsreturnsClientBuilderError::MissingParameter { parameter: "api_client" }when eitherself.api_clientORself.sync_api_clientisNone. This produces a misleading error message whensync_api_clientis missing, obscuring the actual missing parameter and hindering correct recovery. The guard should differentiate which parameter is missing or report both. [ Out of scope ]xmtp_mls/src/client.rs — 0 comments posted, 4 evaluated, 4 filtered
find_inbox_ids_from_identifierssilently returnsNonefor duplicate identifiers in the input slice because it usesHashMap::removefor bothcached_inbox_idsandnew_inbox_ids. On the first occurrence the entry is removed and returned; on subsequent occurrences of the same identifier,removereturnsNone, causing data loss for valid duplicates. Useget(orget_mutwithout removal) instead to preserve repeated lookups. Problematic lines are wherecached_inbox_ids.remove(&cache_key)andnew_inbox_ids.remove(&ident.into())are used during the final collection. [ Out of scope ]delete_message, events are sent onself.context.local_events()whereas other methods in the same impl (e.g.,create_group,create_dm_by_inbox_id) send onself.local_events. If these senders are not the exact same channel, deletion notifications may not reach subscribers that listen to the client’slocal_events, causing inconsistent externally visible behavior. Use a single, consistent sender (self.local_events) for client-originated local events, or ensure both are the same and document it. [ Out of scope ]register_identity()stores a locally generated key package before verifying the providedSignatureRequest, and on verification failure it returns early without cleaning up the newly stored package. Specifically,generate_and_store_key_package(...)persists state, thenbuild_identity_update().to_verified(...).await?may error, but there is no rollback or deletion of the stored key package in this error path. This leaves an unused key package artifact in local storage, violating the single paired cleanup invariant for effects introduced before verification succeeds. [ Low confidence ]register_identity()performs network-sideupload_key_package(...)(Step 3) beforepublish_identity_update(...)(Step 4). If upload succeeds but publish fails, the function returns an error without compensating actions. This leaves an orphaned key package on the network and an inconsistent local state (no identity stored/ready), violating the effect sequencing/cleanup invariant for partial failure. There is no retry of publish in-place nor rollback of the uploaded package. [ Low confidence ]xmtp_mls/src/groups/intents/queue.rs — 0 comments posted, 1 evaluated, 1 filtered
queue_for_eachassumes allMlsGroup<C>entries share the samecontext, but it only takes thecontextfrom an arbitrary first element and uses that single storage connection to queue intents for every group. Ifgroupscontains entries from different contexts/databases, intents for non-first-context groups will be queued against the wrong store, causing misrouting or partial failure without surfacing an error. There is no validation that all groups have the samecontextnor partitioning by context. Fix by either (a) validating all groups share the samecontextand returning an error otherwise, or (b) partitioning groups bycontextand performing a separate transaction per context. [ Low confidence ]xmtp_mls/src/groups/mls_sync.rs — 0 comments posted, 3 evaluated, 3 filtered
sync_with_connreturnsOk(summary)when only one ofpublish_errorsorpost_commit_errorsis non-empty becauseSyncSummary::is_errored()is defined asself.other.is_some() || (!self.publish_errors.is_empty() && !self.post_commit_errors.is_empty()). This AND condition means a single class of errors (publish-only or post-commit-only) will not trigger an error result, causing callers to treat a partially failed sync as success. This can suppress retries and misreport sync status. The check at lines405..409relies on this method, so the function can incorrectly return success despite errors. [ Out of scope ]elapsed_nsis computed usingnow_ns(line 2432) which castsas_nanos()(u128) toi64. Incommon/src/time.rs,now_ns()doesduration_since_epoch().as_nanos() as i64, which will truncate and wrap once the epoch duration exceedsi64::MAXnanoseconds (circa year 2262). This can produce negative/incorrectnow_ns, causingelapsed_nsto be wrong and suppressing or triggering updates incorrectly. Use a wider type (e.g.,i128oru128) for time arithmetic, or clamp/handle overflow explicitly before comparisons. [ Out of scope ]update_interval_nsis accepted asOption<i64>without validation. If a caller supplies a negative interval, the comparisonelapsed_ns > interval_nson line 2435 will be true for most realisticelapsed_ns, causing near-always updates and potentially excessive DB writes. Reject negatives or clamp to0before use: e.g.,let interval_ns = update_interval_ns.map_or(SYNC_UPDATE_INSTALLATIONS_INTERVAL_NS, |v| v.max(0));. [ Out of scope ]xmtp_mls/src/identity.rs — 0 comments posted, 1 evaluated, 1 filtered
Identity::queue_key_rotationclaims in its docstring "If no key rotation is scheduled, queue it to occur in the next 5 seconds.", but the implementation unconditionally callsconn.queue_key_package_rotation()?without checking whether a rotation is already scheduled. This can result in duplicate scheduling ifconn.queue_key_package_rotationis not idempotent, and the log message "Last key package not ready for rotation, queued for rotation" may be misleading. Either add a guard to check for an existing scheduled rotation before enqueuing, or update the documentation to match behavior. [ Low confidence ]