
Conversation

@TheBlueMatt
Collaborator

No description provided.

TheBlueMatt and others added 30 commits July 16, 2025 10:56
Various dependencies have had their MSRV bumped since 0.1 was
released. Here we copy the MSRV-compliant pins from upstream so
that CI passes for 0.1 again.
Previously, the `incremental-mutants` CI job was failing on ~every PR that made actual logic changes, and nobody seemed to make any real effort to address the failures. The failing CI jobs therefore just added noise, which in turn could get us used to failing CI, introducing some risk of actual failures slipping through. Of course, it also took up some (considerable?) time in the CI queue that might be better spent on other jobs if no contributors are actually benefitting from the CI job.

Here we therefore drop `incremental-mutants` from our CI for the time
being.
…mental-mutants-0.1

Drop `incremental-mutants` CI job (0.1)
Previously, we introduced a change dynamically determining the base
branch for the check_commits CI job. Unfortunately, it used the base_ref
variable, which is only set for pull requests, not for pushes. Here, we
hence move `check_commits` to a dedicated workflow that is only run on
PRs.
…x-0.1

Move `check_commits` to a dedicated workflow CI (0.1)
If we're doing gossip backfill to a peer, we first forward all our
`channel_announcement`s and `channel_update`s in SCID-order. While
doing so, we don't forward any fresh `channel_update` messages for
any channels which we haven't yet backfilled (as we'll eventually
send the new update anyway, and it might get rejected without the
corresponding `channel_announcement`).

Sadly, our comparison for this was the wrong way, so we actually
*only* forwarded updates which were for channels we haven't yet
backfilled, and dropped updates for channels we already had
backfilled.

Backport of edc7903
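For illustration, a minimal sketch of the corrected gate (names invented, not LDK's actual code): during SCID-ordered backfill, a fresh `channel_update` is forwarded only for channels that have already been backfilled.

```rust
/// Hypothetical cursor: the highest SCID whose `channel_announcement` has
/// already been sent to this peer during backfill.
struct BackfillProgress {
    highest_sent_scid: u64,
}

/// Forward a fresh `channel_update` only for channels we've already
/// backfilled; later channels will receive the update when backfill reaches
/// them. The bug fixed here was having this comparison inverted.
fn should_forward_fresh_update(update_scid: u64, progress: &BackfillProgress) -> bool {
    update_scid <= progress.highest_sent_scid
}

fn main() {
    let progress = BackfillProgress { highest_sent_scid: 250 };
    assert!(should_forward_fresh_update(100, &progress)); // already backfilled
    assert!(!should_forward_fresh_update(300, &progress)); // not yet backfilled
}
```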
If a peer opens a channel to us, but never actually broadcasts the
funding transaction, we'll still keep a `ChannelMonitor` around for
the channel. While we maybe shouldn't do this either, when the
channel ultimately times out 2016 blocks later, we should at least
immediately archive the `ChannelMonitor`, which we do here.

Fixes lightningdevkit#3384

Backport of 744664e

Addressed nontrivial conflicts in:
 * lightning/src/chain/channelmonitor.rs due to splicing changes in
   `Balance` creation upstream

and trivial ones in:
 * lightning/src/ln/functional_tests.rs
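A rough sketch of the archiving condition described above, using invented field names and a free-standing helper rather than LDK's actual `ChannelMonitor` API:

```rust
/// Blocks after which an unfunded inbound channel is considered timed out.
const FUNDING_TIMEOUT_BLOCKS: u32 = 2016;

/// Invented, simplified view of the monitor state relevant to archiving.
struct MonitorState {
    funding_seen_onchain: bool,
    created_at_height: u32,
    claimable_balances: Vec<u64>,
}

/// If the funding transaction was never broadcast and the channel has timed
/// out, the monitor can never have anything to claim, so it is safe to
/// archive it immediately rather than keeping it around indefinitely.
fn can_archive_immediately(m: &MonitorState, best_height: u32) -> bool {
    !m.funding_seen_onchain
        && best_height >= m.created_at_height + FUNDING_TIMEOUT_BLOCKS
        && m.claimable_balances.is_empty()
}
```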
During startup, the lightning protocol forces us to fetch a ton of
gossip for channels where there is a `channel_update` in only one
direction. We then have to wait around a while until we can prune
the stale data because we don't know when the gossip sync has completed.

Sadly, doing a large prune via `remove_stale_channels_and_tracking`
is somewhat slow. Removing a large portion of our graph currently
takes a bit more than 7.5 seconds on an i9-14900K, which can
ultimately ~hang a node with a few less GHz ~forever.

The bulk of this time is in our `IndexedMap` removals, where we
walk the entire `keys` `Vec` to remove the entry, then shift it
down after removing.

Here we shift to a bulk removal model when removing channels, doing
a single `Vec` iterate + shift. This reduces the same test to
around 1.38 seconds on the same hardware.

Backport of cc028f4 without
conflicts.
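To make the shape of the change concrete, here is a simplified stand-in for the pattern (not LDK's real `IndexedMap`): the old path removed keys one at a time, each removal scanning and shifting the `keys` `Vec`, while the new path handles the whole batch in a single pass.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified stand-in for LDK's `IndexedMap`: a hash map plus a sorted key
/// `Vec` kept for ordered iteration. Names and details are illustrative only.
struct IndexedMap<V> {
    map: HashMap<u64, V>,
    keys: Vec<u64>,
}

impl<V> IndexedMap<V> {
    /// Old pattern: each removal walks `keys` and shifts the tail down, so
    /// removing k of n entries costs O(k * n).
    fn remove_one(&mut self, key: u64) -> Option<V> {
        let removed = self.map.remove(&key);
        if removed.is_some() {
            if let Some(pos) = self.keys.iter().position(|k| *k == key) {
                self.keys.remove(pos);
            }
        }
        removed
    }

    /// New pattern: remove a whole batch with a single iterate + shift over
    /// `keys`, which is what brings the prune from ~7.5s down to ~1.38s.
    fn remove_bulk(&mut self, to_remove: &HashSet<u64>) {
        for key in to_remove {
            self.map.remove(key);
        }
        self.keys.retain(|k| !to_remove.contains(k));
    }
}
```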
During startup, the lightning protocol forces us to fetch a ton of
gossip for channels where there is a `channel_update` in only one
direction. We then have to wait around a while until we can prune
the stale data because we don't know when the gossip sync has completed.

Sadly, doing a large prune via `remove_stale_channels_and_tracking`
is somewhat slow. Removing a large portion of our graph currently
takes a bit more than 7.5 seconds on an i9-14900K, which can
ultimately ~hang a node with a few less GHz ~forever.

The bulk of this time is in our `IndexedMap` removals, where we
walk the entire `keys` `Vec` to remove the entry, then shift it
down after removing.

In the previous commit we shifted to a bulk removal model for
channels, here we do the same for nodes. This reduces the same test
to around 340 milliseconds on the same hardware.

Backport of 28b526a

Resolved trivial `use` conflicts in:
 * lightning/src/routing/gossip.rs
Previously, we had a bug that particularly affected async payments where if an
outbound payment was in the state {Static}InvoiceReceived and there was a call
to process_pending_htlc_forwards, the payment would be automatically abandoned.
We would behave correctly and avoid abandoning if the payment was awaiting an
invoice, but not if the payment had an invoice but the HTLCs weren't yet locked
in.

Backport of ebb8e79

Resolved trivial conflicts in:
 * lightning/src/ln/outbound_payment.rs

and removed changes in
 * lightning/src/ln/async_payments_tests.rs as it doesn't exist in this
   branch.
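A minimal sketch of the intended behavior, using an invented state enum rather than LDK's actual outbound-payment types:

```rust
/// Invented, heavily simplified outbound-payment states.
enum OutboundPaymentState {
    AwaitingInvoice,
    InvoiceReceived,
    StaticInvoiceReceived,
    Retryable { remaining_htlcs: usize },
}

/// Periodic maintenance may only auto-abandon payments whose HTLCs are all
/// gone; payments still waiting on an invoice, or holding an invoice whose
/// HTLCs are not yet locked in, must be left alone. The bug was treating the
/// two `*InvoiceReceived` states like an HTLC-less retryable payment.
fn may_auto_abandon(state: &OutboundPaymentState) -> bool {
    match state {
        OutboundPaymentState::AwaitingInvoice
        | OutboundPaymentState::InvoiceReceived
        | OutboundPaymentState::StaticInvoiceReceived => false,
        OutboundPaymentState::Retryable { remaining_htlcs } => *remaining_htlcs == 0,
    }
}
```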
Note:
The `actions_blocking_raa_monitor_updates` list may contain stale entries
in the form of `(channel_id, [])`, which do not represent actual dangling actions.

To handle this, stale entries are ignored when accumulating pending actions
before clearing them. This ensures that the logic focuses only on relevant
actions and avoids unnecessary accumulation of already processed data.

Backport of 86a0109
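A sketch of the filtering described in the note above, with placeholder types standing in for LDK's:

```rust
use std::collections::HashMap;

type ChannelId = [u8; 32];

/// Placeholder for the real blocking-action type.
#[derive(Clone)]
struct RAAMonitorUpdateBlockingAction;

/// Accumulate dangling actions, skipping stale `(channel_id, [])` entries
/// which carry no actual blocking actions and are merely leftover map slots.
fn accumulate_pending_actions(
    blocked: &HashMap<ChannelId, Vec<RAAMonitorUpdateBlockingAction>>,
) -> Vec<(ChannelId, Vec<RAAMonitorUpdateBlockingAction>)> {
    blocked
        .iter()
        .filter(|(_, actions)| !actions.is_empty())
        .map(|(channel_id, actions)| (*channel_id, actions.clone()))
        .collect()
}
```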
Co-authored-by: Matt Corallo <649246+TheBlueMatt@users.noreply.github.com>

Backport of 45aa824 as a
prerequisite to backporting
583a9a3.

`use` conflicts resolved in:
 * lightning/src/ln/chanmon_update_fail_tests.rs
In 9cc6e08, we started using the
`RAAMonitorUpdateBlockingAction` logic to block RAAs which may
remove a preimage from one `ChannelMonitor` if it isn't durably
stored in another that is a part of the same MPP claim.

Then, in 254b78f, when we claimed
a payment, if we saw that the HTLC was already claimed in the
channel, we'd simply drop the new RAA blocker. This can happen on
reload when replaying MPP claims.

However, just because an HTLC is no longer present in
`ChannelManager`'s `Channel`, doesn't mean that the
`ChannelMonitorUpdate` which stored the preimage actually made it
durably into the `ChannelMonitor` on disk.

We could begin an MPP payment, have one channel get the preimage
durably into its `ChannelMonitor`, then step forward another update
with the peer. Then, we could reload, causing the MPP claims to be
replayed across all channels, leading to the RAA blocker(s) being
dropped and all channels being unlocked. Finally, if the first
channel managed to step forward a further update with its peer
before the (now-replayed) `ChannelMonitorUpdate`s for all MPP parts
make it to disk we could theoretically lose the preimage.

This is, of course, a somewhat comically unlikely scenario, but I
had an old note to expand the test and it turned up the issue, so
we might as well fix it.

Backport of 583a9a3

Resolved conflicts in:
 * lightning/src/ln/chanmon_update_fail_tests.rs
 * lightning/src/ln/channelmanager.rs
In 0.1 we started requiring `counterparty_node_id` to be filled in
in various previous-hop datastructures when claiming HTLCs. While
we can't switch `HTLCSource`'s
`HTLCPreviousHopData::counterparty_node_id` to required (as it
might cause us to fail to read old `ChannelMonitor`s which still
hold `HTLCSource`s we no longer need to claim), we can at least
start requiring the field in `PendingAddHTLCInfo` and
`HTLCClaimSource`. This simplifies `claim_mpp_part` marginally.

Backport of f624047

Addressed substantial conflicts in:
 * lightning/src/ln/channelmanager.rs
When we claim a payment, `Event::PaymentClaimed` contains a list of
the HTLCs we claimed from as `ClaimedHTLC` objects. While they
include the `channel_id` the payment came to us over, in theory
`channel_id`s aren't guaranteed to be unique (though in practice
they are in all opened channels aside from 0conf ones with a
malicious counterparty). Further, our APIs often require passing
both the `counterparty_node_id` and the `channel_id` to do things
to channels.

Thus, here we add the missing `counterparty_node_id` to
`ClaimedHTLC`.

Backport of 7c735d9
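The shape of the addition, sketched with placeholder types (the real `ClaimedHTLC` carries more fields plus LDK's serialization macros, all omitted here):

```rust
/// Placeholder aliases for the real secp256k1 / channel-id types.
type PublicKey = [u8; 33];
type ChannelId = [u8; 32];

/// Each claimed HTLC now records which peer it came from, so that the
/// `(counterparty_node_id, channel_id)` pair most APIs expect is available.
struct ClaimedHTLC {
    channel_id: ChannelId,
    /// Shown as an `Option` here because data serialized by older versions
    /// lacks the field; the exact representation in LDK may differ.
    counterparty_node_id: Option<PublicKey>,
    // ...amount, CLTV expiry, and other fields elided...
}
```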
Historically we indexed channels by
`(counterparty_node_id, funding outpoint)` in several pipelines,
especially the `ChannelMonitorUpdate` pipeline. This ended up
complicating quite a few things, as we always needed to store the
full `(counterparty_node_id, funding outpoint, channel_id)` tuple
to ensure we can always access a channel no matter its state.

Over time we want to move to only the
`(counterparty_node_id, channel_id)` tuple as *the* channel index,
especially as we move towards V2 channels that have a
globally-unique `channel_id` anyway.

Here we take one small step towards this, avoiding using the
channel funding outpoint in the `EventCompletionAction` pipeline.

Backport of 10df89d

Resolved conflicts in:
 * lightning/src/ln/channel.rs
 * lightning/src/ln/channelmanager.rs
We added the ability to block `ChannelMonitorUpdate`s on receipt of
an RAA in order to avoid dropping a payment preimage from a channel
that created a `PaymentSent` event in
9ede794. We did not at the time
use the same infrastructure for `PaymentClaimed` events, but really
should have. While a `PaymentClaimed` event may seem a bit less
critical than a `PaymentSent` event (it doesn't contain a payment
preimage that the user needs to make sure they store for proof of
payment), it's still important for users to ensure their payment
tracking logic is always correct.

Here we take the (relatively straightforward) action of setting a
`EventCompletionAction` to block RAA monitor updates on channels
which created a `PaymentClaimed` event. Note that we only block one
random channel from an MPP payment, not all of them, as any single
channel should provide enough information for us to recreate the
`PaymentClaimed` event on restart.

Backport of a80c855

Resolved conflicts in:
 * lightning/src/ln/async_signer_tests.rs due to changes only being
   on tests which don't exist on this branch,
 * lightning/src/ln/chanmon_update_fail_tests.rs
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/quiescence_tests.rs due to file not being
   present in this branch.

Resolved silent conflicts in:
 * lightning/src/ln/monitor_tests.rs due to slightly different test
   APIs.
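Sketched with placeholder types, the core idea is to attach a completion action for one of the claimed channels when queuing the `PaymentClaimed` event:

```rust
type PublicKey = [u8; 33];
type ChannelId = [u8; 32];

/// Invented view of one claimed MPP part.
struct ClaimedPart {
    counterparty_node_id: PublicKey,
    channel_id: ChannelId,
}

/// Illustrative version of the action that keeps RAA-triggered monitor
/// updates blocked on a channel until the event has been processed.
enum EventCompletionAction {
    ReleaseRAAChannelMonitorUpdate {
        counterparty_node_id: PublicKey,
        channel_id: ChannelId,
    },
}

/// Any single part is enough to recreate `PaymentClaimed` on restart, so only
/// one channel needs to be blocked, not every channel in the MPP set.
fn claimed_event_completion_action(parts: &[ClaimedPart]) -> Option<EventCompletionAction> {
    parts.first().map(|p| EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
        counterparty_node_id: p.counterparty_node_id,
        channel_id: p.channel_id,
    })
}
```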
`test_dup_htlc_onchain_doesnt_fail_on_reload` made reference to
`ChainMonitor` persisting `ChannelMonitor`s on each new block,
which hasn't been the case in some time. Instead, we update the
comment and code to make explicit that it doesn't impact the test.

Backport of 0a6c3fb

Resolved `use` conflicts in:
 * lightning/src/ln/payment_tests.rs
During testing, we check that a `ChannelMonitor` will round-trip
through serialization exactly. However, we recently added a fix to
change a value in `PackageTemplate` on reload to fix some issues in
the field in 0.1. This can cause the round-trip tests to fail as a
field is modified during read.

We fix it here by simply exempting the field from the equality test
in the condition where it would be updated on read.

We also make the `ChannelMonitor` `PartialEq` trait implementation
non-public as weird workarounds like this make clear that such a
comparison is a brittle API at best.

Backport of a8ec966

Resolved `use` and `rustfmt` conflicts in:
 * lightning/src/chain/channelmonitor.rs

In the upstream version of this commit, the `PartialEq`
implementation for `ChannelMonitor` was made test-only; however, to
avoid breaking a public API, we do not do so here. The changes to
`PartialEq` for `PackageTemplate` are `test`-only, though, so this
shouldn't result in any behavioral change (not that the marginal
`PartialEq` changes are likely to impact downstream crates in any
case).
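A sketch of the idea behind the test-only comparison tweak, with invented field names; the real change only ignores the field in the specific case where it would be updated on read:

```rust
/// Invented stand-in for `PackageTemplate`; field names are not LDK's.
struct PackageTemplate {
    /// The field that may be rewritten when reading an old `ChannelMonitor`.
    rewritten_on_read: u32,
    feerate_previous: u64,
}

#[cfg(test)]
impl PartialEq for PackageTemplate {
    fn eq(&self, other: &Self) -> bool {
        // Deliberately skip `rewritten_on_read`: the round-trip test would
        // otherwise fail whenever deserialization fixes the value up.
        self.feerate_previous == other.feerate_previous
    }
}
```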
On `ChannelManager` reload we rebuild the pending outbound payments
list by looking for any missing payments in `ChannelMonitor`s.
However, in the same loop over `ChannelMonitor`s, we also re-claim
any pending payments which we see we have a payment preimage for.

If we send an MPP payment across different channels, the result may
be that we'll iterate the loop, and in each iteration add a
pending payment with only one known path, then claim/fail it and
remove the pending payment (at least for the claim case). This may
result in spurious extra events, or even both a `PaymentFailed` and
`PaymentSent` event on startup for the same payment.

Backport of 8106dbf

Resolved substantial conflicts in:
 * lightning/src/ln/channelmanager.rs by simply rewriting the
   patch.

Note that the `is_channel_closed` variable used in the upstream
version of this commit replaced simply checking if the
`outpoint_to_peer` map had an entry for the channel's funding
outpoint.

Note that the next upstream commit in this series
(3239d67) is unnecessary on this
branch as we use `outpoint_to_peer` rather than `per_peer_state` to
detect channel closure here.
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, then the `ChannelMonitor` gets persisted (i.e. due
to a block update) then the node crashes, prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

When we restart and, during `ChannelManager` load, see a
`ChannelMonitor` for a closed channel, we scan it for preimages
that we passed to it and re-apply those to any pending or forwarded
payments. However, we didn't scan it for preimages it learned from
transactions on-chain. In cases where a `MonitorEvent` is lost,
this can lead to a lost preimage. Here we fix it by simply tracking
preimages we learned on-chain the same way we track preimages
picked up during normal channel operation.

Backport of 543cc85

Resolved conflicts in
 * lightning/src/ln/monitor_tests.rs due to trivial changes
   upstream as well as changes to upstream bump events and
   commitment announcement logic.
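A simplified sketch of the replay-on-load change, with invented types and field names standing in for the monitor's internal tracking:

```rust
use std::collections::HashMap;

type PaymentHash = [u8; 32];
type PaymentPreimage = [u8; 32];

/// Invented, simplified view of the two preimage sources a monitor tracks.
struct ClosedChannelMonitor {
    /// Preimages the `ChannelManager` handed to the monitor while claiming.
    preimages_from_manager: HashMap<PaymentHash, PaymentPreimage>,
    /// Preimages the monitor extracted from counterparty HTLC-success
    /// transactions it observed on-chain after closure.
    preimages_from_chain: HashMap<PaymentHash, PaymentPreimage>,
}

impl ClosedChannelMonitor {
    /// On `ChannelManager` load, preimages from both sources are replayed
    /// against pending/forwarded payments; previously only the first map was
    /// consulted, which is the gap this commit closes.
    fn preimages_to_replay(&self) -> impl Iterator<Item = (&PaymentHash, &PaymentPreimage)> {
        self.preimages_from_manager
            .iter()
            .chain(self.preimages_from_chain.iter())
    }
}
```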
Backport of 9186900

Resolved trivial conflicts in:
 * lightning/src/chain/channelmonitor.rs due to splicing's
   introduction of the `funding` field in monitors.
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, then the `ChannelMonitor` gets persisted (i.e. due
to a block update) then the node crashes, prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In a previous commit we handled the case of claimed HTLCs by
replaying payment preimages on startup to avoid `MonitorEvent` loss
causing us to miss an HTLC claim. Here we handle the HTLC-failed
case similarly.

Unlike with HTLC claims via preimage, we don't already have replay
logic in `ChannelManager` startup, but it's easy enough to add one.
Luckily, we already track when an HTLC reaches permanently-failed
state in `ChannelMonitor` (i.e. it has `ANTI_REORG_DELAY`
confirmations on-chain on the failing transaction), so all we need
to do is add the ability to query for that and fail them on
`ChannelManager` startup.

Backport of f809e6c

Resolved conflicts in:
 * lightning/src/chain/channelmonitor.rs due to splicing-related
   changes in the upstream branch,
 * lightning/src/ln/channelmanager.rs due to lack of the
   `LocalHTLCFailureReason` type in this branch, and
 * lightning/src/ln/monitor_tests.rs due to changes to upstream
   bump events and commitment announcement logic.
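A rough sketch of the finality check described above, using invented types (only `ANTI_REORG_DELAY = 6` is taken from LDK):

```rust
/// LDK's reorg safety delay, in confirmations.
const ANTI_REORG_DELAY: u32 = 6;

/// Invented record of an HTLC whose failing transaction confirmed on-chain.
struct OnchainHtlcFailure {
    payment_hash: [u8; 32],
    failure_confirmed_height: u32,
}

/// Only failures that are final (the failing transaction has
/// ANTI_REORG_DELAY confirmations) are replayed as payment failures on
/// `ChannelManager` startup.
fn permanently_failed(
    failures: &[OnchainHtlcFailure],
    best_height: u32,
) -> impl Iterator<Item = &OnchainHtlcFailure> {
    failures
        .iter()
        .filter(move |f| best_height + 1 >= f.failure_confirmed_height + ANTI_REORG_DELAY)
}
```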
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, then the `ChannelMonitor` gets persisted (i.e. due
to a block update) then the node crashes, prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In previous commits we ensured that HTLC resolutions which came to
`ChannelManager` via a `MonitorEvent` were replayed on startup if
the `MonitorEvent` was lost. However, in cases where the
`ChannelManager` was so stale that it didn't have the payment state
for an HTLC at all, we only re-add it in cases where
`ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes
it.

Constantly re-adding a payment state and then failing it
would generate lots of noise for users on startup (not to mention the
risk of confusing stale payment events for the latest state of a
payment when the `PaymentId` has been reused to retry a payment).
Thus, `get_pending_or_resolved_outbound_htlcs` does not include
state for HTLCs which were resolved on chain with a preimage or
HTLCs which were resolved on chain with a timeout after
`ANTI_REORG_DELAY` confirmations.

These criteria match the criteria for generating a `MonitorEvent`,
and work great under the assumption that `MonitorEvent`s are
reliably delivered. However, if they are not, and our
`ChannelManager` is lost or substantially old (or, in a future
where we do not persist `ChannelManager` at all), we will not end
up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the
resolution of an HTLC is complete. Note that we don't particularly
care about non-payment HTLCs, as there is no re-hydration of state
to do there - `ChannelManager` load ignores forwarded HTLCs coming
back from `get_pending_or_resolved_outbound_htlcs` as there's
nothing to do - we always attempt to replay the success/failure and
figure out if it mattered based on whether there was still an HTLC
to claim/fail.

Here we take the first step towards that notification, adding a new
`ChannelMonitorUpdateStep` for the completion notification, and
tracking HTLCs which make it to the `ChannelMonitor` in such
updates in a new map.

Backport of c49ce57

Trivial conflicts resolved in:
 * lightning/src/chain/channelmonitor.rs
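A sketch of the new update step and the monitor-side bookkeeping, with placeholder types; the real variant carries enough data to identify the specific resolved HTLC:

```rust
use std::collections::HashSet;

type PaymentHash = [u8; 32];

/// Sketch of the new update step; other, pre-existing variants are elided.
enum ChannelMonitorUpdateStep {
    ReleasePaymentComplete { payment_hash: PaymentHash },
}

/// Monitor-side tracking of HTLCs whose resolution the `ChannelManager` has
/// confirmed it fully processed, so their events are not regenerated on a
/// later reload.
#[derive(Default)]
struct CompletedResolutions {
    htlcs: HashSet<PaymentHash>,
}

impl CompletedResolutions {
    fn apply(&mut self, step: &ChannelMonitorUpdateStep) {
        let ChannelMonitorUpdateStep::ReleasePaymentComplete { payment_hash } = step;
        self.htlcs.insert(*payment_hash);
    }
}
```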
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, then the `ChannelMonitor` gets persisted (i.e. due
to a block update) then the node crashes, prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In previous commits we ensured that HTLC resolutions which came to
`ChannelManager` via a `MonitorEvent` were replayed on startup if
the `MonitorEvent` was lost. However, in cases where the
`ChannelManager` was so stale that it didn't have the payment state
for an HTLC at all, we only re-add it in cases where
`ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes
it.

Constantly re-adding a payment state and then failing it
would generate lots of noise for users on startup (not to mention the
risk of confusing stale payment events for the latest state of a
payment when the `PaymentId` has been reused to retry a payment).
Thus, `get_pending_or_resolved_outbound_htlcs` does not include
state for HTLCs which were resolved on chain with a preimage or
HTLCs which were resolved on chain with a timeout after
`ANTI_REORG_DELAY` confirmations.

These criteria match the criteria for generating a `MonitorEvent`,
and work great under the assumption that `MonitorEvent`s are
reliably delivered. However, if they are not, and our
`ChannelManager` is lost or substantially old (or, in a future
where we do not persist `ChannelManager` at all), we will not end
up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the
resolution of an HTLC is complete. Note that we don't particularly
care about non-payment HTLCs, as there is no re-hydration of state
to do there - `ChannelManager` load ignores forwarded HTLCs coming
back from `get_pending_or_resolved_outbound_htlcs` as there's
nothing to do - we always attempt to replay the success/failure and
figure out if it mattered based on whether there was still an HTLC
to claim/fail.

Here we prepare to generate the new
`ChannelMonitorUpdateStep::ReleasePaymentComplete` updates, adding
a new `PaymentCompleteUpdate` struct to track the new update before
we generate the `ChannelMonitorUpdate` and passing through to the
right places in `ChannelManager`.

The only cases where we want to generate the new update is after a
`PaymentSent` or `PaymentFailed` event when the event was the
result of a `MonitorEvent` or the equivalent read during startup.

Backport of 8b637cc

Conflicts resolved in:
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/outbound_payment.rs
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, then the `ChannelMonitor` gets persisted (i.e. due
to a block update) then the node crashes, prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In previous commits we ensured that HTLC resolutions which came to
`ChannelManager` via a `MonitorEvent` were replayed on startup if
the `MonitorEvent` was lost. However, in cases where the
`ChannelManager` was so stale that it didn't have the payment state
for an HTLC at all, we only re-add it in cases where
`ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes
it.

Constantly re-adding a payment state and then failing it
would generate lots of noise for users on startup (not to mention the
risk of confusing stale payment events for the latest state of a
payment when the `PaymentId` has been reused to retry a payment).
Thus, `get_pending_or_resolved_outbound_htlcs` does not include
state for HTLCs which were resolved on chain with a preimage or
HTLCs which were resolved on chain with a timeout after
`ANTI_REORG_DELAY` confirmations.

These criteria match the criteria for generating a `MonitorEvent`,
and work great under the assumption that `MonitorEvent`s are
reliably delivered. However, if they are not, and our
`ChannelManager` is lost or substantially old (or, in a future
where we do not persist `ChannelManager` at all), we will not end
up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the
resolution of an HTLC is complete. Note that we don't particularly
care about non-payment HTLCs, as there is no re-hydration of state
to do there - `ChannelManager` load ignores forwarded HTLCs coming
back from `get_pending_or_resolved_outbound_htlcs` as there's
nothing to do - we always attempt to replay the success/failure and
figure out if it mattered based on whether there was still an HTLC
to claim/fail.

Here we begin generating the new
`ChannelMonitorUpdateStep::ReleasePaymentComplete` updates,
updating functional tests for the new `ChannelMonitorUpdate`s where
required.

Backport of 71a364c

Conflicts resolved in:
 * lightning/src/ln/chanmon_update_fail_tests.rs
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/functional_test_utils.rs
 * lightning/src/ln/functional_tests.rs
 * lightning/src/ln/monitor_tests.rs
 * lightning/src/ln/outbound_payment.rs
 * lightning/src/ln/payment_tests.rs

Note that unlike the original commit, on this branch we do not fail
to deserialize a `ChannelMonitor` if the `counterparty_node_id` is
`None` (implying it has not seen a `ChannelMonitorUpdate` since
LDK 0.0.118). Thus, we skip the new logic in some cases, generating
a warning log instead.

As we assumed it is reasonable to require `counterparty_node_id`s
in LDK 0.2, it seems reasonable to skip the new logic here as well
(potentially generating some additional spurious payment events on
restart).
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, then the `ChannelMonitor` gets persisted (i.e. due
to a block update) then the node crashes, prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In previous commits we ensured that HTLC resolutions which came to
`ChannelManager` via a `MonitorEvent` were replayed on startup if
the `MonitorEvent` was lost. However, in cases where the
`ChannelManager` was so stale that it didn't have the payment state
for an HTLC at all, we only re-add it in cases where
`ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes
it.

Constantly re-adding a payment state and then failing it
would generate lots of noise for users on startup (not to mention the
risk of confusing stale payment events for the latest state of a
payment when the `PaymentId` has been reused to retry a payment).
Thus, `get_pending_or_resolved_outbound_htlcs` does not include
state for HTLCs which were resolved on chain with a preimage or
HTLCs which were resolved on chain with a timeout after
`ANTI_REORG_DELAY` confirmations.

These criteria match the criteria for generating a `MonitorEvent`,
and work great under the assumption that `MonitorEvent`s are
reliably delivered. However, if they are not, and our
`ChannelManager` is lost or substantially old (or, in a future
where we do not persist `ChannelManager` at all), we will not end
up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the
resolution of an HTLC is complete. Note that we don't particularly
care about non-payment HTLCs, as there is no re-hydration of state
to do there - `ChannelManager` load ignores forwarded HTLCs coming
back from `get_pending_or_resolved_outbound_htlcs` as there's
nothing to do - we always attempt to replay the success/failure and
figure out if it mattered based on whether there was still an HTLC
to claim/fail.

Here we, finally, begin actually using the new
`ChannelMonitorUpdateStep::ReleasePaymentComplete` updates,
skipping re-hydration of pending payments once they have been fully
resolved through to a user `Event`.

Backport of 226520b

Fixed conflicts in:
 * lightning/src/chain/channelmonitor.rs
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/functional_tests.rs to address differences
   causing upstream to (somewhat spuriously) generate
   `ChannelMonitorUpdate`s for channels force-closed by commitment
   transaction confirmation whereas this branch does not,
 * lightning/src/ln/payment_tests.rs
TheBlueMatt and others added 25 commits October 10, 2025 13:01
When a payment was sent and ultimately completed through an
on-chain HTLC claim which we discover during startup, we
deliberately break the payment tracking logic to keep it around
forever, declining to send a `PaymentPathSuccessful` event but
ensuring that we don't constantly replay the claim on every
startup.

However, we now have logic to complete a claim by marking it as
completed in a `ChannelMonitor` rather than replaying information
about the claim on every startup. Thus, we no longer need to take
the conservative stance and can correctly replay claims,
generating `PaymentPathSuccessful` events and allowing the state to
be removed.

Backport of ba6528f

Fixed conflicts in:
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/payment_tests.rs
We plan to rework this method in the following commit, so we start by
formatting it first.

Backport of 2fbaf96

Conflicts resolved in:
 * lightning/src/chain/channelmonitor.rs
Previously, we'd attempt to claim all HTLCs that have expired or that we
have the preimage for on each preimage monitor update. This happened due
to reusing the code path (`get_counterparty_output_claim_info`) used
when producing all claims for a newly confirmed counterparty commitment.
Unfortunately, this can result in invalid claim transactions and
ultimately in loss of funds (if the HTLC expires and the counterparty
claims it via the timeout), as it didn't consider that some of those
HTLCs may have already been claimed by a separate transaction.

This commit changes the behavior when handling preimage monitor updates
only. We will now only attempt to claim HTLCs for the specific preimage
that we learned via the monitor update. This is safe to do, as even if a
preimage HTLC claim transaction is reorged out, the `OnchainTxHandler`
is responsible for continuous claiming attempts until we see a reorg of
the corresponding commitment transaction.

Backport of 46f0d31

Conflicts resolved in:
 * lightning/src/chain/channelmonitor.rs

Silent conflicts resolved in:
 * lightning/src/ln/monitor_tests.rs
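A minimal sketch of the narrowed claim selection, with invented types; the real code operates on the counterparty commitment's HTLC set inside the monitor:

```rust
type PaymentHash = [u8; 32];

/// Invented view of an HTLC output on the counterparty's commitment.
struct CounterpartyHtlc {
    payment_hash: PaymentHash,
    output_index: u32,
}

/// After this change, a preimage monitor update only produces claims for the
/// HTLC(s) whose hash matches the newly learned preimage, rather than every
/// expired or claimable HTLC on the commitment transaction.
fn htlcs_to_claim<'a>(
    htlcs: &'a [CounterpartyHtlc],
    claimed_payment_hash: &PaymentHash,
) -> impl Iterator<Item = &'a CounterpartyHtlc> + 'a {
    let hash = *claimed_payment_hash;
    htlcs.iter().filter(move |h| h.payment_hash == hash)
}
```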
This reverts commit 3968d5e.

Not sure how this slipped in, but it breaks semver (a new field in a
public struct) for a field that isn't used, so it should be dropped.
These have been merged, causing our docs.rs builds to fail. Sadly,
we saw our docs.rs build fail for the 0.1.6 upload because of this.

Backport of 6aea586

Resolved conflicts in:
 * lightning-background-processor/src/lib.rs
 * lightning-liquidity/src/lib.rs
 * lightning/src/lib.rs
Sadly, our docs.rs build failed for 0.1.6 due to the unstable
`doc_auto_cfg` feature being removed. Here, we try to emulate
docs.rs builds as best we can in CI to head that off in the future.

Backport of d3faa03
In our upgrade/downgrade tests we want to be able to use the
functional test utils as-is externally as a part of CI in later
LDK versions. This implies that code that is `cfg(test)` really
needs to always be `cfg(any(test, feature = "_test_utils"))`.

Sadly, trying to open multiple channels back-to-back on a node
with `_test_utils` currently fails as we re-broadcast channel
announcement messages, something the functional test utils does not
expect.

Here we update the `cfg`-flagging of that rebroadcast logic to fix
back-to-back channel opens in `_test_utils`.

This narrowly backports one hunk of the much larger upstream
commit e41e756.
Backports the documentation changes from
2ce8e64
`Duration::new` adds any nanoseconds in excess of a second to the
seconds part. This addition can overflow, however, causing a panic. In 0.2 we
introduced a few further cases where we store `Duration`s,
specifically some when handling network messages.

Sadly, that introduced a remotely-triggerable crash where someone
can send us, for example, a malicious blinded path context which
can cause us to panic.

Found by the `onion_message` fuzzer

Backport of 7b9bde1
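The panic is inherent to `std::time::Duration::new`, which carries excess nanoseconds into the seconds field and panics if that addition overflows. A defensive construction for values parsed from untrusted messages looks roughly like this (the helper name is made up):

```rust
use std::time::Duration;

/// Build a `Duration` from wire-provided seconds/nanoseconds without the
/// panic `Duration::new` would hit when the nanosecond carry overflows.
fn duration_from_wire(secs: u64, nanos: u32) -> Option<Duration> {
    Duration::from_secs(secs).checked_add(Duration::from_nanos(nanos as u64))
}

fn main() {
    // A malicious peer could pick values like these to trigger the panic.
    assert!(duration_from_wire(u64::MAX, 1_000_000_000).is_none());
    assert_eq!(
        duration_from_wire(1, 500_000_000),
        Some(Duration::new(1, 500_000_000))
    );
}
```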
If we complete a `ChannelMonitorUpdate` persistence but there are
blocked `ChannelMonitorUpdate`s in the channel, we'll skip all the
post-monitor-update logic entirely. While it's correct that we can't
resume the channel (as it expected the monitor updates it generated
to complete, even if they ended up blocked), the post-update
actions are a `channelmanager.rs` concept - they cannot be tied to
blocked updates because `channelmanager.rs` doesn't even see
blocked updates.

This can lead to a channel getting stuck waiting on itself. In a
production environment, an LDK user saw a case where:
 (a) an MPP payment was received over several channels, let's call
     them A + B.
 (b) channel B got into `AwaitingRAA` due to unrelated operations,
 (c) the MPP payment was claimed, with async monitor updating,
 (d) the `revoke_and_ack` we were waiting on was delivered, but the
     resulting `ChannelMonitorUpdate` was blocked due to the
     pending claim having inserted an RAA-blocking action,
 (e) the preimage `ChannelMonitorUpdate` generated for channel B
     completed persistence, which did nothing due to the blocked
     `ChannelMonitorUpdate`.
 (f) the `Event::PaymentClaimed` event was handled but it, too,
     failed to unblock the channel.

Instead, here, we simply process post-update actions when an update
completes, even if there are pending blocked updates. We do not
fully unblock the channel, of course.

Backport of 8f4a4d2

Fixed conflicts in:
 * lightning/src/ln/chanmon_update_fail_tests.rs
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/functional_test_utils.rs
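A minimal model of the control-flow change, with invented types and without the real locking or channel state machine:

```rust
/// Invented, minimal model of the relevant channel state.
struct ChannelUpdateState {
    in_flight_monitor_updates: usize,
    blocked_monitor_updates: usize,
}

/// Called when an async `ChannelMonitorUpdate` persistence completes.
/// Post-update actions are a `ChannelManager` concept, so they must run here
/// regardless of blocked updates; previously this returned early whenever
/// `blocked_monitor_updates > 0`, leaving e.g. an MPP claim's RAA blocker in
/// place forever. The channel itself still stays paused until the blocked
/// updates are released.
fn monitor_update_completed(
    chan: &mut ChannelUpdateState,
    mut run_post_update_actions: impl FnMut(),
) {
    chan.in_flight_monitor_updates -= 1;
    run_post_update_actions();
    if chan.in_flight_monitor_updates == 0 && chan.blocked_monitor_updates == 0 {
        // ...resume normal channel operation (elided)...
    }
}
```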
…-update-even-when-blocked-0.1

[0.1] Handle mon update completion actions even when update(s) are blocked
v0.1.8 - Dec 2, 2025 - "Async Update Completion"

Bug Fixes
=========

 * In cases where an MPP payment is claimed while one channel is waiting on a
   counterparty's `revoke_and_ack` message and the `revoke_and_ack` message is
   received prior to the asynchronous completion of the MPP-claim
   `ChannelMonitorUpdate`, the channel will no longer hang (lightningdevkit#4236).
 * Deserializing invalid `Duration`s can no longer panic (lightningdevkit#4172).
@ldk-reviews-bot

ldk-reviews-bot commented Dec 6, 2025

I've assigned @jkczyz as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.
