fix: auto-recover stale network graph cache #765
jvsena42 wants to merge 10 commits into fix/node-stopping-bg-payments
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…nto fix/stale-graph
Note: This is more like a fallback. The source issue must be investigated further.
Testing...
Relevant Logs. I had to uninstall the previous app because of different signatures.
Succeeded on multiple attempts of 4k sats.
Related LDK PR
    if (graphNodes.isEmpty()) {
        Logger.debug("Network graph is empty, skipping validation", context = TAG)
        return true
idea: consider distinguishing "fresh install / post-reset" from "corrupted graph" here.
Right now an empty graph passes validation unconditionally. After a graph reset + restart, this is expected — RGS hasn't synced yet. But if a graph has a non-null latestRgsSnapshotTimestamp and is still empty, that could indicate corruption rather than a clean slate.
A possible approach:
    if (graphNodes.isEmpty()) {
        val rgsTimestamp = node.status().latestRgsSnapshotTimestamp
        if (rgsTimestamp != null) {
            Logger.warn("Network graph is empty despite RGS timestamp $rgsTimestamp", context = TAG)
            return false
        }
        Logger.debug("Network graph is empty (fresh install), skipping validation", context = TAG)
        return true
    }

This would catch the edge case where VSS migration or file corruption produces a graph file that RGS treats as "up to date" but contains no nodes, which is exactly the failure mode we saw on mainnet (0 RGS updates applied on top of a stale VSS snapshot).
Not a blocker — just a defense-in-depth idea for a follow-up.
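For illustration, the three cases discussed in this comment (fresh install, empty despite an RGS timestamp, populated) could be made explicit with a small pure function. This is only a sketch under assumptions: GraphState and classifyGraph are hypothetical names that do not appear in the PR, and the timestamp is modeled as a plain Long? rather than whatever type node.status().latestRgsSnapshotTimestamp actually returns.

```kotlin
// Hypothetical names for illustration; none of these exist in the PR.
enum class GraphState { FRESH, SUSPECT_EMPTY, POPULATED }

// Classifies a graph from its node count and the last RGS snapshot timestamp.
// Assumption: a null timestamp means RGS has never synced on this install.
fun classifyGraph(nodeCount: Int, latestRgsSnapshotTimestamp: Long?): GraphState = when {
    nodeCount > 0 -> GraphState.POPULATED
    // Empty, yet RGS claims to have synced: possible corruption.
    latestRgsSnapshotTimestamp != null -> GraphState.SUSPECT_EMPTY
    // Empty and never synced: fresh install or post-reset.
    else -> GraphState.FRESH
}
```

With this shape, only SUSPECT_EMPTY would fail validation for an empty graph; FRESH would still skip it, matching the current behavior.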
This PR adds automatic validation and recovery for stale RGS network graph caches that cause Lightning payment routing failures.
Description
When the app uses RGS (Rapid Gossip Sync) delta updates, nodes that are missing from a stale cached graph won't be restored by incremental syncs. This caused RouteNotFound errors when trying to pay certain destinations like Blink wallet.
- Add validateNetworkGraph() to check if trusted peers (Blocktank LSP nodes) are present in the graph
- Add resetNetworkGraph() to delete the cached graph file when stale
Preview
pay-to-blink.mp4
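As a rough sketch of the two building blocks named in the Description, the check and the reset could look like the helpers below. These are not the PR's implementations: missingTrustedPeers and deleteGraphCache are hypothetical names, node IDs are modeled as plain strings, and the graph file location is passed in because the actual path is not stated here. The sketch also assumes that any missing trusted peer marks the graph as stale.

```kotlin
import java.io.File

// Hypothetical helper: returns the trusted peer IDs that are absent from the graph.
fun missingTrustedPeers(
    graphNodeIds: Collection<String>,
    trustedPeerIds: Collection<String>,
): List<String> {
    val present = graphNodeIds.toHashSet() // O(1) lookups for large graphs
    return trustedPeerIds.filterNot { it in present }
}

// Hypothetical helper: deletes the cached graph file.
// Returns true if no file remains afterwards (including when none existed).
fun deleteGraphCache(graphFile: File): Boolean =
    !graphFile.exists() || graphFile.delete()
```

An empty result from missingTrustedPeers would correspond to the "all X trusted peers present" case.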
QA Notes
1. Verify stale graph auto-recovery
- Network graph missing X trusted peers
- Network graph is stale, resetting and restarting...
- Network graph validated: all X trusted peers present
2. Verify normal startup (no stale cache)
- Network graph validated: all X trusted peers present
3. Verify payment routing works
- No RouteNotFound error
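The log sequence in the QA notes can be reproduced with a small, dependency-free sketch of the validate, reset, and restart flow. Everything here is hypothetical and is not the PR's code: startWithGraphRecovery, its callback parameters, and in particular the maxResets guard against restart loops are assumptions added for illustration. Only the log strings are taken from the QA notes.

```kotlin
// Hypothetical orchestration sketch; callbacks stand in for the real node APIs.
fun startWithGraphRecovery(
    trustedPeerCount: Int,
    countMissing: () -> Int,       // how many trusted peers are absent right now
    resetAndRestart: () -> Unit,   // delete the cached graph and restart the node
    log: (String) -> Unit,
    maxResets: Int = 1,            // assumption: cap resets to avoid a restart loop
): Boolean {
    var resets = 0
    while (true) {
        val missing = countMissing()
        if (missing == 0) {
            log("Network graph validated: all $trustedPeerCount trusted peers present")
            return true
        }
        log("Network graph missing $missing trusted peers")
        if (resets >= maxResets) return false // give up; graph still stale
        log("Network graph is stale, resetting and restarting...")
        resetAndRestart()
        resets++
    }
}
```

With a stale cache that recovers after one reset, this emits the three lines listed under step 1; with a healthy cache it emits only the line listed under step 2.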