Skip to content

Conversation

@QuintenQVD0
Copy link
Contributor

@QuintenQVD0 QuintenQVD0 commented Nov 16, 2025

This pull request enhances the server transfer functionality by adding robust support for streaming backups and install logs, along with improved checksum verification for both archives and backups. The changes ensure data integrity during transfers and provide better error handling and progress reporting.

Backup and Install Log Streaming:

  • Added new methods StreamBackups and StreamInstallLogs to the Archive type in server/transfer/archive.go, enabling backups and install logs to be streamed as multipart form data during server transfers. Each backup is streamed with its own checksum for integrity verification, and install logs are included if present. [1] [2]

Checksum Verification Improvements:

  • Updated the transfer handler in router/router_transfer.go to verify checksums for both the main archive and each backup file after streaming. The handler now aborts the transfer if any checksum does not match, ensuring data integrity.

Transfer Workflow Enhancements:

  • Modified the transfer process in server/transfer/source.go to stream the main archive, then backups and install logs, each with their own checksum. Progress messages for backup streaming and install logs are sent to the websocket for better visibility. [1] [2] [3]

Error Handling and Logging:

  • Improved error handling for file operations, such as creating backup directories and install log files, ensuring that failures in optional components (like install logs) do not abort the entire transfer. Detailed debug and warning logs are added for better traceability.

Cleanup Logic:

  • Refined the server deletion logic in router/router_server.go to only warn if the install log file removal fails for reasons other than the file not existing, reducing unnecessary warnings.

Summary by CodeRabbit

  • New Features

    • Per-backup streaming with per-backup SHA-256 checksums and verification
    • Streaming of server install logs as part of transfers
    • Option to specify which backups to include in a transfer
  • Bug Fixes

    • Improved handling of missing/unreadable install logs and backups so transfers continue when appropriate
    • Clearer checksum verification failures that abort on mismatch
    • Ignore missing install-log file errors when deleting logs

✏️ Tip: You can customize this high-level summary in your review settings.

@QuintenQVD0 QuintenQVD0 requested a review from a team as a code owner November 16, 2025 18:04
@coderabbitai
Copy link

coderabbitai bot commented Nov 16, 2025

Walkthrough

Adds multipart streaming and verification for server transfers: archive, per-backup files, and install logs are streamed with on-the-fly SHA-256 checksums and verified after receipt. Router delete logic now ignores missing install-log files when removing them.

Changes

Cohort / File(s) Summary
Router: multipart receive & disk streaming
router/router_transfer.go
Replaces exact-name switch with conditional multipart dispatch; streams archive and backup parts to disk while computing per-part SHA-256; stores received checksums; verifies archive and backups after streaming; treats optional parts (install logs, missing backups) non-fatally; adds config/filepath imports.
Router: server deletion tweak
router/router_server.go
deleteServer now ignores ENOENT when removing install-log files and only logs warnings for other removal errors.
Server: archive streaming helpers
server/transfer/archive.go
Adds StreamBackups(ctx, mp) and StreamInstallLogs(ctx, mp) to stream backups and install logs via multipart writer, compute per-backup SHA-256 checksums, skip absent files, and publish progress; adds transfer *Transfer and backupsStreamed int to Archive.
Server: source pipeline & fields
server/transfer/source.go, server/transfer/transfer.go
source.go: switches progress reporting to use Archive, uses a dedicated main archive hasher, calls StreamBackups and StreamInstallLogs, renames/form-field checksum_archive, and updates status/log messages. transfer.go: adds public BackupUUIDs []string field to Transfer.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Router as router_transfer
    participant Archive as Archive (server)
    participant FS as Filesystem

    Client->>Router: multipart POST (archive, checksum_archive, backup_*, checksum_backup_*, install_logs)
    activate Router

    alt archive part
        Router->>Archive: stream archive (tee -> sha256)
        Archive->>FS: write/archive to destination
        Router->>Router: record calculated archive checksum
    end

    alt checksum_archive part
        Client->>Router: send expected archive checksum
        Router->>Router: record expected archive checksum
    end

    loop per backup
        Router->>Archive: stream backup_<id> (write to disk + sha256)
        Archive->>FS: write backup file under server dir
        Router->>Router: record calculated backup checksum
        Client->>Router: send checksum_backup_<id>
        Router->>Router: record expected backup checksum
    end

    alt install_logs part
        Router->>Archive: stream install_logs
        Archive->>FS: write install log file (skip if missing)
    end

    Router->>Router: verify archive & each backup checksums
    alt checksums match
        Router->>Archive: CreateEnvironment()
        Router->>Client: respond success
    else mismatch
        Router->>Client: respond error (checksum mismatch)
    end
    deactivate Router
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review focus:
    • router/router_transfer.go: multipart dispatch, streaming-to-disk, checksum maps, error handling and verification.
    • server/transfer/archive.go: streaming helpers, per-backup checksum computation, progress updates.
    • server/transfer/source.go: integration of streaming steps and status flow.
    • router/router_server.go: ENOENT handling when deleting install logs.

Suggested reviewers

  • parkervcp

Poem

🐰 I hopped through streams both fast and neat,
Sha256 footprints beneath my feet,
Backups nested, bytes in tune,
Install logs tucked by silver moon,
A rabbit sings: checksums all complete! 🎉

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Transfer backups and install logs' directly matches the main objective of the PR, which is to add streaming support for backups and install logs.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/transfer/source.go (2)

70-78: Unbuffered errChan can leak the streaming goroutine on HTTP errors or early returns

errChan is unbuffered and only read after client.Do succeeds and res.StatusCode == http.StatusOK. If client.Do fails, or you return early on a non-200 status, PushArchiveToTarget exits without ever reading from errChan, while the inner goroutine may still try to send an error, blocking forever.

Since only a single error is ever sent, you can fix this safely by making the channel buffered:

-	// Create a new goroutine to write the archive to the pipe used by the
-	// multipart writer.
-	errChan := make(chan error)
+	// Create a new goroutine to write the archive to the pipe used by the
+	// multipart writer. Buffered to avoid blocking if we return early.
+	errChan := make(chan error, 1)

Optionally, you can also rely on the deferred cancel()/cancel2() and explicitly document that the goroutine must never block on sending more than one value, but the buffered channel is the simplest, robust fix.

Also applies to: 150-187


160-173: Multipart error handling logs and wraps the wrong error variable

Inside the select you check err2 := <-errChan, but still log/wrap err (the HTTP request error, which is guaranteed to be nil here):

if err != nil || err2 != nil {
    if err == context.Canceled {
        return nil, err
    }

    t.Log().WithError(err).Debug("failed to send archive to destination")
    return nil, fmt.Errorf("http error: %w, multipart error: %v", err, err2)
}

At this point err is always nil, so you lose the actual multipart error (err2) in logs and in the returned error.

A more accurate handling would be:

-	if err != nil || err2 != nil {
-		if err == context.Canceled {
-			return nil, err
-		}
-
-		t.Log().WithError(err).Debug("failed to send archive to destination")
-		return nil, fmt.Errorf("http error: %w, multipart error: %v", err, err2)
-	}
+	if err2 != nil {
+		t.Log().WithError(err2).Debug("failed to send archive to destination")
+		return nil, fmt.Errorf("multipart error: %w", err2)
+	}

This preserves the real failure cause and avoids misleading logs.

🧹 Nitpick comments (2)
server/transfer/archive.go (2)

37-120: Backup streaming logic is solid; consider optional context-aware cancellation

The backup streaming loop correctly:

  • Filters .tar.gz files,
  • Streams each backup via io.Copy while hashing with sha256,
  • Emits a per-backup checksum field, and
  • Logs/send messages for progress.

Two optional improvements:

  • Honor ctx to allow early aborts during long backup transfers (e.g., check ctx.Err() between files and bail appropriately).
  • Use defer file.Close() inside the per-backup loop for slightly clearer resource handling (current explicit closes are correct but more error-prone to maintain).

Not blocking for merge, but worth considering for large backup sets.


122-162: Install log streaming correctly treated as best-effort; context support is optional

StreamInstallLogs:

  • Uses the same log path convention as the rest of the codebase.
  • Treats missing or unreadable install logs as non-fatal (logs a warning and returns nil), which matches the intended “optional” semantics.
  • Properly closes the file and streams via io.Copy.

If you want to improve it further, you could:

  • Check ctx.Err() before starting the copy to avoid work after cancellation.
  • Optionally log an info message when logs are skipped due to missing file so operators can distinguish “no logs” vs “failure”.

Current behaviour is correct; these are nice-to-have refinements.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96a1251 and 1c25ca0.

📒 Files selected for processing (4)
  • router/router_server.go (1 hunks)
  • router/router_transfer.go (3 hunks)
  • server/transfer/archive.go (3 hunks)
  • server/transfer/source.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
router/router_transfer.go (5)
server/server.go (2)
  • Server (30-81)
  • New (85-108)
router/middleware/middleware.go (1)
  • CaptureAndAbort (60-63)
server/filesystem/filesystem.go (2)
  • New (37-54)
  • Filesystem (24-34)
server/transfer/transfer.go (1)
  • New (59-69)
config/config.go (1)
  • Get (421-429)
server/transfer/archive.go (4)
server/filesystem/archive.go (1)
  • Archive (58-78)
config/config.go (1)
  • Get (421-429)
server/filesystem/filesystem.go (1)
  • New (37-54)
server/transfer/transfer.go (2)
  • New (59-69)
  • Transfer (43-56)
server/transfer/source.go (4)
server/transfer/archive.go (1)
  • Archive (165-169)
server/filesystem/archive.go (1)
  • Archive (58-78)
internal/progress/progress.go (1)
  • Progress (13-21)
server/transfer/transfer.go (1)
  • New (59-69)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.24.7, linux, amd64)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.25.1, linux, amd64)
  • GitHub Check: Analyze (go)
🔇 Additional comments (3)
router/router_server.go (1)

250-255: Install log deletion now correctly treats missing file as non-fatal

Guarding the warning with !os.IsNotExist(err) makes server deletion idempotent and avoids noisy logs when the install log was already removed. No further changes needed here.

router/router_transfer.go (2)

149-199: Archive streaming + checksum verification flow looks good

The new archive / checksum_archive handling:

  • Streams directly into ExtractStreamUnsafe while teeing into a SHA-256 hasher.
  • Stores the calculated checksum and later compares it to the separately streamed checksum_archive field.
  • Fails fast on missing checksum or mismatch, with useful logging.

This gives a clear, single-source-of-truth hash for the main archive and integrates well with the rest of the transfer flow.

Also applies to: 301-320


208-245: Install logs handling correctly treats failures as non-fatal

The new install_logs case:

  • Creates <LogDirectory>/install if needed.
  • Writes logs to <serverID>.log.
  • Logs warnings but does not abort the transfer on directory/file creation or write errors.

This aligns well with the “optional install logs” intent and keeps transfer success dependent only on core data (archive/backups), not ancillary logs.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
router/router_transfer.go (1)

209-246: LGTM: Install logs handled gracefully.

Non-fatal error handling ensures that install log failures don't abort the entire transfer. File handles are closed properly on error paths.

Note: If io.Copy fails (line 233), a partial install log file remains on disk. This is likely acceptable for diagnostic logs, but you could optionally os.Remove(installLogPath) after the installLogFile.Close() on line 234 if you prefer cleanup of incomplete artifacts.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c25ca0 and a07ed0e.

📒 Files selected for processing (1)
  • router/router_transfer.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
router/router_transfer.go (4)
router/middleware/middleware.go (1)
  • CaptureAndAbort (60-63)
server/filesystem/filesystem.go (2)
  • New (37-54)
  • Filesystem (24-34)
internal/ufs/filesystem.go (1)
  • Filesystem (11-168)
config/config.go (1)
  • Get (421-429)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.24.7, linux, amd64)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.25.1, linux, amd64)
🔇 Additional comments (7)
router/router_transfer.go (7)

14-14: LGTM: Imports added for new functionality.

The filepath and config imports support the new backup directory and install log path construction.

Also applies to: 21-21


149-155: Excellent fix: Checksum verification bug resolved.

The split maps (backupChecksumsCalculated vs backupChecksumsReceived) correctly address the critical issue flagged in prior reviews where a single map caused self-comparison, preventing detection of backup corruption or tampering.


175-198: LGTM: Archive streaming with in-flight checksum calculation.

The TeeReader pattern efficiently calculates the SHA256 checksum during extraction without requiring additional passes or buffering.


247-287: LGTM: Backup streaming with proper checksum tracking.

The backup streaming correctly:

  • Calculates checksums in-flight using TeeReader
  • Stores calculated checksums in the dedicated map
  • Closes file handles on both success and error paths
  • Treats backup errors as fatal, appropriately aborting the transfer

The path traversal concern for backupName (line 248) was previously discussed and dismissed since the multipart request originates only from trusted Wings instances.


199-208: LGTM: Checksum reception properly separated.

Archive and backup checksums are correctly stored in dedicated "received" variables/maps, maintaining clear separation from calculated checksums for subsequent verification.

Also applies to: 288-298


302-341: Excellent: Checksum verification logic is now correct.

The verification correctly:

  • Compares archive calculated vs received checksums (lines 310-317)
  • Iterates over calculated backup checksums and verifies each against its received counterpart (lines 323-340)
  • Detects missing checksums for received backups (line 325)
  • Aborts on any mismatch with detailed logging

This implementation properly fixes the critical bug and will now detect corruption or tampering in transferred backups.


343-365: LGTM: Completion logic ensures transfer integrity.

The final checks ensure the archive was received (line 343), environment setup succeeds (line 353), and the success flag is set only when all verification passes. The deferred cleanup function properly handles both success and failure cases.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
server/transfer/archive.go (1)

138-145: Ensure Server.ID() cannot introduce path traversal in backup/log paths

Both backup and install‑log paths are built via:

backupPath := filepath.Join(config.Get().System.BackupDirectory, a.transfer.Server.ID())
installLogPath := filepath.Join(config.Get().System.LogDirectory, "install", a.transfer.Server.ID()+".log")

If Server.ID() is ever derived from untrusted input and allowed to contain path separators or .., this could enable directory traversal outside the intended backup/log roots. If there isn’t already a central guarantee, consider normalizing/validating IDs (e.g., restricting to a safe character set) before using them in filesystem paths.

🧹 Nitpick comments (3)
server/transfer/source.go (2)

78-145: Archive + backup + install‑log streaming pipeline is functionally sound but end‑error handling can be simplified

The main archive pipe + io.TeeReader SHA‑256 hashing, followed by conditional StreamBackups and StreamInstallLogs, all look correct and match the intended multipart shape. A couple of minor cleanups would make this easier to reason about:

  • At the bottom, once client.Do has succeeded, err will always be nil, so in the select block only errChan’s value matters. You can simplify the combination of err/err2 and avoid logging a nil HTTP error.

For example:

-	select {
-	case <-ctx.Done():
-		return nil, ctx.Err()
-	case err2 := <-errChan:
-		t.Log().Debug("stream completed")
-		if err != nil || err2 != nil {
-			if err == context.Canceled {
-				return nil, err
-			}
-
-			t.Log().WithError(err).Debug("failed to send archive to destination")
-			return nil, fmt.Errorf("http error: %w, multipart error: %v", err, err2)
-		}
+	select {
+	case <-ctx.Done():
+		return nil, ctx.Err()
+	case err2 := <-errChan:
+		t.Log().Debug("stream completed")
+		if err2 != nil {
+			t.Log().WithError(err2).Debug("failed to send archive to destination")
+			return nil, fmt.Errorf("multipart error: %w", err2)
+		}
 	}

This keeps responsibility clear: transport errors are handled immediately after client.Do, while multipart/streaming errors flow exclusively through errChan.


140-145: StreamInstallLogs call path is more defensive than needed

Archive.StreamInstallLogs already treats all internal failures as non‑fatal and always returns nil, so this check:

if err := a.StreamInstallLogs(ctx, mp); err != nil {
    errChan <- fmt.Errorf("failed to stream install logs: %w", err)
    return
}

will never fire. You can either (a) drop the if err != nil check and ignore the return value, or (b) have StreamInstallLogs actually propagate errors you want to treat as fatal and keep this branch. Right now it reads as if install‑log failures abort the transfer, but they don’t in practice.

server/transfer/archive.go (1)

37-132: Backup streaming behavior matches intent; consider minor robustness tweaks

The overall flow in StreamBackups looks good:

  • Filters on BackupUUIDs and matches UUID.tar.gz on disk.
  • Skips cleanly when the directory is missing or no matches are found.
  • Streams each file with a per‑backup SHA‑256 checksum field, and emits clear per‑backup progress messages.

A couple of small points you might consider:

  • ctx is currently unused. If you want transfers to react more promptly to cancellation, you could check ctx.Err() or select on ctx.Done() inside the loop and abort with ctx.Err() when canceled.
  • Errors opening or copying a specific backup file are treated as fatal for the whole transfer, whereas missing or unreadable install logs are non‑fatal. That difference may be intentional; if backups are considered “optional” in some flows, you may want to mirror the log behavior (warn and continue) or at least log more context before returning.

None of these are blockers; the correctness of the current implementation is fine.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a07ed0e and 91f17f8.

📒 Files selected for processing (3)
  • server/transfer/archive.go (3 hunks)
  • server/transfer/source.go (3 hunks)
  • server/transfer/transfer.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
server/transfer/archive.go (2)
config/config.go (1)
  • Get (421-429)
server/transfer/transfer.go (1)
  • Transfer (43-60)
server/transfer/source.go (3)
server/transfer/archive.go (1)
  • Archive (177-181)
internal/progress/progress.go (1)
  • Progress (13-21)
server/transfer/transfer.go (1)
  • New (63-73)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.25.1, linux, amd64)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.24.7, linux, amd64)
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
server/transfer/transfer.go (1)

57-59: BackupUUIDs field wiring looks consistent with backup streaming logic

The semantics (“empty means no backups”) are clear and align with the checks in Archive.StreamBackups; no issues from this addition.

server/transfer/source.go (1)

35-53: Periodic upload progress goroutine looks fine

Using a.Progress() with a 5‑second ticker and guarding against nil progress is reasonable, and ctx2/tc.Stop() correctly scope the goroutine to the transfer lifecycle.

server/transfer/archive.go (2)

135-174: Install‑log streaming is correctly non‑fatal and aligns with cleanup behavior

StreamInstallLogs’s behavior—skip when the file doesn’t exist, and warn‑and‑continue on open/create/copy failures—matches the stated goal that install‑log issues should not abort a transfer. The log messages give good traceability, and keeping this logic centralized here works well with the caller’s simple invocation.


176-191: New transfer field on Archive is wired correctly

Adding transfer *Transfer and initializing it in NewArchive cleanly enables Archive methods to reuse logging, messages, and backup selection without changing external callers. No issues here.

Copy link
Contributor

@parkervcp parkervcp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rmartinoscar rmartinoscar linked an issue Dec 2, 2025 that may be closed by this pull request
@parkervcp parkervcp merged commit a4de1af into pelican-dev:main Dec 19, 2025
7 checks passed
@QuintenQVD0 QuintenQVD0 deleted the transfer branch December 26, 2025 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Also transfer (local) backups when transferring a server

2 participants