🤖 feat: SSH connection pool with backoff and singleflighting #922
base: main
cfa5fa8 to f611745
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Prevents thundering herd issues with SSH connections by:
- Adding SSHConnectionPool class with health tracking
- Implementing exponential backoff (1s → 5s → 10s → 20s → 40s → 60s cap)
- Singleflighting concurrent connection attempts to same host
- Probing unknown connections before first use
- Skipping probes for known-healthy connections

Integration points:
- SSHRuntime.exec() and execSSHCommand() call acquireConnection()
- PTYService calls acquireConnection() before spawning SSH terminals

_Generated with mux_
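The backoff schedule listed in the commit message is not a pure doubling sequence (it jumps from 1s to 5s before doubling), so one way to model it is a lookup table. The following is a minimal sketch, not the PR's code: the constant and the `backoffSeconds` helper are hypothetical names, and only the delay values come from the commit message.

```typescript
// Delay values quoted in the commit message, in seconds. The last entry is the cap.
const BACKOFF_SCHEDULE_SECONDS: readonly number[] = [1, 5, 10, 20, 40, 60];

// Hypothetical helper (not from the PR): how long to wait before the next
// connection attempt, given the number of consecutive failures so far.
function backoffSeconds(consecutiveFailures: number): number {
  // No failures recorded means no backoff is needed.
  if (consecutiveFailures <= 0) return 0;
  // Clamp to the last entry so the delay never exceeds the 60s cap.
  const index = Math.min(consecutiveFailures, BACKOFF_SCHEDULE_SECONDS.length) - 1;
  return BACKOFF_SCHEDULE_SECONDS[index];
}
```

Under this sketch, the second consecutive failure yields a 5s delay, and any failure count of six or more stays at 60s.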
f611745 to 53de00a
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
} else if (runtime instanceof SSHRuntime) {
  // SSH: Use node-pty to spawn SSH with local PTY (enables resize support)
  const sshConfig = runtime.getConfig();

  // Ensure connection is healthy before spawning terminal
  // This provides backoff protection and singleflighting for concurrent requests
  await sshConnectionPool.acquireConnection(sshConfig);

  const sshArgs = buildSSHArgs(sshConfig, workspacePath);

  log.info(`[PTY] SSH terminal for ${sessionId}: ssh ${sshArgs.join(" ")}`);
Terminal failures never update SSH pool health
This block now marks the SSH target as healthy via sshConnectionPool.acquireConnection before spawning a terminal, but the PTY path never reports subsequent SSH failures back to the pool. If a host was previously healthy and later goes down, acquireConnection will keep fast-pathing because the cached status remains healthy, and the on-exit handler does not call reportFailure when ssh exits with code 255 or fails to spawn. As a result, terminal creation will loop without backoff or reprobe, defeating the new thundering-herd protection for PTY sessions once a healthy host becomes unreachable.
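One possible way to address the comment above is to route the terminal's exit code back to the pool. The sketch below is illustrative only and is not taken from the PR: `SSHTarget`, `FailureReporter`, and `reportIfConnectionFailure` are hypothetical names, and the `reportFailure` signature is an assumption (the review names the method but not its parameters). It relies on the documented behavior that `ssh` exits with 255 when an error occurred; a remote command can also exit with 255, so a real handler may need extra signals.

```typescript
// Hypothetical shape of an SSH target; the real config type is not shown in the PR excerpt.
interface SSHTarget {
  host: string;
}

// Assumed minimal view of the pool: only reportFailure is named in the review,
// and its parameters here are a guess.
interface FailureReporter {
  reportFailure(target: SSHTarget, reason: string): void;
}

// ssh(1) exits with 255 if an error occurred (otherwise with the remote command's status).
const SSH_ERROR_EXIT_CODE = 255;

// Intended to be called from the PTY on-exit handler.
// Returns true when the exit was treated as a connection failure and reported.
function reportIfConnectionFailure(
  pool: FailureReporter,
  target: SSHTarget,
  exitCode: number,
): boolean {
  if (exitCode !== SSH_ERROR_EXIT_CODE) return false;
  pool.reportFailure(target, `ssh exited with code ${exitCode}`);
  return true;
}
```

With a hook like this, a host that goes down after being marked healthy would lose its cached healthy status on the next failed terminal, so later `acquireConnection` calls could back off or reprobe instead of fast-pathing.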
Prevents thundering herd issues with SSH connections by adding health tracking, exponential backoff, and singleflighting to the connection pool.
Changes
SSHConnectionPool class with:
Integration points:
- `SSHRuntime.exec()` and `execSSHCommand()` call `acquireConnection()`
- `PTYService` calls `acquireConnection()` before spawning SSH terminals

Flow

Generated with mux
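Singleflighting, as described in this PR, means that concurrent connection attempts to the same host share a single in-flight operation instead of each starting their own. The following is a generic sketch of that technique under stated assumptions: the `SingleFlight` class and its `run` method are hypothetical and do not reflect how `SSHConnectionPool` is actually implemented.

```typescript
// Hypothetical helper (not from the PR): callers that pass the same key while a
// task is still running all receive the same promise.
class SingleFlight<T> {
  private readonly inFlight = new Map<string, Promise<T>>();

  run(key: string, task: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing !== undefined) return existing;

    // Wrap the task so the entry is removed once it settles, whether it
    // resolves or rejects. Later callers then start a fresh attempt.
    const promise = (async () => {
      try {
        return await task();
      } finally {
        this.inFlight.delete(key);
      }
    })();

    this.inFlight.set(key, promise);
    return promise;
  }
}
```

A pool could key such a helper by host so that a burst of `acquireConnection()` calls triggers only one probe, with every caller observing the same success or failure.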