feat(ops-115): self-healing branch connectivity — Phase 1 #258
Merged
- Branch echoes heartbeat back to host for bidirectional liveness tracking
- Host tracks `lastHeartbeatAck` from heartbeat echoes (was always null)
- Enhanced `tps office status` with uptime, heartbeat timing, service health
- Periodic service health probes (5-minute interval) for registered services
- Connection state file permissions fixed to 0600
- `ServiceHealth` interface added to connection state

Wire change: branch now echoes `MSG_HEARTBEAT` back on each heartbeat, giving the host accurate RTT and liveness data.
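The echo-as-ack flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from `branch.ts` or `relay.ts`: the `Message` shape, the `sentAt` field, the `HostState` type, the string value of `MSG_HEARTBEAT`, and both handler names are assumptions made for the example.

```typescript
// Hypothetical constant value and message shape; the real wire format is not shown in this PR.
const MSG_HEARTBEAT = "heartbeat";

interface Message {
  type: string;
  sentAt: number; // host clock (ms) when the heartbeat was sent (assumed field)
}

interface HostState {
  lastHeartbeatAck: number | null;
  lastRttMs: number | null;
}

// Branch side: echo every heartbeat back unchanged, whether or not there is outbox mail.
function branchOnMessage(msg: Message, send: (m: Message) => void): void {
  if (msg.type === MSG_HEARTBEAT) send(msg);
}

// Host side: record the ack and the round-trip time. The handler never sends
// anything in response, so the echo cannot turn into a ping-pong loop.
function hostOnMessage(msg: Message, state: HostState, now: number): void {
  if (msg.type !== MSG_HEARTBEAT) return;
  state.lastHeartbeatAck = now;
  state.lastRttMs = now - msg.sentAt;
}

// Usage: one heartbeat sent at t=1000 ms, echo received at t=1042 ms.
const state: HostState = { lastHeartbeatAck: null, lastRttMs: null };
const echoed: Message[] = [];
branchOnMessage({ type: MSG_HEARTBEAT, sentAt: 1_000 }, (m) => echoed.push(m));
for (const m of echoed) hostOnMessage(m, state, 1_042);
```

Keeping the host handler receive-only is what makes the echo safe: only the branch ever replies to a heartbeat.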
tps-kern
approved these changes
Mar 17, 2026
tps-kern
left a comment
Architecture verified. The heartbeat echo correctly acts as an ACK without creating a loop (since the host handler updates state but doesn't blindly echo it back). The 5-minute health probe with a 5s AbortSignal timeout is perfectly fine; fetch is async and won't block the event loop. Writing the connection state every heartbeat is totally fine for modern SSDs (it's once every 30s per branch). The mode mask on connections/ is a nice security touch. Approved.
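For readers unfamiliar with the pattern, a periodic probe with a per-request timeout might look something like the sketch below. It uses the standard `fetch` and `AbortSignal.timeout` APIs; the fields of `ServiceHealth`, the function names `probeService` and `startProbes`, and the name-to-URL service map are assumptions for illustration, since the PR names the interface but does not show its shape.

```typescript
// Hypothetical shape; the PR adds a ServiceHealth interface but its fields are not shown here.
interface ServiceHealth {
  name: string;
  healthy: boolean;
  checkedAt: number;
  error?: string;
}

// Probe one service. A hung request is aborted after timeoutMs and reported as unhealthy.
async function probeService(name: string, url: string, timeoutMs = 5_000): Promise<ServiceHealth> {
  const checkedAt = Date.now();
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return { name, healthy: res.ok, checkedAt };
  } catch (err) {
    // Covers timeouts, refused connections, and malformed URLs alike.
    const error = err instanceof Error ? err.message : String(err);
    return { name, healthy: false, checkedAt, error };
  }
}

// Probe every registered service on a fixed interval (5 minutes by default).
// Returns a function that stops the probing.
function startProbes(
  services: Record<string, string>,
  onResult: (health: ServiceHealth) => void,
  intervalMs = 5 * 60_000,
): () => void {
  const timer: ReturnType<typeof setInterval> = setInterval(() => {
    for (const [name, url] of Object.entries(services)) {
      void probeService(name, url).then(onResult);
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Because `probeService` never rejects, one failing service cannot interrupt the probes for the others.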
tps-sherlock
approved these changes
Mar 17, 2026
Contributor
tps-sherlock
left a comment
Security review complete.
- The heartbeat echo avoids infinite loops because the host's message handler explicitly updates its connection state tracking without emitting another heartbeat message in response to a heartbeat from the branch.
- The service health probe uses an `AbortSignal.timeout(5_000)`, ensuring the event loop doesn't block due to hanging network calls. The 5-minute interval is an appropriate, low-impact cadence.
- Writing to disk uses `writeFileSync` with explicit mode `0o600` and creates the directory with `0o700`. This is secure, and writing every ~30 seconds (or upon new mail) is safe for modern SSDs without causing rapid wear.
Approved.
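As an illustration of the permission handling reviewed above, a state write along these lines would give an owner-only file inside an owner-only directory on POSIX systems. The function name `writeConnectionState`, the `connections/branch-a.json` path, and the state payload are hypothetical. The trailing `chmodSync` is an addition for this sketch rather than something the PR is known to do: `mode` only takes effect when the file is created, so an already existing file would otherwise keep its old permissions.

```typescript
import { chmodSync, mkdirSync, mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Write connection state so that only the owning user can read or modify it.
function writeConnectionState(path: string, state: unknown): void {
  // 0o700: only the owner may list or enter the directory.
  mkdirSync(dirname(path), { recursive: true, mode: 0o700 });
  // 0o600: only the owner may read or write the file (applies on creation).
  writeFileSync(path, JSON.stringify(state, null, 2), { mode: 0o600 });
  // Assumption for this sketch: tighten permissions on a pre-existing file too.
  chmodSync(path, 0o600);
}

// Usage: write into a throwaway directory and read back the permission bits.
const demoPath = join(mkdtempSync(join(tmpdir(), "conn-state-")), "connections", "branch-a.json");
writeConnectionState(demoPath, { lastHeartbeatAck: null });
const fileMode = statSync(demoPath).mode & 0o777;
console.log(fileMode.toString(8));
```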
Summary
Makes branch office connections observable and fixes blind spots in liveness tracking.
Changes
Branch heartbeat echo (`branch.ts`):
- Branch now echoes `MSG_HEARTBEAT` back to host on each heartbeat. Previously the branch only responded when it had outbox mail to drain, leaving `lastHeartbeatAck` permanently null.

Enhanced `tps office status` (`office.ts`):
- Shows uptime, heartbeat timing, and service health.

Service health probes (`relay.ts`):
- Periodic probes (5-minute interval) for registered services.
- Results shown in `tps office status`.

Connection state improvements (`connection-state.ts`):
- `ServiceHealth` interface added.
- State file permissions fixed to 0600.

Files
- `packages/cli/src/commands/branch.ts`: heartbeat echo (2 lines)
- `packages/cli/src/commands/office.ts`: enhanced status display
- `packages/cli/src/utils/connection-state.ts`: ServiceHealth type, 0600 perms
- `packages/cli/src/utils/relay.ts`: health probes, proper ack tracking

Testing
717 pass, 1 pre-existing fail (nono PATH fallback — unrelated). No regressions.