A Kubernetes controller written in Go that automatically manages updates for Team Fortress 2 (TF2) game servers running in a cluster. The controller monitors for game updates using SteamCMD, applies updates when available, and orchestrates pod restarts to ensure servers run the latest version.
The UpdateController is a Kubernetes-native controller that solves the challenge of keeping game servers up-to-date in a containerized environment. It leverages the ghcr.io/udl-tf/tf2-image container which includes SteamCMD for update management.
- Monitor: Periodically check for TF2 server updates using SteamCMD
- Update: Download and apply updates to mounted game directories
- Restart: Intelligently restart affected Kubernetes workloads (Deployments, StatefulSets, etc.)
- Handle Errors: Gracefully manage update failures with retry logic
graph TB
subgraph "Kubernetes Cluster"
UC[UpdateController<br/>Go Service]
K8S_API[Kubernetes API]
subgraph "Game Server Workloads"
TF2_POD1[TF2 Server Pod 1]
TF2_POD2[TF2 Server Pod 2]
TF2_POD3[TF2 Server Pod 3]
end
PVC[Persistent Volume<br/>Game Files]
end
STEAM[Steam CDN<br/>Update Server]
UC -->|Check Updates| STEAM
UC -->|Read/Write| PVC
UC -->|List/Restart Pods| K8S_API
K8S_API -->|Restart| TF2_POD1
K8S_API -->|Restart| TF2_POD2
K8S_API -->|Restart| TF2_POD3
TF2_POD1 -.->|Mount| PVC
TF2_POD2 -.->|Mount| PVC
TF2_POD3 -.->|Mount| PVC
style UC fill:#4a90e2,stroke:#2e5c8a,color:#fff
style STEAM fill:#1b2838,stroke:#000,color:#fff
style PVC fill:#326ce5,stroke:#1a4d99,color:#fff
sequenceDiagram
autonumber
participant UC as UpdateController
participant SC as SteamCMD
participant PV as Persistent Volume
participant K8S as Kubernetes API
participant POD as TF2 Server Pods
UC->>UC: Start Periodic Check<br/>(Configurable Interval)
alt Game Not Installed
UC->>PV: Check Game Directory
PV-->>UC: Not Found
UC->>UC: Flag as Update Needed<br/>(Initial Installation)
else Game Installed
UC->>PV: Read Local Manifest<br/>(appmanifest_*.acf)
PV-->>UC: Installed Build ID
UC->>SC: Query App Info (app_info_print)
SC-->>UC: Latest Build ID
UC->>UC: Compare Build IDs
end
alt Update/Install Needed
UC->>SC: Execute Update Script
SC->>PV: Download & Apply Update
alt Update Success
PV-->>SC: Update Complete
SC-->>UC: Success
UC->>SC: Validate Installation
SC->>PV: Verify Game Files
PV-->>SC: Validation Complete
SC-->>UC: Validation Success
UC->>K8S: Query Pods by Selector
K8S-->>UC: Return Matching Pods
UC->>UC: Determine Pod Owners
UC->>K8S: Restart Workloads<br/>(Deployments/StatefulSets/etc.)
K8S->>POD: Rolling Restart
POD->>PV: Mount Updated Files
UC->>UC: Log Success & Reset Retry Count
else 0x6 Error Detected
SC-->>UC: State 0x6 Error
UC->>UC: Detect 0x6 Error Pattern
UC->>PV: Clear steamapps Directory
PV-->>UC: Cleared
UC->>SC: Retry Update Script
SC->>PV: Download & Apply Update
alt Retry Success
PV-->>SC: Update Complete
SC-->>UC: Success
UC->>K8S: Restart Workloads
else Retry Failed
SC-->>UC: Failure
UC->>UC: Log Error & Increment Retry
UC->>UC: Wait Before Next Retry
end
else Other Update Failure
SC-->>UC: Failure
UC->>UC: Log Error & Increment Retry
UC->>UC: Wait Before Retry
end
else Already Up-to-Date
UC->>UC: Continue Monitoring
end
stateDiagram-v2
[*] --> Idle
Idle --> CheckingInstallation: Check Timer Triggered
CheckingInstallation --> InitialInstall: Game Not Installed
CheckingInstallation --> ComparingBuildIDs: Game Installed
InitialInstall --> Downloading: Start Initial Install
ComparingBuildIDs --> ReadLocalManifest: Get Installed Build ID
ReadLocalManifest --> QuerySteamAPI: Get Latest Build ID
QuerySteamAPI --> UpdateAvailable: Build IDs Differ
QuerySteamAPI --> Idle: Build IDs Match (Up-to-Date)
UpdateAvailable --> Downloading: Start Update
Downloading --> Installing: Download Complete
Downloading --> Error0x6Detected: State 0x6 Error
Downloading --> Failed: Other Download Error
Error0x6Detected --> ClearingSteamApps: Remove steamapps Directory
ClearingSteamApps --> Downloading: Retry After Cleanup
Installing --> Validating: Install Complete
Installing --> Error0x6Detected: State 0x6 Error
Installing --> Failed: Install Error
Validating --> DeterminingOwners: Validation Success
Validating --> Failed: Validation Error
DeterminingOwners --> RestartingWorkloads: Find Pod Owners
RestartingWorkloads --> Success: All Workloads Restarted
RestartingWorkloads --> Failed: Restart Error
Success --> Idle: Wait for Next Check
Failed --> Retry: Retry Count < Max
Failed --> Idle: Max Retries Exceeded
Retry --> Downloading: Retry Update
- Automatic Update Detection: Leverages SteamCMD to detect when TF2 updates are available using build ID comparison (no unnecessary downloads)
- Initial Installation Support: Automatically detects and performs initial game installation if not present
- Build ID Tracking: Compares local manifest build IDs with Steam's latest build IDs for efficient update detection
- 0x6 Error Recovery: Automatic detection and recovery from Steam's 0x6 state errors by clearing and retrying
- Smart Pod Selection: Restart pods based on:
- Label selectors (e.g.,
app=tf2-server) - Workload ownership detection
- Label selectors (e.g.,
- Multiple Workload Support: Handles Deployments, StatefulSets, DaemonSets, and ReplicaSets
- Error Handling: Configurable retry logic with exponential backoff
- Update Validation: Verifies update success before restarting pods
- Zero-Downtime Updates: Utilizes Kubernetes rolling restart mechanisms
- Observability: Structured logging with klog for detailed operation tracking
- Kubernetes cluster (v1.25+)
- Go 1.25+ (for development)
- Access to ghcr.io/udl-tf/tf2-image
- Persistent Volume for game files
- RBAC permissions for pod/deployment management
# Install the UpdateController from OCI registry
helm install update-controller oci://ghcr.io/udl-tf/helm/update-controller \
--namespace game-servers \
--create-namespace \
--set image.tag=latest \
--set config.checkInterval=30m
# Or specify a version
helm install update-controller oci://ghcr.io/udl-tf/helm/update-controller \
--version 0.1.0 \
--namespace game-servers \
--create-namespace# Apply the controller manifest
kubectl apply -f https://raw.githubusercontent.com/UDL-TF/UpdateController/main/deploy/controller.yaml# Clone the repository
git clone https://github.com/UDL-TF/UpdateController.git
cd UpdateController
# Build the controller
go build -o update-controller ./cmd/controller
# Run locally (for development)
./update-controller --kubeconfig=$HOME/.kube/config| Variable | Description | Default | Required |
|---|---|---|---|
CHECK_INTERVAL |
Interval between update checks | 30m |
No |
STEAMCMD_PATH |
Path to SteamCMD executable | /home/steam/steamcmd |
No |
STEAMAPP |
Steam app name (TF2) | tf |
No |
STEAMAPPID |
Steam app ID | 232250 |
No |
GAME_MOUNT_PATH |
Path where game files are mounted | /tf |
No |
UPDATE_SCRIPT |
Name of the update script | tf_update.txt |
No |
POD_SELECTOR |
Label selector for TF2 pods | app=tf2-server |
Yes |
MAX_RETRIES |
Maximum update retry attempts | 3 |
No |
RETRY_DELAY |
Delay between retries | 5m |
No |
NAMESPACE |
Kubernetes namespace to watch | default |
No |
The controller requires the following permissions:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: update-controller
rules:
- apiGroups: ['']
resources: ['pods']
verbs: ['get', 'list', 'watch']
- apiGroups: ['apps']
resources: ['deployments', 'statefulsets', 'daemonsets', 'replicasets']
verbs: ['get', 'list', 'patch']
- apiGroups: ['']
resources: ['persistentvolumeclaims']
verbs: ['get', 'list']UpdateController/
├── cmd/
│ └── controller/ # Main controller application
│ └── main.go
├── internal/
│ ├── controller/ # Controller logic
│ │ ├── update.go # Update check & apply
│ │ ├── restart.go # Pod restart logic
│ │ └── config.go # Configuration
│ ├── steamcmd/ # SteamCMD integration
│ │ └── client.go
│ └── k8s/ # Kubernetes client wrappers
│ └── client.go
├── deploy/ # Kubernetes manifests
│ ├── controller.yaml
│ └── rbac.yaml
├── Dockerfile
├── go.mod
├── go.sum
└── README.md
# Build for your platform
go build -o update-controller ./cmd/controller
# Build Docker image
docker build -t ghcr.io/udl-tf/update-controller:latest .
# Run tests
go test ./...
# Run with race detection
go test -race ./...# Install dependencies
go mod download
# Run controller locally
go run ./cmd/controller --kubeconfig=$HOME/.kube/config
# Enable debug logging
export LOG_LEVEL=debug
go run ./cmd/controllerSee LICENSE file for details.
- TF2Image - The base TF2 server image
- Kubernetes client-go - Kubernetes Go client library