@ahundt ahundt commented Dec 29, 2025

@bra1nDump, here is the merged pull request for the CLI that we discussed. It includes the various crash bug fixes and the backend changes needed for tmux to work.

ahundt and others added 30 commits November 15, 2025 01:57
- Add comprehensive tmux utilities (adapted from Python reference)
- Support tmux session spawning when TMUX_SESSION_NAME is set in profiles
- Automatically detect tmux availability and fall back to regular spawning
- Create new windows in existing tmux sessions with descriptive names
- Include proper environment variable injection for both tmux and regular spawning
- Update TrackedSession interface to support tmux session tracking
- Add Andrew Hundt copyright to tmux utilities

This enables users to see their Claude/Codex sessions directly in tmux terminals
when tmux session names are configured in their profiles, while maintaining
full backward compatibility when tmux is not available.

Files affected:
- src/utils/tmux.ts: Complete tmux utilities with session management
- src/daemon/run.ts: Integrated tmux spawning with fallback logic
- src/daemon/types.ts: Added tmuxSessionId to TrackedSession
- src/modules/common/registerCommonHandlers.ts: Added tmux environment variables
- src/persistence.ts: Profile persistence with tmux settings
…ux utilities

Summary:
- Integrated unified AI backend profile system from happy app with daemon session spawning
- Enhanced tmux utilities with comprehensive TypeScript typing and session identifier validation
- Replaced manual environment variable filtering with profile-based agent compatibility system

Previous behavior:
- Manual environment variable filtering based on agent type with hardcoded variable lists
- Basic tmux utilities with limited type safety and session management
- Flat profile configuration structure in persistence layer
- No validation for tmux session identifiers or window operations

What changed:
- src/daemon/run.ts: Replaced manual environment filtering with getProfileEnvironmentVariablesForAgent() using profile compatibility validation, updated tmux session name handling to use profile-based environment variables
- src/persistence.ts: Updated AIBackendProfile interface to match happy app schema with nested agent configurations, added validateProfileForAgent() and getProfileEnvironmentVariables() helper functions, implemented profile versioning system
- src/utils/tmux.ts: Added comprehensive TypeScript typing with TmuxSessionIdentifier interface, TmuxControlSequence and TmuxWindowOperation union types, implemented parseTmuxSessionIdentifier() and formatTmuxSessionIdentifier() for validation, enhanced TmuxUtilities class with strongly typed methods
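
A minimal sketch of how session identifier parsing and formatting could work, assuming identifiers follow the tmux target form session[:window[.pane]]. The field names, the name pattern, and the error messages are assumptions made for illustration; the actual code in src/utils/tmux.ts may differ.

```typescript
// Hypothetical shape; the real TmuxSessionIdentifier may carry other fields.
interface TmuxSessionIdentifier {
  session: string;
  window?: string;
  pane?: string;
}

// Assumed rule: names use letters, digits, underscore, and hyphen only.
const TMUX_NAME_PATTERN = /^[A-Za-z0-9_-]+$/;

function parseTmuxSessionIdentifier(identifier: string): TmuxSessionIdentifier {
  const colon = identifier.indexOf(':');
  const session = colon === -1 ? identifier : identifier.slice(0, colon);
  if (!TMUX_NAME_PATTERN.test(session)) {
    throw new Error('Invalid tmux session name: "' + session + '"');
  }
  if (colon === -1) return { session };

  const rest = identifier.slice(colon + 1);
  const dot = rest.indexOf('.');
  const window = dot === -1 ? rest : rest.slice(0, dot);
  if (!TMUX_NAME_PATTERN.test(window)) {
    throw new Error('Invalid tmux window name: "' + window + '"');
  }
  if (dot === -1) return { session, window };

  const pane = rest.slice(dot + 1);
  if (!/^\d+$/.test(pane)) {
    throw new Error('Invalid tmux pane index: "' + pane + '"');
  }
  return { session, window, pane };
}

function formatTmuxSessionIdentifier(id: TmuxSessionIdentifier): string {
  let formatted = id.session;
  if (id.window !== undefined) {
    formatted += ':' + id.window;
    // A pane is only meaningful when a window is present.
    if (id.pane !== undefined) formatted += '.' + id.pane;
  }
  return formatted;
}
```

Under these assumptions, parsing "work:editor.1" and formatting the result returns the same string, while an identifier containing a space is rejected before it reaches any tmux command.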

Why:
- Enable seamless profile synchronization between happy app and happy-cli
- Provide type-safe tmux session management with automatic session/window parsing
- Add robust validation for tmux operations to prevent runtime errors
- Support multiple AI providers through unified profile system
- Improve developer experience with comprehensive TypeScript types and error handling

Files affected:
- src/daemon/run.ts: Profile integration in session spawning
- src/persistence.ts: Unified profile schema and helper functions
- src/utils/tmux.ts: Enhanced tmux utilities with TypeScript typing

Testable:
- Spawn sessions with different AI provider profiles and verify correct environment variables
- Test tmux session creation with automatic session identifier parsing and validation
- Verify profile compatibility validation prevents incompatible agent configurations
- Test enhanced tmux utilities with strong typing and error handling

CRITICAL FIX - CLI now uses the exact same Zod validation as the GUI for data integrity

Major schema fixes:
- Replaced plain TypeScript interfaces with Zod schemas matching GUI exactly
- Added runtime validation for all profile data using validateProfile() function
- Ensured identical field validation (UUIDs, string lengths, regex patterns)
- Added proper environment variable name validation with regex
- Implemented same default values and field requirements as GUI

Schema Consistency Achieved:
- AIBackendProfileSchema now identical between GUI and CLI
- All configuration schemas (Anthropic, OpenAI, Azure, TogetherAI) match exactly
- Environment variable validation with regex pattern matching
- Profile compatibility schema with proper boolean defaults
- Same version validation and compatibility checking functions

Data Integrity Impact:
- PREVENTS invalid profile data from corrupting CLI operations
- ENSURES consistent validation across GUI and CLI boundaries
- GUARDS against malformed profiles breaking session creation
- MAINTAINS type safety throughout the entire system

Validation Implementation:
- Added validateProfile() function using Zod safeParse with proper error reporting
- Updated updateProfiles() to validate all incoming profile data
- Preserved all existing CLI functionality while adding safety
- Backward compatible with existing valid profile data

This addresses a critical data integrity risk where the CLI could accept invalid
profile data that would cause session creation failures or system instability.
…atibility

SCHEMA MIGRATION:
- Add SUPPORTED_SCHEMA_VERSION = 2 constant
- Add schemaVersion field to Settings interface
- Implement automatic v1 → v2 migration on settings load
- Migration adds profiles[] and localEnvironmentVariables{}

VALIDATION:
- Validate profiles using AIBackendProfileSchema (Zod) on load
- Skip invalid profiles with warning (don't crash)
- Log clear error messages for debugging
- Merge defaults for any missing fields

VERSION DETECTION:
- Warn when settings schema newer than supported version
- Clear upgrade messages logged for users
- Ensure schema version always written on save
- Compatible with GUI schema version system

ERROR HANDLING:
- Graceful profile validation with try-catch per profile
- Invalid profiles are skipped instead of crashing the settings load
- Settings file corruption handled with fallback to defaults
- All errors logged with context

BACKWARDS COMPATIBILITY:
- Old settings (v1) automatically migrated to v2
- Missing fields filled with defaults
- Unknown fields preserved (forward compatibility)
- Zero breaking changes to existing functionality

FILES MODIFIED:
- src/persistence.ts:
  * Added SUPPORTED_SCHEMA_VERSION constant
  * Updated Settings interface with schemaVersion
  * Added migrateSettings() function
  * Enhanced readSettings() with migration and validation
  * Updated writeSettings() to ensure schema version
  * Added logger import for warnings

TECHNICAL DETAILS:
- Uses Zod AIBackendProfileSchema for validation
- Per-profile validation loop with error recovery
- Schema version defaults to 1 for old files
- Migration is idempotent and safe to re-run
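
A rough sketch of the migration and the per-profile validation loop described above. The simplified SketchSettings and SketchProfile shapes and the hand-written isValidProfile guard are stand-ins invented for this example; the commit validates with the Zod AIBackendProfileSchema instead.

```typescript
const SUPPORTED_SCHEMA_VERSION = 2;

// Simplified, hypothetical shapes for illustration only.
interface SketchSettings {
  schemaVersion?: number;
  profiles?: unknown[];
  localEnvironmentVariables?: Record<string, string>;
  [key: string]: unknown; // unknown fields are preserved
}

interface SketchProfile {
  id: string;
  name: string;
}

// v1 -> v2: add profiles[] and localEnvironmentVariables{}.
// Files without a schemaVersion are treated as v1.
function migrateSettings(raw: SketchSettings): SketchSettings {
  const version = raw.schemaVersion ?? 1;
  if (version >= SUPPORTED_SCHEMA_VERSION) return raw; // already migrated
  return {
    ...raw,
    profiles: raw.profiles ?? [],
    localEnvironmentVariables: raw.localEnvironmentVariables ?? {},
    schemaVersion: SUPPORTED_SCHEMA_VERSION,
  };
}

// Stand-in for schema validation (the commit uses Zod here).
function isValidProfile(candidate: unknown): candidate is SketchProfile {
  if (typeof candidate !== 'object' || candidate === null) return false;
  const record = candidate as Record<string, unknown>;
  const profileName = record['name'];
  return typeof record['id'] === 'string'
    && typeof profileName === 'string'
    && profileName.length > 0;
}

// Validate each profile separately so one bad entry cannot break the rest.
function loadValidProfiles(
  settings: SketchSettings,
  warn: (message: string) => void,
): SketchProfile[] {
  const valid: SketchProfile[] = [];
  for (const candidate of settings.profiles ?? []) {
    if (isValidProfile(candidate)) {
      valid.push(candidate);
    } else {
      warn('Skipping invalid profile: ' + JSON.stringify(candidate));
    }
  }
  return valid;
}
```

Running migrateSettings on its own output returns the input unchanged, which is the idempotence property listed in the technical details.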

Tested scenarios:
- ✅ V1 → V2 migration with profile array creation
- ✅ Invalid profile graceful handling with warnings
- ✅ Schema version mismatch warnings
- ✅ Backwards compatibility with v1 settings
- ✅ writeSettings always includes schema version

Related to yolo-mode-persistence and profile management feature
Coordinated with GUI schema version system (both use v2)

Previous behavior:
- README.md contained 83 lines of developer-only documentation (lines 49-131)
- Development setup instructions buried in user-facing README
- USAGE.md existed but not discoverable from README
- Mixed audience (end users + developers) in single file

What changed:
- USAGE.md → CONTRIBUTING.md: Renamed for conventional structure
- CONTRIBUTING.md: Now titled "Contributing to Happy CLI"
- README.md: Removed 83-line development section
- README.md: Added concise "Contributing" section linking to CONTRIBUTING.md
- README.md: Now focused solely on end-user installation and usage

Why:
- Follows standard open-source convention (CONTRIBUTING.md)
- Clearer separation: README for users, CONTRIBUTING for developers
- More discoverable (CONTRIBUTING.md is GitHub convention)
- Reduces cognitive load for new users
- README now concise at 60 lines (was 143 lines)

Files affected:
- README.md: Removed development section, added Contributing link (saved 80 lines)
- USAGE.md → CONTRIBUTING.md: Renamed with updated title
- No content changes to development docs (only reorganized)

Testable:
- README.md is now 60 lines (was 143)
- README.md contains link to CONTRIBUTING.md
- CONTRIBUTING.md has same content as old USAGE.md
- GitHub automatically shows CONTRIBUTING.md in contributor guidelines
…entVariables

Previous behavior:
- Daemon ignored options.environmentVariables sent from GUI (lines 270-312)
- Always used CLI local settings.activeProfileId regardless of GUI selection
- GUI profile selection in wizard was completely non-functional
- User selected DeepSeek profile in GUI but daemon used CLI default profile
- Silent failure - no error message or logging to indicate wrong profile active
- Wrong precedence: { ...profileEnvVars, ...extraEnv } meant auth overwrote profile settings

What changed:
- src/daemon/run.ts (lines 269-327): Restructured environment variable handling
  - Split into authEnv and profileEnv for clear precedence
  - Check options.environmentVariables first (GUI profile)
  - Fallback to CLI local settings.activeProfileId only if no GUI profile
  - Final merge: { ...profileEnv, ...authEnv } protects auth tokens
  - Added extensive logging: info for GUI profile, debug for CLI profile, debug for final keys
  - Log keys only (not values) for security
- src/persistence.ts (line 287): Changed logger.error to logger.warn (Logger has no error method)

Why:
- Fixes critical UX bug where GUI profile selection was silently ignored
- Implements correct precedence: GUI profile > CLI local profile > none
- Protects authentication tokens from being overridden by profile settings
- Enables debugging via logs to verify which profile source is active
- Backward compatible: CLI-only workflows without GUI still use local activeProfileId
- Forward compatible: GUI profiles work when provided

Technical details:
- options.environmentVariables is optional field from SpawnSessionOptions (registerCommonHandlers.ts:124)
- GUI sends it via machineSpawnNewSession in ops.ts:183
- Daemon receives it but was ignoring it until this fix
- Auth env vars (CLAUDE_CODE_OAUTH_TOKEN, CODEX_HOME) must never be overridden by profiles
- Profile env vars can customize API endpoints (ANTHROPIC_BASE_URL) and tmux settings (TMUX_SESSION_NAME)
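
The precedence rules above can be sketched as a small pure function. The helper name resolveSessionEnv and its return shape are hypothetical; only the ordering (GUI profile, then CLI local profile, then none, with auth variables merged last) comes from this commit message.

```typescript
type EnvMap = Record<string, string>;

// Hypothetical helper: pick the profile source, then let auth values win.
function resolveSessionEnv(
  guiProfileEnv: EnvMap | undefined,
  cliProfileEnv: EnvMap | undefined,
  authEnv: EnvMap,
): { env: EnvMap; source: 'gui' | 'cli' | 'none' } {
  const source = guiProfileEnv !== undefined
    ? 'gui'
    : cliProfileEnv !== undefined ? 'cli' : 'none';
  const profileEnv = guiProfileEnv ?? cliProfileEnv ?? {};
  // Later spreads override earlier ones, so authEnv protects auth tokens.
  return { env: { ...profileEnv, ...authEnv }, source };
}

// Log keys only, never values, so secrets stay out of the logs.
function describeEnvKeys(env: EnvMap): string {
  const keys = Object.keys(env);
  return 'Final environment variable keys (' + keys.length + '): ' + keys.join(', ');
}
```

With this ordering, a profile that sets CLAUDE_CODE_OAUTH_TOKEN cannot replace the token supplied by the daemon's auth environment.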

Files affected:
- src/daemon/run.ts: Restructured env var handling (lines 269-327, +19 lines, restructured logic)
- src/persistence.ts: Fixed logger call (line 287, logger.error → logger.warn)

Testable:
- GUI selects DeepSeek profile → daemon logs "Using GUI-provided profile (N vars)"
- CLI-only spawn → daemon logs "No GUI profile provided, loading CLI local active profile"
- Auth token always present in final extraEnv even if profile sets same key
- Logs show env var keys (not values) for debugging without exposing secrets

Previous behavior:
- Only happy binary available globally
- Developers needed npm run dev:variant for development version
- No global command to quickly access dev data directory
- Required cd to project directory to use npm scripts

What changed:
- package.json: Added happy-dev to bin field (line 13)
- bin/happy-dev.mjs (new, 41 lines): Wrapper binary setting HAPPY_HOME_DIR to ~/.happy-dev and HAPPY_VARIANT to dev
- Follows same pattern as bin/happy.mjs but sets dev environment
- Sets environment before importing dist/index.mjs

Why:
- Enables global happy-dev command from anywhere on system
- Automatic environment: HAPPY_HOME_DIR=~/.happy-dev, HAPPY_VARIANT=dev
- Consistent with happy and happy-mcp global binaries
- No need to cd to project directory or use npm scripts
- Simpler workflow: happy-dev daemon start vs npm run dev:daemon:start

Files affected:
- package.json: Added happy-dev binary (line 13)
- bin/happy-dev.mjs (new, 41 lines): Dev variant wrapper binary

Testable:
- npm install -g . creates /opt/homebrew/bin/happy-dev symlink
- happy-dev --version shows "🔧 DEV MODE - Data: ~/.happy-dev"
- happy-dev daemon status checks ~/.happy-dev/daemon.state.json
- happy (stable) and happy-dev can be used simultaneously
…ile schema

Summary: Extend profile schema to store default permission and model modes

What changed:
- Lines 82-85: Added defaultPermissionMode and defaultModelMode optional fields
- Both fields are strings (validated at usage time)
- Schemas remain backward compatible (optional fields)

Why:
- Permission mode should be profile-specific, not session-specific
- Each AI provider profile can have appropriate default permissions
- Matches GUI schema extension for consistency

Files affected:
- src/persistence.ts: AIBackendProfileSchema updated

Testable: Profiles with permission modes can be created and loaded

Summary: Profiles can now store default session type (simple vs worktree)

What changed:
- Line 82: Added defaultSessionType field to AIBackendProfileSchema
- Type: z.enum(['simple', 'worktree']).optional()

Why:
- Session type should be profile-specific like permission mode
- DeepSeek might default to 'worktree', Anthropic to 'simple', etc.
- Matches GUI schema for compatibility

Files affected:
- src/persistence.ts

Testable: Profiles with session types can be saved and loaded
…for all spawn modes

Previous behavior:
- Tmux mode: ${VAR} expansion worked via shell (export KEY="${VAR}";)
- Non-tmux mode: ${VAR} passed as LITERAL STRING to Node.js spawn
- Z.AI and DeepSeek profiles broken in non-tmux mode
- ANTHROPIC_AUTH_TOKEN="${Z_AI_AUTH_TOKEN}" sent literally to session
- Claude CLI received string "${Z_AI_AUTH_TOKEN}" instead of actual key
- Multi-backend sessions impossible without tmux

What changed:
- src/utils/expandEnvVars.ts: New utility for ${VAR} expansion
  - expandEnvironmentVariables() function with regex replacement
  - Replaces ${VARNAME} with process.env[VARNAME]
  - Keeps ${VAR} unchanged if variable not found (debugging)
  - Works for any Record<string, string> environment object

- src/daemon/run.ts:
  - Line 24: Import expandEnvironmentVariables
  - Line 327: Changed const to let for extraEnv (will be reassigned)
  - Lines 330-334: Expand ${VAR} after merging profileEnv + authEnv
  - Expansion happens BEFORE tmux vs non-tmux decision
  - Both spawn modes now receive expanded values
  - Added debug logging for before/after expansion

How ${VAR} expansion works now:
1. GUI profile has: ANTHROPIC_AUTH_TOKEN=${Z_AI_AUTH_TOKEN}
2. Daemon receives: profileEnv = { ANTHROPIC_AUTH_TOKEN: "${Z_AI_AUTH_TOKEN}" }
3. Daemon merges: extraEnv = { ...profileEnv, ...authEnv }
4. Daemon expands: expandEnvironmentVariables(extraEnv, process.env)
   - Finds ${Z_AI_AUTH_TOKEN} in value
   - Looks up process.env.Z_AI_AUTH_TOKEN
   - Replaces ${Z_AI_AUTH_TOKEN} with actual value
   - Result: { ANTHROPIC_AUTH_TOKEN: "sk-real-key" }
5. Spawn (both modes) gets: ANTHROPIC_AUTH_TOKEN=sk-real-key (actual value)
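
The expansion step in the flow above might look roughly like the following. This is a sketch written from the description rather than the contents of src/utils/expandEnvVars.ts; the variable-name pattern in the regex is an assumption.

```typescript
// Replace ${VARNAME} in every value with sourceEnv[VARNAME].
// Unknown variables are left as ${VARNAME} so the gap stays visible.
function expandEnvironmentVariables(
  env: Record<string, string>,
  sourceEnv: Record<string, string | undefined>,
): Record<string, string> {
  const expanded: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    expanded[key] = env[key].replace(
      /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
      (placeholder: string, varName: string) => {
        const value = sourceEnv[varName];
        return value !== undefined ? value : placeholder;
      },
    );
  }
  return expanded;
}
```

In the daemon the second argument would be process.env. Taking it as a parameter keeps the function pure and easy to unit test.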

Why:
- Multi-backend sessions: Run Anthropic + Z.AI + DeepSeek simultaneously
- Works in all modes: tmux and non-tmux both get expanded values
- Security: Credentials stay in daemon environment, not transmitted from GUI
- Flexibility: Same daemon serves multiple backends via profile selection
- Debugging: Undefined variables kept as ${VAR} (shows what's missing)

Files affected:
- src/utils/expandEnvVars.ts: Variable expansion utility (new file)
- src/daemon/run.ts: Apply expansion before spawning sessions

Testable:
- Launch daemon: Z_AI_AUTH_TOKEN=sk-abc happy daemon start
- Create session with Z.AI profile (has ANTHROPIC_AUTH_TOKEN=${Z_AI_AUTH_TOKEN})
- Non-tmux mode: Session receives ANTHROPIC_AUTH_TOKEN=sk-abc (expanded)
- Tmux mode: Still works as before
- Create second session with Anthropic profile (no substitution)
- Both sessions run simultaneously with different backends
…vironment

Previous behavior:
- Undefined variables silently kept as ${VAR} placeholders
- Session spawned with ANTHROPIC_AUTH_TOKEN="${Z_AI_AUTH_TOKEN}" (literal string)
- Claude CLI failed with cryptic authentication error
- No indication what went wrong

What changed:
- src/utils/expandEnvVars.ts:
  - Line 1: Import logger
  - Line 34: Track undefinedVars array during expansion
  - Lines 43-44: Add varName to undefinedVars when not found in sourceEnv
  - Lines 52-59: Log warning if any variables undefined
    - Lists all undefined variables
    - Shows exact commands to set them
    - Example: "Set these in daemon environment: Z_AI_AUTH_TOKEN=<your-value>"

Warning output example:
```
[EXPAND ENV] Undefined variables referenced in profile environment: Z_AI_AUTH_TOKEN, Z_AI_BASE_URL
[EXPAND ENV] Session may fail to authenticate. Set these in daemon environment before launching:
[EXPAND ENV]   Z_AI_AUTH_TOKEN=<your-value>
[EXPAND ENV]   Z_AI_BASE_URL=<your-value>
```
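
One way to collect the undefined names and build the warning lines shown above is sketched below. The function names are hypothetical, and the real code tracks undefinedVars inside the expansion loop rather than in a separate pass.

```typescript
// Collect every ${VAR} reference that has no value in sourceEnv.
function findUndefinedVariables(
  env: Record<string, string>,
  sourceEnv: Record<string, string | undefined>,
): string[] {
  const undefinedVars: string[] = [];
  for (const key of Object.keys(env)) {
    // A fresh regex per value avoids carrying lastIndex between strings.
    const pattern = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(env[key])) !== null) {
      const varName = match[1];
      if (sourceEnv[varName] === undefined && undefinedVars.indexOf(varName) === -1) {
        undefinedVars.push(varName);
      }
    }
  }
  return undefinedVars;
}

// Build the warning lines; returns an empty list when nothing is missing.
function formatUndefinedWarning(undefinedVars: string[]): string[] {
  if (undefinedVars.length === 0) return [];
  const header = [
    '[EXPAND ENV] Undefined variables referenced in profile environment: '
      + undefinedVars.join(', '),
    '[EXPAND ENV] Session may fail to authenticate. '
      + 'Set these in daemon environment before launching:',
  ];
  return header.concat(
    undefinedVars.map((varName) => '[EXPAND ENV]   ' + varName + '=<your-value>'),
  );
}
```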

Why:
- Easy to use correctly: Clear error message explains what's missing
- Hard to use incorrectly: Warning appears immediately, before session fails
- Actionable: Shows exact variable names and how to set them
- Debugging: Users know to check daemon launch command

Files affected:
- src/utils/expandEnvVars.ts: Warning logging for undefined variables

Testable:
- Launch daemon without Z_AI_AUTH_TOKEN
- Create session with Z.AI profile
- Warning logged: "Undefined variables: Z_AI_AUTH_TOKEN"
- Session spawns but authentication fails (expected)
- User knows to relaunch daemon with Z_AI_AUTH_TOKEN=...
…nt switching

Previous behavior:
- Single data directory (~/.happy/) shared by all instances
- No way to run stable and development versions simultaneously
- Manual environment variable management required
- No visual feedback about which version is running

What changed:
- scripts/env-wrapper.cjs: Cross-platform wrapper setting HAPPY_HOME_DIR per variant
- scripts/setup-dev.cjs: One-command automated setup creating ~/.happy/ and ~/.happy-dev/
- package.json: Added npm scripts for stable/dev variant management
  - npm run stable/dev:variant for any command
  - npm run stable/dev:daemon:start/stop/status for quick daemon control
  - npm run stable/dev:auth for authentication
  - npm run setup:dev for initial setup
- src/configuration.ts: Variant validation and visual indicators (✅ STABLE / 🔧 DEV)
- .gitignore: Added .envrc and .happy-dev/ to ignore list
- README.md: Added comprehensive development section with usage examples
- USAGE.md: Complete workflow guide with troubleshooting
- .envrc.example: Optional direnv configuration for automatic switching

Why:
- Enables concurrent stable and development versions with complete data isolation
- Prevents development work from interfering with production workflows
- Cross-platform support via Node.js (Windows/macOS/Linux)
- Visual feedback ensures users always know which variant is active
- Discoverable commands in package.json reduce cognitive load
- Optional direnv integration for power users

Files affected:
- scripts/env-wrapper.cjs (new): Environment wrapper with visual feedback
- scripts/setup-dev.cjs (new): Automated setup script
- package.json: Added 16 new npm scripts for variant management
- src/configuration.ts: Added variant validation (lines 59-74)
- .gitignore: Added .envrc and .happy-dev/
- README.md: Added 83-line development section
- USAGE.md (new): 191-line comprehensive guide
- .envrc.example (new): Optional direnv configuration

Testable:
- npm run setup:dev creates both ~/.happy/ and ~/.happy-dev/
- npm run stable:daemon:status shows ✅ STABLE and ~/.happy/ data location
- npm run dev:daemon:status shows 🔧 DEV and ~/.happy-dev/ data location
- Both daemons can run simultaneously with separate ports and state
- Visual indicators always display which variant is active
… unspecified

Previous behavior:
- Empty TMUX_SESSION_NAME treated as undefined
- if (!tmuxSessionName) → disabled tmux
- Always fell back to regular shell spawn
- Tmux checkbox in GUI didn't work with empty field

What changed:
- src/daemon/run.ts:
  - Lines 340-342: Added comment explaining empty string behavior
  - Line 346: Changed if (!tmuxSessionName) to if (tmuxSessionName === undefined)
  - Empty string ('') now valid for tmux mode
  - Line 353: Changed if (useTmux && tmuxSessionName) to if (useTmux && tmuxSessionName !== undefined)
  - Lines 355-356: sessionDesc shows "current/most recent session" when empty

- src/utils/tmux.ts (spawnInTmux function):
  - Lines 739-760: Session name resolution logic
    - undefined or empty string → query existing sessions
    - Run: tmux list-sessions -F '#{session_name}'
    - If sessions exist → use first one from list
    - If no sessions → create/use "happy" session
    - Specific name → use that session (create if needed)
  - Line 740: sessionName !== undefined && sessionName !== '' check
  - Lines 748-759: Query and select first existing session
  - Line 765: ensureSessionExists after resolution (concrete name)
  - Line 768: Always uses -t flag with resolved session name

Execution flow (empty session name):
1. GUI sends: TMUX_SESSION_NAME='' (empty string from profile)
2. Daemon: tmuxSessionName = '' → useTmux = true (not undefined)
3. spawnInTmux called with sessionName=''
4. Query: tmux list-sessions -F '#{session_name}'
5. If exists: Use first session (e.g., "work")
6. If none: Use "happy"
7. Create window in resolved session
8. Track with concrete session name
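
The resolution rules in this flow reduce to a small pure function once the tmux query is separated out. Both helper names below are hypothetical. The caller is assumed to run tmux list-sessions -F '#{session_name}' and pass in its standard output.

```typescript
// Turn the output of `tmux list-sessions -F '#{session_name}'` into names.
function parseSessionList(stdout: string): string[] {
  return stdout
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// undefined or '' -> first existing session, else "happy";
// a specific name is used as given.
function resolveTmuxSessionName(
  requested: string | undefined,
  existingSessions: string[],
): string {
  if (requested !== undefined && requested !== '') return requested;
  return existingSessions.length > 0 ? existingSessions[0] : 'happy';
}
```

Because the result is always a concrete name, the daemon can track the session without special-casing the empty value later.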

Why:
- Respects existing tmux setup: Uses user's first session
- Smart fallback: Creates "happy" only when needed
- Deterministic: First session is predictable
- Easy to use correctly: Empty field = smart default
- Hard to use incorrectly: Always has concrete session name for tracking

Files affected:
- src/daemon/run.ts: Handle empty string for tmux
- src/utils/tmux.ts: Query and select first existing session

Testable:
- Have tmux session "work" running
- Create session with empty TMUX_SESSION_NAME
- Session spawns in window in "work" session
- No tmux sessions running
- Create session → spawns in new "happy" session

CRITICAL FIXES:
- tmux.ts:377: Fixed JavaScript array indexing bug (JS arrays do not support negative indices such as [-1])
  * Before: parts[1].split('/')[-1] → always undefined
  * After: Proper array.length-1 indexing
  * Impact: TMUX environment parsing now works correctly

- daemon/run.ts:336: Fixed overly strict auth variable validation
  * Only validates variables that ARE SET (agent-agnostic)
  * Handles Claude OAuth (no ANTHROPIC_AUTH_TOKEN needed)
  * Handles Codex-only sessions (no Claude vars needed)
  * Impact: Sessions no longer fail unnecessarily with OAuth
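
A generic illustration of the array indexing bug. The surrounding parsing code in tmux.ts is not shown in this description, so both helper names are hypothetical.

```typescript
// Buggy: arr[-1] looks up a property named "-1", which arrays do not have.
function lastSegmentBuggy(path: string): string | undefined {
  const segments = path.split('/');
  return segments[-1];
}

// Fixed: index from the end using the array length.
function lastSegment(path: string): string | undefined {
  const segments = path.split('/');
  return segments[segments.length - 1];
}
```

TypeScript accepts the negative index on a plain string[] without complaint, so this kind of bug is not caught by the type checker.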

TEST COVERAGE ADDED:
- src/utils/tmux.test.ts: 51 tests for session identifier parsing/validation
  * Tests the critical array indexing fix
  * Pure unit tests (no tmux installation required)

- src/utils/expandEnvVars.test.ts: 17 tests for ${VAR} expansion
  * Auth token expansion patterns
  * Partial expansion handling
  * Malformed reference handling

DOCUMENTATION:
- CONTRIBUTING.md: Added "Testing Profile Sync Between GUI and CLI" section
  * Step-by-step profile sync testing workflow
  * Schema compatibility testing guidelines
  * Common issues and troubleshooting

- persistence.ts: Added clarifying comments for schema versions
  * CURRENT_PROFILE_VERSION: Semver for AIBackendProfile schema
  * SUPPORTED_SCHEMA_VERSION: Integer for Settings migrations

TEST RESULTS:
✓ 51 tmux tests pass (9ms)
✓ 17 env expansion tests pass (35ms)
✓ Total: 173 tests passed

Files affected:
- src/utils/tmux.ts: Fixed array indexing bug
- src/daemon/run.ts: Fixed auth validation logic
- src/persistence.ts: Added schema version comments
- CONTRIBUTING.md: Added profile sync testing docs
- src/utils/tmux.test.ts: New test file (51 tests)
- src/utils/expandEnvVars.test.ts: New test file (17 tests)

Fixes the blocking issues raised in the maintainer's code review.
…ling

CRITICAL FIX:
- src/persistence.ts:147: Fixed empty string tmux sessionName filtering
  * Before: if (profile.tmuxConfig.sessionName) → WRONG (truthy check)
  * After: if (profile.tmuxConfig.sessionName !== undefined) → CORRECT
  * Impact: Empty string now correctly passed to sessions (matches GUI behavior)
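
The difference between the two checks can be shown in isolation. SketchTmuxConfig and both function names are made up for this example; only the two conditions come from the fix.

```typescript
// Hypothetical, reduced config shape.
interface SketchTmuxConfig {
  sessionName?: string;
}

// Before: a truthy check drops '' because empty strings are falsy.
function tmuxEnvTruthyCheck(config: SketchTmuxConfig): Record<string, string> {
  const env: Record<string, string> = {};
  if (config.sessionName) env['TMUX_SESSION_NAME'] = config.sessionName;
  return env;
}

// After: only a missing value is skipped, so '' is passed through.
function tmuxEnvUndefinedCheck(config: SketchTmuxConfig): Record<string, string> {
  const env: Record<string, string> = {};
  if (config.sessionName !== undefined) env['TMUX_SESSION_NAME'] = config.sessionName;
  return env;
}
```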

Previous behavior:
- GUI documents: empty string = "use current/most recent session"
- GUI correctly uses !== undefined check (settings.ts:171)
- CLI incorrectly used truthy check, dropped empty strings
- Result: GUI-created profiles with sessionName="" silently lost this feature

What changed:
- CLI now matches GUI logic exactly
- Empty strings preserved and passed to TMUX_SESSION_NAME
- Users can use "current/most recent session" feature from GUI

Why:
Cross-repository compatibility - CLI must handle all valid GUI profile configurations.
Empty string is a documented valid value meaning "use current/most recent tmux session".

CROSS-REPO ANALYSIS:
- Added CROSS_REPO_COMPATIBILITY_REPORT.md documenting:
  * 100% schema compatibility verification (17 fields, 7 sub-schemas)
  * Complete API communication flow analysis
  * Security properties (encryption, validation)
  * 3 additional bugs found in GUI repo requiring separate fixes

Files affected:
- src/persistence.ts: Fixed getProfileEnvironmentVariables() tmux sessionName check
- CROSS_REPO_COMPATIBILITY_REPORT.md: NEW comprehensive compatibility analysis

Identified similar bugs in GUI repo (require separate fixes):
- sources/sync/settings.ts:172: tmpDir has same truthy check bug
- sources/utils/sessionUtils.ts:84: Array .pop() without null check
- sources/utils/parseToken.ts:5: Token parsing without validation

Cross-repository validation ensures GUI and CLI maintain 100% compatibility.

Remove CROSS_REPO_COMPATIBILITY_REPORT.md as it's a one-off analysis document
not meant for long-term repository maintenance. The analysis findings have been
incorporated into code fixes and CONTRIBUTING.md documentation.
…ve AI backends

Fixed critical bug where environment variables from AI backend profiles (Z.AI, DeepSeek, etc.) were not being passed to CLI sessions launched in tmux mode, causing authentication failures and preventing alternative backends from working.

Previous behavior:
- spawnInTmux() accepted env parameter but never used it (src/utils/tmux.ts:729)
- daemon/run.ts attempted workaround with export statements in command string
- Environment variables were not set in tmux window's environment
- Z.AI, DeepSeek, and other alternative backends failed to authenticate in tmux mode

What changed:
- src/utils/tmux.ts:790-815: Implemented environment variable handling via tmux -e flag
- src/utils/tmux.ts:200-213: Made env a separate parameter for clarity and efficiency
- src/utils/tmux.ts:806-810: Added comprehensive escaping (backslashes, quotes, dollar signs, backticks)
- src/utils/tmux.ts:793-802: Added validation for undefined values and invalid variable names
- src/daemon/run.ts:378-393: Removed redundant export statements, pass env as third parameter
- Only profile environment variables passed (not process.env) for efficiency

Why:
- Tmux windows inherit environment from tmux server, not from client command
- Must use tmux's -e flag to set variables in window environment
- Proper escaping prevents shell injection attacks
- Validation ensures robust handling of NodeJS.ProcessEnv type (string | undefined)
- Passing only profile vars avoids 50+ unnecessary -e flags and command-line length issues
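
A sketch of how the -e flags could be assembled with the validation and escaping described above. It assumes the tmux command is built as a shell string with each value wrapped in double quotes. The function names and warning texts are hypothetical, and the actual escaping in src/utils/tmux.ts may differ.

```typescript
// Escape characters that stay special inside double quotes in a POSIX shell:
// backslash, double quote, dollar sign, backtick.
function escapeForDoubleQuotes(value: string): string {
  return value.replace(/[\\"$`]/g, '\\$&');
}

// Build one `-e NAME="value"` fragment per valid variable.
function buildTmuxEnvFlags(
  env: Record<string, string | undefined>,
  warn: (message: string) => void,
): string[] {
  const flags: string[] = [];
  for (const varName of Object.keys(env)) {
    const value = env[varName];
    if (value === undefined) {
      warn('Skipping undefined environment variable: ' + varName);
      continue;
    }
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(varName)) {
      warn('Skipping invalid environment variable name: ' + varName);
      continue;
    }
    flags.push('-e ' + varName + '="' + escapeForDoubleQuotes(value) + '"');
  }
  return flags;
}
```

If the command were spawned with an argument array instead of a shell string, the quoting step would be unnecessary, since no shell would interpret the values.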

Files affected:
- src/utils/tmux.ts: Fixed spawnInTmux() to use env parameter, added escaping and validation
- src/daemon/run.ts: Updated to use new spawnInTmux() signature with env as third parameter

Testable:
- Launch session with Z.AI profile in tmux mode: ANTHROPIC_AUTH_TOKEN now correctly set
- Launch session with DeepSeek profile in tmux mode: environment variables pass through
- Environment variables with special characters (quotes, $vars, `backticks`) properly escaped
- Invalid/undefined environment variables skipped with warning logs

Fixed critical bug where RPC handler received environment variables from GUI but dropped them before calling spawnSession, causing all alternative AI backends (Z.AI, DeepSeek, etc.) to fail authentication and fall back to default Anthropic credentials.

Previous behavior:
- RPC handler extracted 6 fields from params but not environmentVariables (src/api/apiMachine.ts:105)
- Passed only {directory, sessionId, machineId, approvedNewDirectoryCreation, agent, token} to spawnSession
- environmentVariables field was in params but not destructured or forwarded (line 112)
- Daemon logs showed "Spawning session with params: {7 environmentVariables}" but then "Final environment variable keys (0)"
- GUI sent 7 environment variables, RPC handler received them, but lost them before spawning
- All sessions spawned with empty environment, causing authentication to fall back to default credentials
- Z.AI sessions connected to Anthropic instead of Z.AI (responded "Sonnet 4.5" instead of "GLM-4.6")

What changed:
- src/api/apiMachine.ts:105: Added environmentVariables to destructuring assignment
- src/api/apiMachine.ts:112: Added environmentVariables to spawnSession call parameters
- Now all 7 fields are extracted and forwarded to spawnSession function
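
The failure mode can be reproduced in a few lines. The reduced SketchSpawnParams type and both forwarding functions are illustrative only and do not mirror the real handler in src/api/apiMachine.ts.

```typescript
// Reduced, hypothetical parameter shape.
interface SketchSpawnParams {
  directory: string;
  sessionId?: string;
  agent?: string;
  environmentVariables?: Record<string, string>;
}

// Before: environmentVariables is present on params but never forwarded.
function forwardParamsBuggy(params: SketchSpawnParams): SketchSpawnParams {
  const { directory, sessionId, agent } = params;
  return { directory, sessionId, agent };
}

// After: the field is destructured and passed along with the others.
function forwardParamsFixed(params: SketchSpawnParams): SketchSpawnParams {
  const { directory, sessionId, agent, environmentVariables } = params;
  return { directory, sessionId, agent, environmentVariables };
}
```

Neither version raises a type error, because environmentVariables is optional on the receiving side. That is why the drop went unnoticed until the logs were compared.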

Why:
- RPC handler is the bridge between GUI (sending environmentVariables via WebSocket) and daemon spawn logic
- Destructuring assignment must extract ALL fields from params to avoid silent data loss
- Omitting a field from a destructuring pattern is a common JavaScript bug: the object carries the field, but it is never read or forwarded
- This bug completely broke GUI profile selection for alternative backends (Z.AI, DeepSeek, Azure, etc.)

Files affected:
- src/api/apiMachine.ts: Added environmentVariables to destructuring and function call (2 locations)

Testable:
- Select Z.AI profile in GUI, create session, ask "what model are you" → should respond "GLM-4.6" not "Sonnet 4.5"
- Daemon logs should show "Using GUI-provided profile environment variables (7 vars)"
- Daemon logs should show "Final environment variable keys (before expansion) (7): ..." not "(0):"
- Alternative backends (DeepSeek, Azure OpenAI) should now authenticate correctly
…ntax

Summary: Profile environment variables now properly expand ${VAR:-default} syntax to use default values when daemon environment variables are not set.

Previous behavior: The expandEnvironmentVariables function only supported simple ${VAR} syntax. When profiles used bash parameter expansion with default values like ${Z_AI_BASE_URL:-https://api.z.ai/api/anthropic}, the function would try to look up the entire string "Z_AI_BASE_URL:-https://api.z.ai/api/anthropic" as a variable name, fail, and keep the unexpanded placeholder.

What changed:
- Parse ${VAR:-default} expressions to separate variable name from default value
- Look up only the variable name in daemon's process.env
- If variable exists, use its value, even if it is an empty string (bash itself would substitute the default for an empty value when :- is used)
- If variable doesn't exist but default provided, use the default value
- If variable doesn't exist and no default, keep placeholder and warn
- Add logging to show which variables are expanded vs using defaults
- Add warning when variable is set but empty (common mistake)
- Mask sensitive values (containing "token", "key", "secret") in logs
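
The default-value rules above can be sketched for a single value as follows. The regex and the function names are assumptions. The masking helper checks the variable name for "token", "key", or "secret", which is one reading of the masking bullet and may not match the actual implementation.

```typescript
// Expand ${VAR} and ${VAR:-default} in one string.
// Set variable (even '') wins; otherwise the default; otherwise keep as-is.
function expandWithDefaults(
  value: string,
  sourceEnv: Record<string, string | undefined>,
): string {
  return value.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g,
    (placeholder: string, varName: string, fallback: string | undefined) => {
      const current = sourceEnv[varName];
      if (current !== undefined) return current;
      if (fallback !== undefined) return fallback;
      return placeholder;
    },
  );
}

// Assumed masking rule: hide values whose variable name looks sensitive.
function maskForLog(varName: string, value: string): string {
  return /token|key|secret/i.test(varName) ? '***' : value;
}
```

A default that itself contains a closing brace would end the match early under this regex, so the sketch only covers simple defaults such as URLs.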

Why: Z.AI and other built-in profiles use ${VAR:-default} syntax to provide sensible defaults while allowing daemon environment variables to override them. Without this fix, these profiles would fail to work unless ALL variables were set in the daemon's environment, defeating the purpose of defaults.

Files affected:
- src/utils/expandEnvVars.ts: Enhanced expansion logic with bash parameter expansion support and better logging

Testable: Create session with Z.AI profile. Check daemon logs show "Using default value for Z_AI_BASE_URL" if Z_AI_BASE_URL not set in daemon environment, or "Expanded Z_AI_BASE_URL from daemon env" if set.
…s and acceptEdits modes by removing hardcoded override and strengthening schema validation

Previous behavior (based on git diff):
- claudeRemote.ts:114 forced all non-'plan' permission modes to 'default' via ternary operator
- persistence.ts:85 accepted any string for defaultPermissionMode (z.string().optional())
- api/types.ts:232 accepted any string for message-level permissionMode (z.string().optional())
- User selections of 'bypassPermissions' or 'acceptEdits' were silently overridden to 'default'
- Invalid permission modes could be stored in profiles and messages without validation

What changed:
- claudeRemote.ts:114 - removed ternary operator, now passes through initial.mode.permissionMode directly to SDK
- persistence.ts:85 - changed schema from z.string().optional() to z.enum(['default', 'acceptEdits', 'bypassPermissions', 'plan']).optional()
- api/types.ts:232 - changed schema from z.string().optional() to z.enum(['default', 'acceptEdits', 'bypassPermissions', 'plan', 'read-only', 'safe-yolo', 'yolo']).optional()

Why:
- The hardcoded override in claudeRemote.ts was the root cause preventing bypassPermissions and acceptEdits from working
- SDK QueryOptions interface (src/claude/sdk/types.ts:169) accepts all four Claude modes: 'default' | 'acceptEdits' | 'bypassPermissions' | 'plan'
- Validation already occurs upstream in runClaude.ts:170-182 against the whitelist ['default', 'acceptEdits', 'bypassPermissions', 'plan']
- Schema validation strengthening prevents invalid modes from being stored, matching happy app's MessageMetaSchema (sources/sync/typesMessageMeta.ts:6)
- Ensures consistency across the communication pipeline between happy app and happy-cli
- Codex pathway already correctly passes through mode without override (verified in runCodex.ts:621,629)

Files affected:
- src/claude/claudeRemote.ts:114 - removed hardcoded permission mode override to 'default'
- src/persistence.ts:85 - strengthened AIBackendProfile schema validation for Claude modes
- src/api/types.ts:232 - strengthened MessageMeta schema validation for all modes (Claude + Codex)

Testable:
- In happy app, select bypassPermissions mode for a session
- Send message requiring tool approval
- Verify tools are auto-approved without permission prompts (bypass behavior)
- Verify mode persists across multiple messages
- Test all modes: default, acceptEdits, bypassPermissions, plan
- Verify TypeScript compilation passes (yarn typecheck)
…e and Codex modes

Previous behavior (based on git diff):
- types.ts:3 imported PermissionMode from '@/claude/loop' which only included 4 Claude modes
- AgentState.completedRequests[].mode used this Claude-only type but handles both Claude and Codex sessions
- Created type/runtime mismatch: TypeScript type allowed 4 modes but Zod schema validated 7 modes
- Codex modes ('read-only', 'safe-yolo', 'yolo') were not included in the imported type

What changed:
- types.ts:3-8 - removed import of Claude-specific PermissionMode type
- types.ts:8 - defined complete PermissionMode type with all 7 modes (Claude + Codex)
- Added documentation comment noting it must match MessageMetaSchema.permissionMode enum
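
One way to keep a TypeScript type and a runtime whitelist from drifting apart is to derive the type from the list. The sketch below is only an illustration of that idea using the seven mode names from this commit; the constant names and the `isPermissionMode` guard are hypothetical and are not claimed to exist in src/api/types.ts.

```typescript
// Illustrative only; mode names come from the commit text, everything else is assumed.
const CLAUDE_MODES = ["default", "acceptEdits", "bypassPermissions", "plan"] as const;
const CODEX_MODES = ["read-only", "safe-yolo", "yolo"] as const;
const ALL_MODES = [...CLAUDE_MODES, ...CODEX_MODES] as const;

// Deriving the union from the runtime list means both always contain the same 7 modes.
type PermissionMode = (typeof ALL_MODES)[number];

// Runtime check that narrows an arbitrary value to PermissionMode.
function isPermissionMode(value: unknown): value is PermissionMode {
  return typeof value === "string" && (ALL_MODES as readonly string[]).includes(value);
}
```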

Why:
- AgentState type is used by both Claude and Codex sessions, so needs complete type definition
- The Zod schema MessageMetaSchema.permissionMode already correctly validates all 7 modes
- This eliminates type/runtime inconsistency where TypeScript type was narrower than runtime validation
- Ensures type safety for Codex permission modes in AgentState.completedRequests

Files affected:
- src/api/types.ts:3-8 - defined complete PermissionMode type instead of importing Claude-only type

Testable:
- Verify TypeScript compilation passes (yarn typecheck)
- Type system now correctly allows all 7 permission modes in AgentState
- Matches MessageMetaSchema enum: ['default', 'acceptEdits', 'bypassPermissions', 'plan', 'read-only', 'safe-yolo', 'yolo']
Previous behavior:
- new-window command had duplicate -t flags (manual + automatic)
- Working directory (cwd option) was completely ignored
- Command execution used unreliable two-step process (create window, then send-keys)

What changed:
- Removed manual -t from createWindowArgs, let executeTmuxCommand add it via session parameter
- Added -c flag support to set working directory from options.cwd
- Changed to single-step: pass command directly to new-window (executes on creation)
- Removed unreliable send-keys step
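
A rough sketch of the single-step argument list described above, assuming a hypothetical helper named `buildNewWindowArgs`. In the PR the -t flag is added by executeTmuxCommand; this sketch folds that step in so it is easy to see the flag appears exactly once. The flags used (-t, -n, -c, and a trailing shell command) are standard `tmux new-window` options.

```typescript
// Hypothetical helper; the real logic lives in src/utils/tmux.ts and may differ.
interface NewWindowOptions {
  windowName?: string;
  cwd?: string;
  command?: string;
}

function buildNewWindowArgs(session: string, options: NewWindowOptions): string[] {
  // -t is emitted once, here, and nowhere else.
  const args = ["new-window", "-t", session];
  if (options.windowName) args.push("-n", options.windowName);
  if (options.cwd) args.push("-c", options.cwd); // start directory for the new window
  // The shell command goes last; tmux runs it as the window is created.
  if (options.command) args.push(options.command);
  return args;
}
```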

Why:
- Duplicate -t flags caused tmux to reject the command
- Missing cwd meant sessions started in wrong directory
- Direct command passing is more reliable than send-keys

Files affected:
- src/utils/tmux.ts:785-836 - Complete rewrite of window creation logic

Testable:
- Create tmux session with profile in specific directory
- Window should appear in correct tmux session
- Happy CLI should start in specified working directory
- Environment variables should be properly set
Previous behavior:
- setTimeout hack created fake session ID after 2 seconds
- No actual connection between tmux spawn and Happy CLI session
- GUI received meaningless session ID that couldn't communicate with process
- Complex webhook matching systems attempted to solve this problem

What changed:
- tmux.ts: Added -P -F "#{pane_pid}" to spawnInTmux() to get real PID immediately
- daemon/run.ts: Use tmux PID in existing pidToAwaiter pattern (exact same as regular sessions)
- Tmux sessions now follow identical flow to regular non-tmux sessions
- No more setTimeout hacks or complex webhook matching needed
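
With -P, tmux prints information about the new window, and -F "#{pane_pid}" restricts that output to the PID of the pane's process. A minimal sketch of turning that stdout into a usable PID might look like this; `parsePanePid` is a hypothetical name, not a function from the PR.

```typescript
// Hypothetical parser for the stdout of: tmux new-window -P -F "#{pane_pid}" ...
function parsePanePid(stdout: string): number | null {
  const text = stdout.trim();
  // Anything other than a plain run of digits (for example an error message) is rejected.
  if (!/^\d+$/.test(text)) return null;
  const pid = Number.parseInt(text, 10);
  return pid > 0 ? pid : null;
}
```

Returning null instead of throwing lets the caller decide whether to fall back to regular (non-tmux) spawning.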

Why this is better:
- Uses tmux's native PID retrieval capability (-P flag)
- Follows existing daemon patterns exactly (pidToTrackedSession + pidToAwaiter)
- Minimal changes (only 38 insertions, 27 deletions)
- No race conditions or complex correlation logic
- Happy CLI webhook matches by PID just like regular sessions
- GUI gets real session ID that works for communication

Files affected:
- src/utils/tmux.ts:827-859 - Added -P flag and PID extraction
- src/daemon/run.ts:396-445 - Use tmux PID in existing awaiter pattern

Testable:
- Create session with tmux enabled should work immediately
- Session ID should be real and communication should work
- Should follow same flow as non-tmux sessions
CRITICAL FIX: tmux sessions were missing daemon's environment variables

Previous issue:
- Regular sessions: env = { ...process.env, ...extraEnv }
- Tmux sessions: only got extraEnv variables via -e flags
- Missing daemon's expanded auth variables (e.g., ANTHROPIC_AUTH_TOKEN from Z_AI_AUTH_TOKEN)
- This would cause tmux sessions to fail authentication despite profile configuration

What changed:
- daemon/run.ts: Pass complete environment (process.env + extraEnv) to tmux sessions
- tmux.ts: Updated comment to clarify environment variable inheritance
- Added proper TypeScript filtering for process.env values
- Ensures tmux sessions have identical environment to regular sessions
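
The merge described above (daemon environment first, profile variables on top, undefined entries filtered out) could be sketched as below. The function name `mergeEnvironment` is an assumption for illustration; the environment is passed in explicitly so the sketch does not depend on Node's global process object.

```typescript
// Hypothetical sketch; not the code from src/daemon/run.ts.
type RawEnv = Record<string, string | undefined>;

function mergeEnvironment(daemonEnv: RawEnv, extraEnv: RawEnv): Record<string, string> {
  const merged: Record<string, string> = {};
  // Later sources override earlier ones, mirroring { ...process.env, ...extraEnv }.
  for (const source of [daemonEnv, extraEnv]) {
    for (const [key, value] of Object.entries(source)) {
      // Skip undefined so the result is a plain string-to-string map.
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}
```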

Why this matters:
- Happy CLI expects ANTHROPIC_AUTH_TOKEN environment variable
- expandEnvironmentVariables() converts Z_AI_AUTH_TOKEN → ANTHROPIC_AUTH_TOKEN in daemon
- Without passing daemon's env, tmux sessions would only get profile's ${VAR} placeholders
- Now tmux sessions work identically to regular sessions for all auth backends

Files affected:
- src/daemon/run.ts:389-399 - Pass complete environment to tmux
- src/utils/tmux.ts:797-825 - Updated environment variable inheritance comment

Testable:
- Z.AI profile in tmux should now have ANTHROPIC_AUTH_TOKEN properly set
- All tmux sessions should work identically to regular sessions
- Environment variable expansion should work in both regular and tmux modes
…available

Previous behavior: CLI crashed with unhandled 404 error when machine registration endpoint missing on server

What changed:
- src/api/api.ts (lines 123-184): wrap axios.post() in try-catch block
- Catch only 404 errors specifically (axios.isAxiosError && status === 404)
- Return local machine object with same encryption keys when 404 occurs
- All other errors (network, auth, 5xx) rethrow unchanged to preserve existing error handling

Why: Server endpoint changed to api.cluster-fluster.com (commit 0e2577c), but the new server lacks the POST /v1/machines endpoint. The mobile app uses GET /v1/machines (works), but the CLI uses POST (returns 404). This fix lets development continue while the server endpoint is unavailable.

Security analysis:
- Encryption keys derived identically in both paths (this.credential.encryption.machineKey)
- Return type unchanged: Promise<Machine> with same fields
- No plaintext exposure (404 path sends nothing to network - more secure)
- MachineMetadata contains only JSON-serializable strings (no data loss from skipping encrypt/decrypt round-trip)
- Version fields (0) are standard initial values for first sync
- dataEncryptionKey skipped (expected - used for remote session sync, not needed when server unavailable)

Caller impact:
- runClaude.ts:68 - return value unused (zero impact)
- runCodex.ts:93 - return value unused (zero impact)
- daemon/run.ts:643 - uses machine.id, encryptionKey, encryptionVariant (all provided identically)

Testable: Run `happy-dev` - should show warning "Machine registration endpoint not available (404)" and continue instead of crashing
…ck mapping

Previous behavior: Codex crashed with undefined approvalPolicy/sandbox when receiving Claude-specific permission modes (bypassPermissions, acceptEdits, plan) from GUI

What changed:
- runClaude.ts (lines 174-189): Map Codex modes to Claude equivalents with defensive fallback
  - yolo → bypassPermissions (full access equivalent)
  - safe-yolo → default (conservative: ask for permissions)
  - read-only → default (Claude lacks read-only, ask for permissions)
  - Unknown modes → default with warning log

- runCodex.ts (lines 67, 146): Use shared PermissionMode type from api/types for cross-agent compatibility

- runCodex.ts (lines 150-158): Remove restrictive validation, accept all permission modes (will be mapped in switch)

- runCodex.ts (lines 616-644): Add defensive fallback cases for Claude modes
  - bypassPermissions → approval='on-failure', sandbox='danger-full-access' (yolo equivalent)
  - acceptEdits → approval='on-request', sandbox='workspace-write' (let model decide)
  - plan → approval='untrusted', sandbox='workspace-write' (conservative)
  - default case for unknown modes
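
The two mapping tables above can be written down compactly. The sketch below only encodes the mappings this commit message spells out; `toClaudeMode`, `CLAUDE_TO_CODEX`, and the `warn` callback are hypothetical names, and the Codex handling of its own native modes is deliberately left out because it is not described here.

```typescript
// Hypothetical sketch of the cross-agent mode mapping; values come from the commit text.
type ClaudeMode = "default" | "acceptEdits" | "bypassPermissions" | "plan";

const CLAUDE_MODE_LIST: readonly string[] = ["default", "acceptEdits", "bypassPermissions", "plan"];

// Codex -> Claude: native Claude modes pass through, Codex modes are mapped.
function toClaudeMode(mode: string, warn: (message: string) => void = () => {}): ClaudeMode {
  if (CLAUDE_MODE_LIST.includes(mode)) return mode as ClaudeMode;
  switch (mode) {
    case "yolo":
      return "bypassPermissions"; // full access equivalent
    case "safe-yolo":
    case "read-only":
      return "default"; // conservative: ask for permissions
    default:
      warn("Unknown permission mode '" + mode + "', falling back to default");
      return "default";
  }
}

interface CodexPolicy {
  approval: string;
  sandbox: string;
}

// Claude -> Codex for the three Claude-only modes listed in this commit.
const CLAUDE_TO_CODEX: Record<"bypassPermissions" | "acceptEdits" | "plan", CodexPolicy> = {
  bypassPermissions: { approval: "on-failure", sandbox: "danger-full-access" },
  acceptEdits: { approval: "on-request", sandbox: "workspace-write" },
  plan: { approval: "untrusted", sandbox: "workspace-write" },
};
```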

Why: GUI could send incompatible permission modes in edge cases (backward compatibility, saved sessions, manual API calls). Without defensive fallback, Codex received undefined approvalPolicy/sandbox causing session failures. Mappings prioritize safety while preserving closest semantic equivalents.

Semantic mapping rationale:
- bypassPermissions ≈ yolo: Both skip all permissions with full access
- acceptEdits ≈ on-request approval: Let model decide when to ask (closest to auto-approve edits)
- plan ≈ untrusted: Conservative fallback (Codex lacks planning mode)
- safe-yolo ≈ default: Conservative fallback (different failure-handling semantics)
- read-only → default: Claude lacks read-only mode, use safe interactive default

Files affected:
- src/claude/runClaude.ts: Add Codex→Claude mode mapping
- src/codex/runCodex.ts: Add Claude→Codex mode mapping + use shared PermissionMode type

Testable: Send bypassPermissions mode to Codex session - should map to yolo behavior (danger-full-access + on-failure) instead of crashing
….71.0

- Bump claude-code from 2.0.24 to 2.0.53 (latest)
- Bump sdk from 0.65.0 to 0.71.0
- Fix MessageQueue.ts to import SDK types from local module instead of
  @anthropic-ai/claude-code (SDK entrypoint was removed in 2.0.25)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…istence-profile-integration-01WqaAvCxRr6eWW2Wu33e8xP
Claude Code 2.0.64+ rejects --session-id when used with --continue, causing:
"Error: --session-id cannot be used with --continue or --resume"

Solution: Convert --continue to --resume by finding the last session ID.

Changes:
- Add claudeFindLastSession() utility to find most recent valid session
- claudeLocal.ts: Intercept --continue flag and convert to --resume with last session
- Handle all session-related flags (--continue, --resume, --session-id) transparently
- Only UUID-format sessions are returned for --resume compatibility
- Add comprehensive unit tests for the --continue conversion logic
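
As a rough illustration of the conversion described above, the sketch below rewrites --continue into --resume with the last session ID. It is an assumption-laden sketch, not the claudeLocal.ts code: `convertContinueToResume` is a hypothetical name, `findLastSession` stands in for claudeFindLastSession(), and dropping the flag when no UUID-shaped session is found is an assumed fallback.

```typescript
// Hypothetical sketch; the real implementation may handle more flags and edge cases.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function convertContinueToResume(
  args: readonly string[],
  findLastSession: () => string | null,
): string[] {
  if (!args.includes("--continue")) return [...args];
  // filter() returns a new array; the caller's array is left untouched.
  const rest = args.filter((arg) => arg !== "--continue");
  const last = findLastSession();
  // Only UUID-shaped IDs are passed to --resume (assumed fallback: no resume flag at all).
  return last !== null && UUID_PATTERN.test(last) ? [...rest, "--resume", last] : rest;
}
```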

Fixes critical bug where "happy --continue" failed with flag conflict error.
When Happy API server is unreachable, the CLI was crashing with uncaught
exceptions. Now handles connection errors gracefully and continues in offline mode.

Changes:
- api.ts: Add connection error handling (ECONNREFUSED, ENOTFOUND, ETIMEDOUT)
  - getOrCreateSession: Returns null when server unreachable
  - getOrCreateMachine: Returns minimal Machine object when server unreachable
  - Updated return types to reflect null handling
- runClaude.ts, runCodex.ts: Handle null API responses with graceful exit
- Show clear user message: "⚠️ Happy server unreachable - continuing in local mode"
- Add comprehensive unit tests for server error scenarios

This allows users to continue using Happy CLI in local mode even when
the server is temporarily unavailable.
- Merge fix/claude-continue-session-id-conflict branch
- Resolves --continue/--session-id flag conflict in Claude 2.0.64+
- Adds graceful server unreachable handling
- Implements comprehensive error handling for both connection errors and 404s
- Maintains backward compatibility and consistent UX

Files merged:
- src/api/api.ts: Enhanced error handling with connection error detection
- src/api/api.test.ts: Comprehensive test coverage for error scenarios
- src/claude/runClaude.ts: Server unreachable graceful exit
- src/codex/runCodex.ts: Server unreachable graceful exit

Resolves:
- "happy --continue fails with session-id cannot be used error"
- "happy crashes when server is unreachable"
- Fix axios mocking in api.test.ts by creating mock functions before vi.mock()
- Add global test metadata objects to avoid repetition
- Fix constructor access to use static ApiClient.create() method
- Fix socket.io-client mocking in apiSession.test.ts
- Add proper session object with required metadata
- All 8 API tests now pass
When Happy servers are unreachable, Claude/Codex now continue running
locally instead of exiting. Background reconnection attempts use
exponential backoff (delay starts at 5s and is capped at 60s) with unlimited retries.

Previous behavior:
- Server unreachable at startup → process.exit(1)
- User loses their work context

What changed:
- src/utils/offlineReconnection.ts: NEW shared utility with:
  - Exponential backoff using existing exponentialBackoffDelay()
  - Unlimited retries (delay caps at 60s, retries continue forever)
  - Auth failure detection (401 stops retrying)
  - Race condition handling (cancel during async ops)
  - Generic TSession type for backend transparency
- src/utils/offlineReconnection.test.ts: NEW 24 comprehensive tests
- src/claude/runClaude.ts: Offline fallback using claudeLocal() with
  hot reconnection via sessionScanner (syncs all JSONL messages)
- src/codex/runCodex.ts: Offline fallback with session stub that
  swaps to real session on reconnection
- src/api/api.ts: Return null on connection errors for graceful handling
- src/api/api.test.ts: Tests for connection error handling
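
The retry policy described above might be sketched like this. The PR reuses an existing exponentialBackoffDelay() whose exact curve is not shown in this message, so the doubling factor below is an assumption; only the 5s start, the 60s cap, unlimited retries, and stopping on 401 come from the text. All names are hypothetical.

```typescript
// Assumed parameters: 5s base, doubling per attempt, capped at 60s.
const BASE_DELAY_MS = 5000;
const MAX_DELAY_MS = 60000;

function reconnectDelayMs(attempt: number): number {
  // Clamp the exponent so very large attempt counts cannot overflow.
  const exponent = Math.min(Math.max(attempt, 0), 10);
  return Math.min(BASE_DELAY_MS * Math.pow(2, exponent), MAX_DELAY_MS);
}

type AttemptResult = "connected" | "unreachable" | "unauthorized";

// Stop on success or on an auth failure (401); otherwise keep retrying forever.
function nextAction(result: AttemptResult, attempt: number): { retryInMs: number } | "stop" {
  if (result === "connected" || result === "unauthorized") return "stop";
  return { retryInMs: reconnectDelayMs(attempt) };
}
```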

User experience:
- Startup offline: "⚠️ Happy server unreachable - running Claude locally"
- On reconnect: "✅ Reconnected! Session syncing in background."
- Auth failure: "❌ Authentication failed. Please re-authenticate."
…new-session-wizard-ux-improvements-merged

Resolved conflict in src/api/api.test.ts by keeping both:
- 404 endpoint test from offline mode branch
- Non-connection error re-throw test from target branch
Previous behavior: When server was unreachable, three separate warning
messages would print from different call sites (api.getOrCreateSession,
api.getOrCreateMachine, and runClaude/runCodex), resulting in confusing
output like:
  ⚠️  Happy server unreachable - working in offline mode
  ⚠️  Happy server unreachable - working in offline mode
  ⚠️  Happy server unreachable - running Claude locally

What changed:
- offlineReconnection.ts: Added OfflineState class with simple online/offline
  state machine that prints warning ONCE on first offline transition
- offlineReconnection.ts: Added OfflineFailure type with operation, caller,
  errorCode, and url fields for detailed error context
- offlineReconnection.ts: Added ERROR_DESCRIPTIONS map for human-readable
  error code translations (ECONNREFUSED, ETIMEDOUT, etc.)
- api.ts: Changed console.log() to connectionState.fail() with full context
- runClaude.ts, runCodex.ts: Added connectionState.setBackend() before API
  calls, removed redundant printOfflineWarning() calls
- api.test.ts, offlineReconnection.test.ts: Updated assertions to use
  expect.stringContaining() and added connectionState.reset() in beforeEach
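
A minimal sketch of a once-only offline warning, assuming a simplified state machine. The field names on OfflineFailure follow this commit message; the class body, the injected `print` function, and the message wording are illustrative assumptions rather than the real OfflineState.

```typescript
// Hypothetical sketch of a once-only offline warning.
interface OfflineFailure {
  operation: string;
  caller?: string;
  errorCode?: string;
  url?: string;
}

class OfflineState {
  private offline = false;
  private failures: OfflineFailure[] = [];
  private print: (line: string) => void;

  constructor(print: (line: string) => void) {
    this.print = print;
  }

  // Record every failure, but print only on the online -> offline transition.
  fail(failure: OfflineFailure): void {
    this.failures.push(failure);
    if (this.offline) return;
    this.offline = true;
    const code = failure.errorCode ?? "unknown";
    this.print("Happy server unreachable: " + failure.operation + " failed (" + code + ")");
  }

  // Return to the online state, e.g. between tests or after reconnecting.
  reset(): void {
    this.offline = false;
    this.failures = [];
  }

  isOffline(): boolean {
    return this.offline;
  }
}
```

Injecting the print function keeps the class testable without capturing console output.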

New output format shows consolidated warning with actionable details:
  ⚠️  Happy server unreachable - running Claude locally

  Failed:
  • Session creation: server not accepting connections (ECONNREFUSED) [api.getOrCreateSession]

  → Local work continues normally
  → Will reconnect automatically when server available
…nErrors

Previous behavior:
- offlineReconnection.ts handled all server errors including 403/409
- 403/409 showed "server unreachable" message (semantically wrong - server responded)
- Lost recovery action: no `happy doctor clean` guidance for re-auth conflicts
- Minimal machine object duplicated 3 times (DRY violation)
- ERROR_DESCRIPTIONS not exported (poor discoverability)
- path.test.ts mocked node:os which leaked to sessionScanner tests

What changed:
- src/utils/offlineReconnection.ts → src/utils/serverConnectionErrors.ts
  - Renamed for accurate description (connection errors, not just offline)
  - Export ERROR_DESCRIPTIONS for discoverability
  - Added `details?: string[]` to OfflineFailure for multi-line context
  - Updated module documentation

- src/api/api.ts
  - Extract createMinimalMachine() helper (DRY - 4 call sites)
  - 403/409 uses direct console.log (NOT connectionState) with recovery action:
    "Run 'happy doctor clean' to reset local state"
  - 5xx uses connectionState.fail() with details for auto-reconnect
  - All HTTP error handling in catch block (axios throws on non-2xx)

- src/claude/utils/path.test.ts
  - Remove vi.mock('node:os') that leaked to other tests
  - Use CLAUDE_CONFIG_DIR env var (code already supports it)
  - Cross-platform compatible, works with both npm and bun

- Updated imports in api.test.ts, runClaude.ts, runCodex.ts

Why:
- 403/409 are server rejections, not "server unreachable" - semantic accuracy
- Users need `happy doctor clean` recovery action for re-auth conflicts
- Exported ERROR_DESCRIPTIONS helps developers find error handling code
- File rename improves discoverability: serverConnectionErrors describes content

Testable:
- All 144 tests pass (0 fail)
- HAPPY_SERVER_URL=http://localhost:59999 happy --print "test"
  Shows: "Machine registration failed: ECONNREFUSED - server not accepting connections"
…Y refactor

Summary: Fixes lost context in machine registration errors by adding specific
recovery instructions and consolidates duplicate code into helper function.

Previous behavior: 403/409 errors showed generic "server unreachable" message
without explaining the cause or providing recovery steps. Minimal machine
object was duplicated 3 times. vi.mock hoisting caused test failures.

What changed:
- src/api/api.ts: Added createMinimalMachine() helper (DRY), 403/409 now shows
  specific message with recovery action ("happy doctor clean"), uses
  isNetworkError() helper instead of inline checks
- src/api/api.test.ts: Fixed vi.mock hoisting with vi.hoisted(), updated
  import path, adjusted test expectations for new message format
- src/claude/claudeLocal.test.ts: Fixed vi.mock hoisting with vi.hoisted()
- src/claude/runClaude.ts, src/codex/runCodex.ts: Updated import paths
- src/claude/utils/path.test.ts: Replaced vi.mock('node:os') with env var
  approach to prevent mock leaking between test files
- Renamed offlineReconnection.ts -> serverConnectionErrors.ts for clarity

Why: The original error message was lost in a previous refactor. Users seeing
403/409 need to know it's a re-auth conflict and run "happy doctor clean".
The vi.hoisted pattern ensures mocks work correctly with vitest.

Files affected:
- src/api/api.ts: 403/409 handling, DRY helper, import path
- src/api/api.test.ts: vi.hoisted fix, test expectations
- src/claude/claudeLocal.test.ts: vi.hoisted fix
- src/claude/runClaude.ts: import path update
- src/codex/runCodex.ts: import path update
- src/claude/utils/path.test.ts: env var approach for test isolation
- src/utils/serverConnectionErrors.ts: renamed from offlineReconnection.ts
- src/utils/serverConnectionErrors.test.ts: renamed

Testable: npm test or bunx vitest run src/api/api.test.ts
Reverts:
- README.md: restore claude-code-router link (line 39)
- package.json: restore version 0.12.0 (was 0.12.0-0)
- src/index.ts: restore --claude-env parsing (lines 271-284)

Fixes:
- src/api/api.test.ts:7-10: apply vi.hoisted pattern for vitest compatibility

The vi.hoisted() wrapper ensures mock variables are available during
vitest's module hoisting phase, fixing "Cannot access 'mockFn' before
initialization" errors.
Reverts:
- README.md: restore claude-code-router link (line 39)
- package.json: restore version 0.12.0 (was 0.12.0-0)
- src/index.ts: restore --claude-env parsing (lines 271-284)

Removes:
- src/api/apiSession.test.ts: belongs in server-unreachable PR, not here

Fixes:
- src/claude/claudeLocal.test.ts:5-8: apply vi.hoisted pattern for vitest compatibility

The vi.hoisted() wrapper ensures mock variables are available during
vitest's module hoisting phase, fixing "Cannot access 'mockFn' before
initialization" errors.

Note: One test (should remove --continue from claudeArgs after conversion)
has a pre-existing mismatch with the implementation: the test expects
the claudeArgs array to be mutated, but the implementation uses array
filtering, which creates a new array.
Changes require('../runtime.js') to import from '../runtime'
for TypeScript compatibility. The source file is runtime.ts,
not runtime.js.
Incorporates tiann's improvements from slopus#83:
- stdio suppression for cleaner which/where execution
- Existence check before symlink resolution (safer)
- Fallback to original path if resolution fails

Combined with our branch's additions:
- Source detection (npm, Bun, Homebrew, etc.)
- PATH-first priority per @Enzime's suggestion

Detection priority:
  PATH > npm > Bun > Homebrew > native

Credit: @tiann (slopus#83)
Adds environment variable for explicit Claude CLI path override.

Detection priority:
  HAPPY_CLAUDE_PATH > PATH > npm > Bun > Homebrew > native

This allows users to specify a custom Claude CLI location without
modifying their PATH.
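
The priority order above amounts to "explicit override first, then the first probe that finds something". A small sketch under that reading follows; `resolveClaudePath` and the `Probe` shape are hypothetical, and whether the real code validates that the HAPPY_CLAUDE_PATH value exists is not stated here.

```typescript
// Hypothetical resolver; each probe returns a path or null.
interface Probe {
  source: string;
  find: () => string | null;
}

function resolveClaudePath(
  env: Record<string, string | undefined>,
  probes: readonly Probe[],
): { source: string; path: string } | null {
  // The explicit override beats every detection method.
  const override = env["HAPPY_CLAUDE_PATH"];
  if (override) return { source: "HAPPY_CLAUDE_PATH", path: override };
  // Probes are tried in the order given, e.g. PATH, npm, Bun, Homebrew, native.
  for (const probe of probes) {
    const path = probe.find();
    if (path) return { source: probe.source, path };
  }
  return null;
}
```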
Merges feature/bun-support-claude-cli-detection branch with:
- Bun runtime detection and CLI discovery
- HAPPY_CLAUDE_PATH env var override
- PATH fallback from tiann's PR slopus#83 (stdio suppression, existence check)
- Source detection (npm, Bun, Homebrew, native installer)

Detection priority:
  HAPPY_CLAUDE_PATH > PATH > npm > Bun > Homebrew > native

Also fixes test issues:
- apiSession.test.ts: vi.hoisted pattern for mock hoisting
- claudeLocal.test.ts: check spawn args instead of array mutation
- runtime.test.ts: ESM imports instead of require()

Credit: @tiann (slopus#83)
Previous behavior: Test checked if original claudeArgs array was mutated
to remove --continue, but filter() creates a new array.

What changed: Check mockSpawn.mock.calls[0][1] (actual spawn arguments)
instead of the original claudeArgs array.

Why: The implementation uses filter() which returns a new array rather
than mutating the original. The test should verify what actually gets
passed to spawn, not internal implementation details.
When --resume is passed without a session ID, Happy now delegates to
Claude's native session picker instead of auto-finding the last session.

Changes:
- extractFlag: return {found: false} when value required but missing
- Spawn args: check for --resume in claudeArgs before adding --session-id

This preserves all existing behavior:
- --resume <id>: extracts ID, calls onSessionFound(), passes to Claude
- --continue: still auto-finds last session
- New sessions: still generate UUID
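
To make the extractFlag change concrete, here is a sketch of a flag extractor that reports {found: false} when a required value is missing, leaving the bare flag in place so Claude can show its own picker. The signature and the `rest` field are assumptions for illustration; the real extractFlag in claudeLocal.ts may be shaped differently.

```typescript
// Hypothetical shape; only the "found: false when value is missing" rule comes from the commit.
interface FlagResult {
  found: boolean;
  value?: string;
  rest: string[];
}

function extractFlag(args: readonly string[], flag: string, requiresValue: boolean): FlagResult {
  const index = args.indexOf(flag);
  if (index === -1) return { found: false, rest: [...args] };
  if (!requiresValue) {
    return { found: true, rest: args.filter((_, i) => i !== index) };
  }
  const next = args[index + 1];
  // Missing value (end of args or another flag): report not found and keep args as-is.
  if (next === undefined || next.startsWith("-")) return { found: false, rest: [...args] };
  return {
    found: true,
    value: next,
    rest: args.filter((_, i) => i !== index && i !== index + 1),
  };
}
```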

Fixes picker mode for interactive session selection.
Merges the minimal --resume picker fix into the feature branch.
Resolves conflicts in claudeLocal.test.ts by keeping both test changes.
…etOrCreateSession()

Previous behavior:
getOrCreateSession() at src/api/api.ts:120 threw on all HTTP 5xx errors,
causing Happy to crash with "Error: Failed to get or create session: Request
failed with status code 500" instead of continuing in offline mode.

What changed:
- src/api/api.ts:120-132 - Added 5xx error handling before final throw
  - Check if error is axios error with response status >= 500
  - Call connectionState.fail() with error details
  - Return null to enable offline mode with auto-reconnect
  - Matches existing pattern from getOrCreateMachine() lines 234-242
- src/api/api.test.ts:174-218 - Added 3 new test cases
  - Test 500 Internal Server Error returns null
  - Test 503 Service Unavailable returns null
  - Verify existing throw behavior for non-5xx errors

Why:
When Happy server returns transient errors (500, 503, etc.), users should be
able to continue in local mode instead of experiencing a crash. This matches
the established pattern where getOrCreateMachine() already handles 5xx errors
gracefully by returning a minimal machine object.

Files affected:
- src/api/api.ts - getOrCreateSession() error handling (14 lines added)
- src/api/api.test.ts - Test coverage for 5xx errors (46 lines added)

Testing:
- TDD Red: New tests failed with "Failed to get or create session: Unknown error"
- TDD Green: All 9 api.test.ts tests pass after fix
- Full suite: 146 tests pass, no regressions
- Manual: happy --resume with server 500 error continues in offline mode

Impact:
- Users can now use Happy when server has transient errors
- Offline mode with auto-reconnect activates automatically for 5xx errors
- 401/403/400 errors still throw (require user action, not offline-eligible)
- Preserves existing behavior for network errors (ECONNREFUSED, ETIMEDOUT)
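
The resulting decision table (network errors and 5xx go offline, everything else still throws) can be sketched as a small classifier. This is a simplified stand-in: the PR uses axios.isAxiosError and connectionState.fail(), whereas `classifySessionError` and the `HttpLikeError` shape below are hypothetical and avoid any axios dependency.

```typescript
// Hypothetical classifier mirroring the behavior listed in this commit message.
type ErrorClass = "offline" | "throw";

const NETWORK_CODES: readonly string[] = ["ECONNREFUSED", "ENOTFOUND", "ETIMEDOUT"];

interface HttpLikeError {
  code?: string;
  response?: { status: number };
}

function classifySessionError(error: HttpLikeError): ErrorClass {
  // No response and a known network code: the server could not be reached.
  if (!error.response && error.code !== undefined && NETWORK_CODES.includes(error.code)) {
    return "offline";
  }
  // Transient server failure: continue locally and reconnect later.
  if (error.response && error.response.status >= 500) return "offline";
  // 400/401/403 and anything else need user action, so they still throw.
  return "throw";
}
```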
Summary: Integrates PR slopus#99 (hook-based session tracking) and PR slopus#98
(Gemini/ACP integration) while preserving all offline mode and robustness
features from the feature branch, plus DRY improvements.

What changed:
- Session hook server system for reliable Claude session ID capture
- Gemini backend as alternative to Claude/Codex with ACP integration
- Fixed memory leak in session.ts (stored interval reference + cleanup())
- Added socket connection check in apiSession.ts
- Created shared offlineSessionStub.ts utility (DRY refactor)
- Added gemini to profile compatibility validation
- Preserved: offline mode, 5xx handling, profile system, tmux integration

Files affected:
- src/claude/runClaude.ts: Hook server + offline fallback integration
- src/claude/session.ts: RAII cleanup, callback system, hookSettingsPath
- src/claude/loop.ts: hookSettingsPath threading
- src/claude/claudeLocal.ts: Hook parameters
- src/claude/claudeRemote.ts: mapToClaudeMode() preserved
- src/api/api.ts: Graceful 5xx handling preserved
- src/api/apiSession.ts: Socket check restored
- src/daemon/run.ts: Gemini added to agent selection
- src/persistence.ts: Added gemini to ProfileCompatibilitySchema
- src/utils/offlineSessionStub.ts: New shared DRY utility
- src/codex/runCodex.ts: Updated to use shared offline stub
- src/gemini/runGemini.ts: Added offline mode with shared stub
- src/agent/*, src/gemini/*: New from main (no conflicts)
- scripts/session_hook_forwarder.cjs: Hook forwarder from main

Testable: npm test (266 pass, 8 fail - baseline unchanged)
Summary: Extract shared patterns between Codex and Gemini backends into
reusable base classes and utilities, eliminating ~886 lines of duplicate
code across permission handlers, reasoning processors, and session setup.

Previous behavior:
- Codex and Gemini had nearly identical permission handlers (~178 and ~244 lines, respectively)
- Codex and Gemini had nearly identical reasoning processors (~263 and ~279 lines, respectively)
- Session metadata creation duplicated in runCodex.ts and runGemini.ts
- Offline reconnection setup duplicated in both run files
- PermissionMode type duplicated in gemini/types.ts
- Missing connectionState.setBackend('Gemini') call

What changed:
- package.json: Update scripts to use $npm_execpath for package manager portability
- src/utils/BasePermissionHandler.ts: New abstract base class for permission handlers
- src/utils/BaseReasoningProcessor.ts: New abstract base class for reasoning processors
- src/utils/createSessionMetadata.ts: Extract shared session metadata creation
- src/utils/setupOfflineReconnection.ts: Extract offline reconnection pattern
- src/codex/utils/permissionHandler.ts: Simplified to 57 lines extending base
- src/gemini/utils/permissionHandler.ts: Simplified to 141 lines extending base
- src/codex/utils/reasoningProcessor.ts: Simplified to 44 lines extending base
- src/gemini/utils/reasoningProcessor.ts: Simplified to 47 lines extending base
- src/codex/runCodex.ts: Use createSessionMetadata + setupOfflineReconnection
- src/gemini/runGemini.ts: Add connectionState.setBackend + use shared utilities
- src/gemini/types.ts: Remove duplicate PermissionMode, import from @/api/types

Why:
- Single source of truth for permission/reasoning behavior
- Easier to add new backends (just extend base classes)
- Reduced maintenance burden from duplicate code drift
- Package manager portability for npm/yarn/bun users

Files affected:
- 4 new utility files in src/utils/
- 8 modified files with simplified implementations
- Net: 886 lines removed, 667 lines added = 219 lines saved

Testable: yarn build && yarn test (242 passed, 12 skipped)
… issues

Previous behavior:
- Test files using module-scope mock variables failed with "Cannot access
  before initialization" because vi.mock() factories are hoisted above variable
  declarations
- expandEnvVars.test.ts failed with "logger.warn is not a function" due to
  missing logger mock
- claude_version_utils.test.ts failed on macOS due to /tmp symlink to /private/tmp
- Permission handlers held stale session references after onSessionSwap during
  offline reconnection, causing RPC handlers to register on disconnected sessions
- reset() method had potential race condition with concurrent calls and Map
  iteration during mutation

What changed:
- Use vi.hoisted() in api.test.ts, apiSession.test.ts, claudeLocal.test.ts to
  ensure mock functions are available when vi.mock() factories execute
- Add logger mock with all methods (debug, warn, info, error) to expandEnvVars.test.ts
- Use fs.realpathSync() in claude_version_utils.test.ts to normalize macOS symlinks
- Add updateSession() method to BasePermissionHandler for session reference updates
- Update onSessionSwap callbacks in runCodex.ts and runGemini.ts to call
  permissionHandler.updateSession() when session is swapped
- Add isResetting guard to reset() for idempotent behavior
- Snapshot Map entries before iteration in reset() to prevent mutation during loop

Why:
- Vitest hoists vi.mock() factories to top of file, requiring vi.hoisted() for
  mock function declarations that the factory references
- Logger mock needed because expandEnvVars imports and uses logger
- macOS creates /tmp as symlink to /private/tmp, causing path comparison failures
- After offline reconnection, onSessionSwap swaps the session but permission
  handlers still referenced the old disconnected session
- Concurrent cleanup operations could cause undefined behavior in reset()
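
The last two points can be sketched as follows. This is a minimal illustration under assumptions, not the code from `BasePermissionHandler.ts`: only the names `updateSession`, `reset`, and `isResetting` come from the commit message, while `SessionLike`, `registerHandler`, `track`, and the return value of `reset()` are hypothetical. The sketch is synchronous, so it shows the re-entrant case (a cancel callback calling `reset()` again) rather than interleaved async calls.

```typescript
// Hypothetical sketch: only updateSession, reset and isResetting are named in the commit.
interface SessionLike {
  id: string;
  registerHandler(name: string): void;
}

class HandlerSketch {
  private session: SessionLike;
  private pending = new Map<string, (reason: string) => void>();
  private isResetting = false;

  constructor(session: SessionLike) {
    this.session = session;
  }

  // Called from an onSessionSwap-style callback after offline reconnection,
  // so later registrations go to the live session instead of the stale one.
  updateSession(next: SessionLike): void {
    this.session = next;
    this.session.registerHandler("permission");
  }

  // Remember a pending request together with its cancel callback.
  track(id: string, onCancel: (reason: string) => void): void {
    this.pending.set(id, onCancel);
  }

  // Cancel everything pending; returns how many entries were cancelled.
  reset(): number {
    if (this.isResetting) return 0; // guard: a nested call is a no-op
    this.isResetting = true;
    try {
      // Snapshot first, so callbacks that touch the Map cannot disturb the loop.
      const snapshot = Array.from(this.pending.entries());
      this.pending.clear();
      for (const [, cancel] of snapshot) cancel("reset");
      return snapshot.length;
    } finally {
      this.isResetting = false;
    }
  }
}
```

Clearing the Map before invoking the callbacks is a design choice of this sketch: anything a callback adds during the loop survives the reset instead of being silently dropped.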

Files affected:
- src/api/api.test.ts: vi.hoisted() for mockPost, mockIsAxiosError
- src/api/apiSession.test.ts: vi.hoisted() for mockIo
- src/claude/claudeLocal.test.ts: vi.hoisted() for mockSpawn, mockClaudeFindLastSession
- src/utils/expandEnvVars.test.ts: Add vi.mock for @/ui/logger
- scripts/claude_version_utils.test.ts: Use realpathSync for path comparison
- src/utils/BasePermissionHandler.ts: Add updateSession(), idempotent reset()
- src/codex/runCodex.ts: Call permissionHandler.updateSession() in onSessionSwap
- src/gemini/runGemini.ts: Call permissionHandler.updateSession() in onSessionSwap

Test results: 242 passed, 12 skipped (daemon integration tests skip when server unreachable)
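
For the macOS `/tmp` fix mentioned in this commit, a minimal sketch of the comparison technique is shown here. The helper name `samePath` is hypothetical and is not taken from `claude_version_utils.test.ts`; only `fs.realpathSync()` is named in the commit message.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: resolve symlinks on both sides before comparing, so a path
// under /tmp and the same path under /private/tmp compare equal on macOS.
// Note that fs.realpathSync throws if either path does not exist.
function samePath(a: string, b: string): boolean {
  return fs.realpathSync(a) === fs.realpathSync(b);
}

// Usage: a real directory and a symlink pointing at it resolve to the same path.
const realDir = fs.mkdtempSync(path.join(os.tmpdir(), "realpath-sketch-"));
const linkPath = realDir + "-link";
fs.symlinkSync(realDir, linkPath, "dir");
const matches = samePath(realDir, linkPath);
fs.unlinkSync(linkPath);
fs.rmdirSync(realDir);
```

A plain string comparison of `realDir` and `linkPath` would report a mismatch, which is the kind of failure the commit describes.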
@ahundt
Author

ahundt commented Dec 30, 2025

@bra1nDump ok I've merged it, I'll be testing it out for a day or two.

@bra1nDump
Contributor

Thank you! Okay, let me know. I am planning to do maintenance this coming weekend: merge this, release the new CLI, and release the new mobile app.

Did you get a chance to add a feature flag for the new chat composer change in the app?

@ahundt
Author

ahundt commented Dec 30, 2025

not yet for the feature flag, hopefully a bit later this week.


4 participants