feat: Added voice feature for end-user fix #325 #405
📝 Walkthrough
Adds a push-to-talk voice input subsystem (faster-whisper-based) with a new VoiceInputHandler, CLI integration (voice command and --mic), Windows-specific console fallbacks, voice docs and tests, optional
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant CLI as CortexCLI
participant VIH as VoiceInputHandler
participant Audio as Audio System
participant Model as Whisper Model
User->>CLI: run `cortex voice` or pass `--mic`
CLI->>VIH: get_voice_handler(model?) / ensure dependencies
VIH->>Audio: check microphone availability
Audio-->>VIH: device OK
Note over User,VIH: Push-to-talk or single-shot
User->>Audio: press hotkey and speak
Audio->>VIH: stream audio chunks
User->>Audio: release hotkey
VIH->>VIH: stop recording, assemble buffer
VIH->>Model: lazy load & transcribe(audio)
Model-->>VIH: text/segments
VIH->>CLI: deliver transcription callback
CLI->>User: display / run install/ask flow
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
CLA Verification Passed: All contributors have signed the CLA.
Pull request overview
This PR adds voice input capability to Cortex Linux, enabling users to speak commands instead of typing them. The implementation uses local speech-to-text processing with faster-whisper for privacy and low latency. However, there are several critical issues that need to be addressed before merging.
Key Changes
- New voice input module with support for continuous and single-shot voice commands
- Optional `[voice]` dependency group in pyproject.toml
- Windows compatibility improvements in branding module (ASCII fallback for console output)
- Comprehensive documentation for voice features
- 20 unit tests for voice functionality
Reviewed changes
Copilot reviewed 12 out of 42 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_voice.py | Unit tests for voice input handler with mocked dependencies |
| requirements.txt | Issue: Voice dependencies incorrectly added as required instead of optional |
| pyproject.toml | Proper optional voice dependency group configuration |
| myenv/* | Critical: Entire virtual environment directory committed (should be excluded) |
| docs/VOICE_INPUT.md | Comprehensive user documentation for voice features |
| cortex/branding.py | Windows compatibility with ASCII fallback for special characters |
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
requirements.txt (1)
9-25: Duplicate PyYAML dependencies with inconsistent versions.
There are three occurrences of PyYAML in this file:
- Line 9: `PyYAML>=6.0.0`
- Line 21: `pyyaml>=6.0.0`
- Line 25: `PyYAML==6.0.3`
This creates confusion and potential version conflicts. Keep only one entry with a consistent version constraint.
🔎 Proposed fix
     # Configuration
     PyYAML>=6.0.0
     # Environment variable loading from .env files
     python-dotenv>=1.0.0
     # Encryption for environment variable secrets
     cryptography>=42.0.0
     # Terminal UI
     rich>=13.0.0
    -# Configuration
    -pyyaml>=6.0.0
    -
     # Type hints for older Python versions
     typing-extensions>=4.0.0
    -PyYAML==6.0.3
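Duplicates like the PyYAML entries above can be caught mechanically. The following is a minimal sketch, not part of the PR: the helper name `find_duplicate_requirements` is hypothetical, and it assumes simple `name<specifier>` lines, normalizing names by lowercasing and collapsing runs of `-`, `_` and `.`.

```python
import re
from collections import defaultdict
from typing import Dict, List


def find_duplicate_requirements(text: str) -> Dict[str, List[str]]:
    """Return {normalized name: [requirement lines]} for packages listed more than once."""
    seen: Dict[str, List[str]] = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        # The package name is everything before the first specifier, extra, or marker.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        normalized = re.sub(r"[-_.]+", "-", name).lower()
        seen[normalized].append(line)
    return {name: lines for name, lines in seen.items() if len(lines) > 1}


sample = """\
# Configuration
PyYAML>=6.0.0
rich>=13.0.0
pyyaml>=6.0.0
PyYAML==6.0.3
"""
duplicates = find_duplicate_requirements(sample)
print(duplicates)
```

A check of this kind could run in CI or a pre-commit hook so that case-only variants of the same package name do not slip through.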
♻️ Duplicate comments (4)
myenv/Scripts/deactivate.bat (1)
1-22: Part of virtual environment that should not be committed.
This file is part of the `myenv/` virtual environment directory that should be removed from version control. See the comment on `myenv/pyvenv.cfg` for details.
myenv/share/man/man1/isympy.1 (1)
1-188: Third-party package artifact that should not be committed.
This is a SymPy man page installed in the virtual environment. It's part of the `myenv/` directory that should be removed from version control. See the comment on `myenv/pyvenv.cfg` for details.
myenv/Scripts/activate.bat (1)
1-34: Part of virtual environment with hardcoded developer paths.
This file contains a hardcoded path (`C:\Users\sahil\...`) on line 11 and is part of the `myenv/` virtual environment that should be removed. See the comment on `myenv/pyvenv.cfg` for details.
myenv/Scripts/Activate.ps1 (1)
1-528: Part of virtual environment that should not be committed.
This PowerShell activation script is part of the `myenv/` virtual environment directory that should be removed from version control. See the comment on `myenv/pyvenv.cfg` for details.
🧹 Nitpick comments (4)
docs/VOICE_INPUT.md (1)
66-83: Add language identifier to fenced code blocks for markdown lint compliance.
Static analysis flagged missing language specifiers. For terminal output and diagrams, use `text` or `console` as the language identifier.
🔎 Suggested fix
Line 66:
    -```
    +```text
     $ cortex voice
Line 146:
    -```
    +```text
     ┌──────────────┐ ┌──────────────┐
cortex/voice.py (3)
296-307: Recording indicator bypasses branding utilities.
The recording indicator at line 302 uses raw string formatting `" CX | "` instead of the `cx_print` function from `cortex/branding.py`. This creates inconsistency in terminal output, especially on Windows where `cx_print` uses ASCII-only icons.
🔎 Suggested approach
Consider using `console.print` with Rich markup for consistency, or document why direct stdout is necessary (e.g., for `\r` carriage return updates).
If carriage return updates are required, you could use Rich's `Live` context or `Status` for animated updates instead of raw stdout manipulation.
425-428: Busy-wait loop could be replaced with event-based waiting.
The infinite loop with `time.sleep(0.1)` is a busy-wait pattern. While functional, it wastes CPU cycles polling. Consider using an `Event.wait()` pattern that blocks until signaled.
🔎 Alternative approach
    +    self._exit_event = threading.Event()
    +
         try:
             # Keep the main thread alive
    -        while True:
    -            time.sleep(0.1)
    +        self._exit_event.wait()
         except KeyboardInterrupt:
             cx_print("\nVoice mode exited.", "info")
Then set `self._exit_event.set()` in the `stop()` method.
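The event-based approach can be tried in isolation. This is a minimal sketch under stated assumptions: `ExitSignal` is a hypothetical stand-in for the handler, and a `threading.Timer` simulates `stop()` being called from another thread such as a hotkey listener.

```python
import threading
from typing import Optional


class ExitSignal:
    """Hypothetical stand-in for the part of the handler that waits for shutdown."""

    def __init__(self) -> None:
        self._exit_event = threading.Event()

    def wait_for_exit(self, timeout: Optional[float] = None) -> bool:
        # Blocks without polling; returns True once stop() has set the event.
        return self._exit_event.wait(timeout)

    def stop(self) -> None:
        self._exit_event.set()


exit_signal = ExitSignal()
# Simulate stop() arriving from another thread shortly after the wait begins.
threading.Timer(0.05, exit_signal.stop).start()
stopped = exit_signal.wait_for_exit(timeout=2.0)
print(stopped)
```

One design consideration: a wait with no timeout may not respond promptly to Ctrl+C on some platforms, so waiting with a finite timeout inside a loop can be a reasonable compromise where `KeyboardInterrupt` handling matters.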
482-487: `except Exception: pass` silently swallows errors.
The `except Exception` clause at line 485 catches and ignores every ordinary exception raised while stopping the listener. It does not catch `KeyboardInterrupt` or `SystemExit` (those derive from `BaseException`), but it can still mask programming errors during debugging.
🔎 Suggested fix
         if self._hotkey_listener:
             try:
                 self._hotkey_listener.stop()
    -        except Exception:
    -            pass
    +        except OSError:
    +            # Listener may already be stopped
    +            pass
             self._hotkey_listener = None
Or log the exception at debug level:
    except Exception as e:
        logging.debug("Error stopping hotkey listener: %s", e)
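To illustrate the logging variant, here is a small self-contained sketch. `FakeListener` and `stop_listener` are hypothetical names used only for this example; the point is that ordinary errors get recorded, while `KeyboardInterrupt` and `SystemExit` are outside the reach of `except Exception`.

```python
import logging

logger = logging.getLogger("voice_cleanup_sketch")


class FakeListener:
    """Hypothetical listener whose stop() fails, to exercise the error path."""

    def stop(self) -> None:
        raise RuntimeError("listener already stopped")


def stop_listener(listener) -> bool:
    """Stop a listener, logging ordinary errors instead of hiding them."""
    try:
        listener.stop()
        return True
    except Exception as exc:  # KeyboardInterrupt and SystemExit still propagate
        logger.debug("Error stopping hotkey listener: %s", exc)
        return False


result = stop_listener(FakeListener())
# Both exit-style exceptions derive from BaseException, not Exception.
interrupt_is_exception = issubclass(KeyboardInterrupt, Exception)
print(result, interrupt_is_exception)
```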
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (29)
- myenv/Scripts/coloredlogs.exe is excluded by !**/*.exe
- myenv/Scripts/cortex.exe is excluded by !**/*.exe
- myenv/Scripts/ct2-fairseq-converter.exe is excluded by !**/*.exe
- myenv/Scripts/ct2-marian-converter.exe is excluded by !**/*.exe
- myenv/Scripts/ct2-openai-gpt2-converter.exe is excluded by !**/*.exe
- myenv/Scripts/ct2-opennmt-py-converter.exe is excluded by !**/*.exe
- myenv/Scripts/ct2-opennmt-tf-converter.exe is excluded by !**/*.exe
- myenv/Scripts/ct2-opus-mt-converter.exe is excluded by !**/*.exe
- myenv/Scripts/ct2-transformers-converter.exe is excluded by !**/*.exe
- myenv/Scripts/distro.exe is excluded by !**/*.exe
- myenv/Scripts/dotenv.exe is excluded by !**/*.exe
- myenv/Scripts/f2py.exe is excluded by !**/*.exe
- myenv/Scripts/hf.exe is excluded by !**/*.exe
- myenv/Scripts/httpx.exe is excluded by !**/*.exe
- myenv/Scripts/humanfriendly.exe is excluded by !**/*.exe
- myenv/Scripts/isympy.exe is excluded by !**/*.exe
- myenv/Scripts/markdown-it.exe is excluded by !**/*.exe
- myenv/Scripts/numpy-config.exe is excluded by !**/*.exe
- myenv/Scripts/onnxruntime_test.exe is excluded by !**/*.exe
- myenv/Scripts/openai.exe is excluded by !**/*.exe
- myenv/Scripts/pip.exe is excluded by !**/*.exe
- myenv/Scripts/pip3.12.exe is excluded by !**/*.exe
- myenv/Scripts/pip3.exe is excluded by !**/*.exe
- myenv/Scripts/pyav.exe is excluded by !**/*.exe
- myenv/Scripts/pygmentize.exe is excluded by !**/*.exe
- myenv/Scripts/python.exe is excluded by !**/*.exe
- myenv/Scripts/pythonw.exe is excluded by !**/*.exe
- myenv/Scripts/tiny-agents.exe is excluded by !**/*.exe
- myenv/Scripts/tqdm.exe is excluded by !**/*.exe
📒 Files selected for processing (13)
- cortex/branding.py
- cortex/cli.py
- cortex/voice.py
- docs/VOICE_INPUT.md
- myenv/Scripts/Activate.ps1
- myenv/Scripts/activate
- myenv/Scripts/activate.bat
- myenv/Scripts/deactivate.bat
- myenv/pyvenv.cfg
- myenv/share/man/man1/isympy.1
- pyproject.toml
- requirements.txt
- tests/test_voice.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
- tests/test_voice.py
- cortex/voice.py
- cortex/branding.py
- cortex/cli.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/test_voice.py
{setup.py,setup.cfg,pyproject.toml,**/__init__.py}
📄 CodeRabbit inference engine (AGENTS.md)
Use Python 3.10 or higher as the minimum supported version
Files:
pyproject.toml
🧬 Code graph analysis (3)
tests/test_voice.py (1)
cortex/voice.py (9)
- VoiceInputHandler (39-495)
- _ensure_dependencies (87-125)
- _check_microphone (153-175)
- transcribe (226-270)
- stop (477-495)
- VoiceInputError (21-24)
- MicrophoneNotFoundError (27-30)
- ModelNotFoundError (33-36)
- get_voice_handler (498-517)
cortex/voice.py (1)
cortex/branding.py (1)
cx_print(52-82)
cortex/cli.py (1)
cortex/voice.py (4)
- VoiceInputError (21-24)
- VoiceInputHandler (39-495)
- start_voice_mode (399-432)
- record_single (434-475)
🪛 GitHub Actions: CI
cortex/cli.py
[error] 580-580: Ruff check failed: W293 Blank line contains whitespace. Command: 'ruff check . --output-format=github'.
🪛 GitHub Check: lint
cortex/cli.py
[failure] 591-591: Ruff (W293)
cortex/cli.py:591:1: W293 Blank line contains whitespace
[failure] 588-588: Ruff (W293)
cortex/cli.py:588:1: W293 Blank line contains whitespace
[failure] 580-580: Ruff (W293)
cortex/cli.py:580:1: W293 Blank line contains whitespace
🪛 GitHub Check: Lint
cortex/cli.py
[failure] 591-591: Ruff (W293)
cortex/cli.py:591:1: W293 Blank line contains whitespace
[failure] 588-588: Ruff (W293)
cortex/cli.py:588:1: W293 Blank line contains whitespace
[failure] 580-580: Ruff (W293)
cortex/cli.py:580:1: W293 Blank line contains whitespace
🪛 LanguageTool
docs/VOICE_INPUT.md
[grammar] ~141-~141: Ensure spelling is correct
Context: ...apture** - Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribe...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~141-~141: Ensure spelling is correct
Context: ... Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribes using `fa...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.18.1)
docs/VOICE_INPUT.md
66-66: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
146-146: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Agent
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.10)
🔇 Additional comments (14)
cortex/branding.py (3)
8-14: Good Windows compatibility improvements for Rich console.
The platform-aware Console initialization with `force_terminal=True` and `legacy_windows` based on platform detection is a solid approach for cross-platform terminal rendering.
63-79: Well-structured platform-specific icon fallbacks.
The ASCII fallback icons for Windows (`|`, `+`, `!`, `x`, `*`) are appropriate replacements for the Unicode characters that may not render correctly in Windows terminals. The conditional structure is clean and maintainable.
92-102: Consistent platform-specific separators.
The separator adjustments in `cx_step` and `cx_header` align with the icon changes above, ensuring consistent visual appearance across platforms.
pyproject.toml (1)
72-79: Voice optional dependencies are correctly structured.
The optional dependency group is properly defined with appropriate version constraints compatible with Python 3.10+. Including it in the `all` extra is appropriate for comprehensive installation.
docs/VOICE_INPUT.md (1)
1-46: Well-structured documentation with comprehensive coverage.
The documentation covers all essential aspects: installation, usage modes (continuous and single), configuration options, troubleshooting, privacy considerations, and API reference. The privacy-first approach (local processing, no audio uploads) is well highlighted.
tests/test_voice.py (1)
10-11: Good test organization with clear class-based grouping.
The test suite is well-organized into logical test classes (`TestVoiceInputHandler`, `TestVoiceInputExceptions`, `TestGetVoiceHandler`, `TestRecordingState`), each with focused test methods and appropriate fixtures. The mocking strategy for optional dependencies is sound.
Also applies to: 199-200, 224-225, 268-269
cortex/cli.py (3)
540-552: Good implementation with proper dependency handling.
The voice method correctly handles the optional dependency import with a helpful error message guiding users to install the voice extras. The API key and provider checks follow the established pattern in the codebase.
560-577: Consider case-insensitive verb matching and edge cases.
The install verb detection uses `startswith()` after lowercasing, which is good. However, the verb removal at lines 574-577 operates on the original `text` but uses `len(verb)` from the lowercase version, which works correctly since length is preserved. The logic is sound.
One edge case: if the user says "Install", the slicing `text[len(verb):]` correctly preserves the original casing of the software name.
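The verb-stripping logic described in this comment can be sketched as follows. The verb list and the function name `strip_install_verb` are assumptions for illustration; the actual list in `cortex/cli.py` is not shown in this thread. Note that `str.lower()` can change the length of a string for a few non-ASCII characters, so the length argument relies on the verbs themselves being plain ASCII.

```python
from typing import Optional

# Hypothetical verb list; the real one in cortex/cli.py may differ.
INSTALL_VERBS = ("install ", "set up ", "add ")


def strip_install_verb(text: str) -> Optional[str]:
    """Return the software name with its original casing, or None if no verb matches."""
    lowered = text.lower()
    for verb in INSTALL_VERBS:
        if lowered.startswith(verb):
            # Match on the lowercased copy, slice the original to keep its casing.
            return text[len(verb):].strip()
    return None


print(strip_install_verb("Install Docker"))
print(strip_install_verb("what is nginx"))
```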
1633-1638: Help table correctly advertises new voice capabilities.
The Rich help table now includes voice command entries, making the feature discoverable to users.
cortex/voice.py (5)
21-36: Well-designed exception hierarchy.
The exception classes are properly structured with a base `VoiceInputError` and specific subclasses for different error conditions. Docstrings are present as required by coding guidelines. The `pass` statements are appropriate for simple exception classes.
51-85: Good initialization with sensible defaults and type hints.
The `__init__` method has proper type hints, comprehensive docstrings, and sensible defaults. The environment variable fallback for `model_name` provides flexibility. Internal state management with explicit initialization is clean.
226-270: Transcription method is well-implemented with proper validation.
The `transcribe` method correctly:
- Handles lazy model loading (line 238-239)
- Returns early for empty audio (line 241-242)
- Validates dtype before processing (line 245-246)
- Uses appropriate VAD parameters for voice commands
- Strips and joins segment texts properly
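The early return and the strip-and-join step listed above can be shown without the speech model. This is only a sketch: `Segment` is a hypothetical stand-in for a transcription segment exposing a `.text` field, and `run_model` is a placeholder callable, not the faster-whisper API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Sequence


@dataclass
class Segment:
    """Hypothetical stand-in for a transcription segment with a .text field."""

    text: str


def join_segments(segments: Iterable[Segment]) -> str:
    """Strip each segment and join the non-empty ones with single spaces."""
    parts = (seg.text.strip() for seg in segments)
    return " ".join(part for part in parts if part)


def transcribe_sketch(
    samples: Sequence[float],
    run_model: Callable[[Sequence[float]], Iterable[Segment]],
) -> str:
    if len(samples) == 0:  # early return for empty audio, as in the review notes
        return ""
    return join_segments(run_model(samples))


def fake_model(samples: Sequence[float]) -> Iterable[Segment]:
    # Placeholder model that always "hears" the same two segments.
    return [Segment(" install"), Segment("docker "), Segment("   ")]


print(transcribe_sketch([0.0, 0.1], fake_model))
```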
498-517: Factory function provides clean API.
The `get_voice_handler` factory function has proper type hints and docstring. It provides a simpler interface for common use cases while still allowing customization.
177-202: Audio recording implementation is robust.
The `_start_recording` method properly:
- Clears the buffer before starting
- Uses a callback-based approach for non-blocking capture
- Sets up appropriate stream parameters (16kHz, mono, float32)
- Handles errors by resetting state and raising a specific exception
@Sahilbhatane remove all those virtual environment files. You should never push those.
yeah ik, .gitignore didn't recognized them ig.
Actionable comments posted: 1
♻️ Duplicate comments (4)
cortex/voice.py (1)
486-490: Improve exception handling in hotkey listener cleanup.
The `except Exception: pass` silently swallows all errors. Following the pattern used at line 498 for the audio stream, add debug logging to track potential issues during cleanup.
🔎 Proposed fix
         if self._hotkey_listener:
             try:
                 self._hotkey_listener.stop()
    -        except Exception:
    -            pass
    +        except Exception as e:
    +            logging.debug("Error stopping hotkey listener: %s", e)
             self._hotkey_listener = None
cortex/cli.py (3)
554-558: Unused `provider` variable.
The `provider` variable is retrieved at line 558 but never used in the `voice` method. The `process_voice_command` callback already retrieves the API key and uses `self.ask()` and `self.install()`, which internally get the provider themselves.
🔎 Consider removing unused variable
     api_key = self._get_api_key()
     if not api_key:
         return 1
    -
    -provider = self._get_provider()
579-600: Fix trailing whitespace to resolve pipeline failures.
Based on past review comments, lines 580, 588, and 591 contain trailing whitespace on blank lines, causing CI/linter failures. These must be fixed for the PR to pass.
🔎 Remove trailing whitespace
Ensure lines 580, 588, and 591 are completely empty (no spaces or tabs):
     cx_print(f"Installing: {software}", "info")
    -
    +
     # Ask user for confirmation
     console.print()
     console.print("[bold cyan]Choose an action:[/bold cyan]")
     console.print(" [1] Dry run (preview commands)")
     console.print(" [2] Execute (run commands)")
     console.print(" [3] Cancel")
     console.print()
    -
    +
     try:
         choice = input("Enter choice [1/2/3]: ").strip()
    -
    +
         if choice == "1":
2000-2015: Add error handling for VoiceInputError and its subclasses.
The `install --mic` flow catches `ImportError` but doesn't handle `VoiceInputError`, `MicrophoneNotFoundError`, or `ModelNotFoundError`. If voice recording fails (e.g., no microphone detected, model loading fails), an unhandled exception will propagate.
🔎 Proposed fix
     if getattr(args, "mic", False):
         try:
    -        from cortex.voice import VoiceInputHandler
    +        from cortex.voice import VoiceInputError, VoiceInputHandler
             handler = VoiceInputHandler()
             cx_print("Press F9 to speak what you want to install...", "info")
             software = handler.record_single()
             if not software:
                 cx_print("No speech detected.", "warning")
                 return 1
             cx_print(f"Installing: {software}", "info")
         except ImportError:
             cli._print_error("Voice dependencies not installed.")
             cx_print("Install with: pip install cortex-linux[voice]", "info")
             return 1
    +    except VoiceInputError as e:
    +        cli._print_error(str(e))
    +        return 1
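Because the specific errors subclass `VoiceInputError`, a single handler for the base class covers all of them. The sketch below redefines the three exception classes locally so it can run on its own; `run_mic_flow` and its `record_single` parameter are hypothetical stand-ins for the CLI path, not the project's code.

```python
from typing import Callable


class VoiceInputError(Exception):
    """Local stand-in for the base exception named in the review."""


class MicrophoneNotFoundError(VoiceInputError):
    """Local stand-in: no microphone available."""


class ModelNotFoundError(VoiceInputError):
    """Local stand-in: speech model could not be loaded."""


def run_mic_flow(record_single: Callable[[], str]) -> int:
    """Return a CLI-style exit code: 0 on success, 1 on any voice failure."""
    try:
        software = record_single()
    except VoiceInputError as exc:  # also catches both subclasses
        print(f"Error: {exc}")
        return 1
    if not software:
        print("No speech detected.")
        return 1
    print(f"Installing: {software}")
    return 0


def no_microphone() -> str:
    raise MicrophoneNotFoundError("No microphone detected")


print(run_mic_flow(no_microphone))
print(run_mic_flow(lambda: "docker"))
```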
🧹 Nitpick comments (1)
.gitignore (1)
14-20: Consolidate duplicate entries for maintainability.
The `.gitignore` file contains numerous redundant entries. For example:
- `env/`, `venv/`, `ENV/` appear at lines 14–15, 143–145
- `.mypy_cache/`, `.pytest_cache/`, `.coverage`, `htmlcov/` appear at lines 70–71 and 146–149
This duplication reduces readability and makes future maintenance harder.
🔎 Suggested cleanup: Remove duplicates and consolidate sections
     # ==============================
     # Logs & Misc
     # ==============================
     *.log
     logs/
     *.tmp
     *.bak
     *.swp
    -.env
    -.venv
    -env/
    -venv/
    -ENV/
    -.mypy_cache/
    -.pytest_cache/
    -.coverage
    -htmlcov/
     *.out
     *~
     *.swo
Then, verify that all non-duplicate entries are already covered in their respective sections above (Virtual Environments, mypy/Pyre/pytype, etc.).
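Repeated patterns of this kind are easy to list with a short script. This is an illustrative sketch, not part of the PR: `duplicate_patterns` is a hypothetical helper, and it treats two lines as equal after trimming whitespace, which is a simplification of how Git reads ignore files.

```python
from collections import Counter
from typing import List


def duplicate_patterns(gitignore_text: str) -> List[str]:
    """Return ignore patterns that occur more than once, in first-seen order."""
    patterns = [
        line.strip()
        for line in gitignore_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")  # skip blanks and comments
    ]
    counts = Counter(patterns)  # Counter keeps insertion order
    return [pattern for pattern in counts if counts[pattern] > 1]


sample = """\
# Virtual Environments
env/
venv/
# Logs & Misc
*.log
venv/
env/
"""
repeated = duplicate_patterns(sample)
print(repeated)
```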
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (23)
- myenv/Scripts/bandit-baseline.exe is excluded by !**/*.exe
- myenv/Scripts/bandit-config-generator.exe is excluded by !**/*.exe
- myenv/Scripts/bandit.exe is excluded by !**/*.exe
- myenv/Scripts/black.exe is excluded by !**/*.exe
- myenv/Scripts/blackd.exe is excluded by !**/*.exe
- myenv/Scripts/coverage-3.12.exe is excluded by !**/*.exe
- myenv/Scripts/coverage.exe is excluded by !**/*.exe
- myenv/Scripts/coverage3.exe is excluded by !**/*.exe
- myenv/Scripts/dmypy.exe is excluded by !**/*.exe
- myenv/Scripts/mypy.exe is excluded by !**/*.exe
- myenv/Scripts/mypyc.exe is excluded by !**/*.exe
- myenv/Scripts/nltk.exe is excluded by !**/*.exe
- myenv/Scripts/normalizer.exe is excluded by !**/*.exe
- myenv/Scripts/pip.exe is excluded by !**/*.exe
- myenv/Scripts/pip3.12.exe is excluded by !**/*.exe
- myenv/Scripts/pip3.exe is excluded by !**/*.exe
- myenv/Scripts/py.test.exe is excluded by !**/*.exe
- myenv/Scripts/pytest.exe is excluded by !**/*.exe
- myenv/Scripts/ruff.exe is excluded by !**/*.exe
- myenv/Scripts/safety.exe is excluded by !**/*.exe
- myenv/Scripts/stubgen.exe is excluded by !**/*.exe
- myenv/Scripts/stubtest.exe is excluded by !**/*.exe
- myenv/Scripts/typer.exe is excluded by !**/*.exe
📒 Files selected for processing (6)
- .gitignore
- cortex/cli.py
- cortex/voice.py
- myenv/share/man/man1/bandit.1
- tests/test_ollama_integration.py
- tests/test_voice.py
✅ Files skipped from review due to trivial changes (1)
- myenv/share/man/man1/bandit.1
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_voice.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
- tests/test_ollama_integration.py
- cortex/voice.py
- cortex/cli.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/test_ollama_integration.py
🧬 Code graph analysis (3)
tests/test_ollama_integration.py (1)
scripts/setup_ollama.py (1)
check_ollama_installed(75-77)
cortex/voice.py (1)
cortex/branding.py (1)
cx_print(52-82)
cortex/cli.py (1)
cortex/voice.py (4)
- VoiceInputError (21-24)
- VoiceInputHandler (39-499)
- start_voice_mode (401-436)
- record_single (438-479)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test (Python 3.12)
- GitHub Check: Test (Python 3.10)
- GitHub Check: Test (Python 3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.10)
- GitHub Check: test (3.11)
🔇 Additional comments (10)
tests/test_ollama_integration.py (1)
15-15: LGTM: Clean refactor to `shutil.which`.
Replacing the subprocess-based Ollama detection with `shutil.which` is a cleaner, more direct approach that aligns with the pattern used in `scripts/setup_ollama.py` (lines 74-76).
Also applies to: 29-30, 37-37
cortex/voice.py (5)
1-36: LGTM: Clean module structure and exception hierarchy.
Module docstring clearly describes the feature, and the exception hierarchy provides appropriate granularity for error handling.
39-86: LGTM: Well-structured initialization with environment-driven configuration.
Good use of lazy loading for the model and environment variable overrides. The type hints and docstring meet coding guidelines.
87-270: LGTM: Robust dependency checks and transcription pipeline.
The dependency validation provides helpful user guidance, and the transcription implementation uses appropriate parameters (beam_size=5, VAD filtering, no_speech_threshold=0.6).
272-436: LGTM: Thread-safe hotkey handling with good UX.
The implementation properly uses threading primitives (locks, events) and provides visual feedback during recording. KeyboardInterrupt handling ensures clean exit.
502-521: LGTM: Clean factory function with proper type hints.
The factory function follows Python conventions and provides a clean entry point for creating voice handlers.
cortex/cli.py (4)
1633-1633: LGTM: Help documentation clearly advertises voice features.
The help table updates appropriately document the new voice input capabilities for users.
Also applies to: 1638-1638
1722-1739: LGTM: Well-structured argument parsing for voice features.
The `--mic` and `--single` flags are appropriately configured, and making the question argument optional enables the voice input flow.
1743-1755: LGTM: Consistent --mic flag integration.
The install command's `--mic` flag follows the same pattern as the ask command, maintaining consistency across the CLI.
1989-1998: LGTM: Clean command routing for voice features.
The routing logic appropriately delegates to the voice method for both explicit voice commands and `--mic` flags, with helpful error messages.
myenv/
venv312/
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check for committed virtual environment files in the current HEAD and recent commits
echo "=== Checking git log for myenv and venv312 directories ==="
git log --all --oneline --full-history -- myenv/ venv312/ 2>&1 | head -20
echo -e "\n=== Checking git status ==="
git status --short | grep -E '(myenv|venv312)' || echo "No myenv/venv312 in current status"
echo -e "\n=== Checking HEAD commit for these files ==="
git ls-tree -r HEAD | grep -E '(myenv|venv312)' || echo "No myenv/venv312 files in HEAD"
echo -e "\n=== Checking recent commits (last 5) ==="
git log --oneline -5
echo -e "\n=== Checking if myenv and venv312 are untracked/ignored ==="
git check-ignore -v myenv/ venv312/ 2>&1 || echo "Checking ignore patterns..."
echo -e "\n=== Listing actual directories in repository root ==="
ls -la | grep -E '(myenv|venv|env)' || echo "No venv directories found in root"
Repository: cortexlinux/cortex
Length of output: 5200
Critical: Virtual environment files committed to git; .gitignore change is incomplete.
The myenv/ directory with virtual environment artifacts (activation scripts, pyvenv.cfg, executables, man pages) is already committed in this PR. Adding myenv/ and venv312/ to .gitignore prevents future commits but does not remove these files from git history.
These directories must be removed from the commit before merging. Use git rm -r --cached myenv/ venv312/ and amend the commit, or rebase to exclude them entirely. Without this cleanup, the repository will include unnecessary virtual environment files that violate the project's policy of never committing local development environments.
🤖 Prompt for AI Agents
In .gitignore around lines 16-17, the review notes that myenv/ and venv312/ were
committed despite being ignored; remove those directories from the repository
index and rewrite the offending commit(s): run git rm -r --cached myenv/
venv312/ to untrack them, commit the removal (or git commit --amend if you want
to fix the latest commit), and if the directories exist in earlier commits use
an interactive rebase or filter-branch/BFG to purge them from history; ensure
.gitignore retains myenv/ and venv312/ so they are not re-added, then push with
force-if-rewriting-history and notify reviewers.
- Remove unused 'provider' variable in cli.py
- Add logging to except block in voice.py stop()
- Remove unused 'threading' import in test_voice.py
- Improve test_ensure_dependencies_missing test
- Fix test_transcribe_loads_model_if_needed to test lazy loading
- Add VoiceInputError handling to install --mic path
- Remove optional voice deps from requirements.txt (use pyproject.toml)
Actionable comments posted: 0
🧹 Nitpick comments (3)
cortex/voice.py (2)
185-189: Consider using logging for audio status messages.
Line 187 prints audio status directly to stderr. For consistency with the rest of the module (which uses `cx_print` and `logging.debug`), consider using `logging.debug()` instead.
🔎 Proposed refactor
     def audio_callback(indata, frames, time_info, status):
         if status:
    -        print(f"Audio status: {status}", file=sys.stderr)
    +        logging.debug("Audio status: %s", status)
         if self._is_recording:
             self._audio_buffer.append(indata.copy())
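The refactored callback can be exercised without an audio device. In this sketch `RecorderSketch` is a hypothetical class that mirrors only the callback's shape from the diff above, and plain lists stand in for the audio arrays a real stream would deliver.

```python
import logging

logger = logging.getLogger("voice_audio_sketch")


class RecorderSketch:
    """Hypothetical holder for the callback state shown in the diff."""

    def __init__(self) -> None:
        self._is_recording = False
        self._audio_buffer: list = []

    def audio_callback(self, indata, frames, time_info, status) -> None:
        if status:
            # Lazy %-formatting: the message is only built if DEBUG is enabled.
            logger.debug("Audio status: %s", status)
        if self._is_recording:
            # Copy, because a real stream may reuse the underlying buffer.
            self._audio_buffer.append(indata.copy())


recorder = RecorderSketch()
recorder.audio_callback([0.5], 1, None, "input overflow")  # not recording: dropped
recorder._is_recording = True
recorder.audio_callback([0.1, 0.2], 2, None, None)
print(recorder._audio_buffer)
```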
238-250: Redundant model check after lazy loading.
Lines 248-250 check if `self._model is None` after calling `_load_model()` on line 239. However, `_load_model()` either sets `self._model` or raises `ModelNotFoundError`, so the model can never be `None` at line 249. This check is dead code.
🔎 Proposed refactor
     if self._model is None:
         self._load_model()
     if len(audio_data) == 0:
         return ""
     # faster-whisper expects float32 audio normalized to [-1, 1]
     if audio_data.dtype != np.float32:
         audio_data = audio_data.astype(np.float32)
    -# Model should be loaded at this point
    -if self._model is None:
    -    raise ModelNotFoundError("Model must be loaded before transcription")
    -
     segments, info = self._model.transcribe(
docs/VOICE_INPUT.md (1)
66-66: Consider adding language identifiers to code blocks.
The fenced code blocks at lines 66 and 146 are missing language identifiers. While not critical, adding them improves syntax highlighting and documentation quality.
🔎 Suggested additions
For line 66 (example session output):
    -```
    +```console
     $ cortex voice
For line 146 (ASCII diagram):
    -```
    +```text
     ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
Also applies to: 146-157
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- cortex/cli.py
- cortex/voice.py
- docs/VOICE_INPUT.md
- requirements.txt
- tests/test_voice.py
🚧 Files skipped from review as they are similar to previous changes (2)
- requirements.txt
- tests/test_voice.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
- cortex/cli.py
- cortex/voice.py
🧬 Code graph analysis (2)
cortex/cli.py (3)
cortex/voice.py (4)
- VoiceInputError (21-24)
- VoiceInputHandler (39-499)
- start_voice_mode (401-436)
- record_single (438-479)
cortex/first_run_wizard.py (1)
- _print_error (746-748)
cortex/branding.py (1)
- cx_print (52-82)
cortex/voice.py (1)
cortex/branding.py (1)
cx_print(52-82)
🪛 LanguageTool
docs/VOICE_INPUT.md
[grammar] ~141-~141: Ensure spelling is correct
Context: ...apture** - Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribe...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~141-~141: Ensure spelling is correct
Context: ... Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribes using `fa...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.18.1)
docs/VOICE_INPUT.md
66-66: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
146-146: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (10)
cortex/voice.py (3)
339-383: LGTM! Thread-safe hotkey handling.
The hotkey setup properly uses a lock for thread safety and correctly sets the recording flag before starting threads. The logic handles both press-to-start and press-to-stop flows correctly.
385-436: LGTM! Proper error handling and resource cleanup.
The recording worker and continuous voice mode implementation correctly handle errors and ensure resources are cleaned up via the `finally` block.
481-499: LGTM! Exception handling improved.
The cleanup logic properly handles exceptions during shutdown and logs them for debugging. This addresses the past review comment about silent exception handling.
docs/VOICE_INPUT.md (2)
49-49: LGTM! Model size documentation is now consistent.
The documentation correctly states that the default `base.en` model is ~150MB, which is consistent with the table at line 117. This addresses the past review comment about the documentation inconsistency.
1-261: LGTM! Comprehensive and well-structured documentation.
The documentation provides excellent coverage of the voice input feature, including installation, usage, configuration, troubleshooting, and API reference. The structure is clear and user-friendly.
cortex/cli.py (5)
540-625: LGTM! Well-designed voice input integration.
The `voice()` method properly handles both continuous and single-shot modes, includes user confirmation for installations, and has comprehensive error handling for missing dependencies and voice input errors.
1989-1996: LGTM! Clean integration of voice input with ask command.
The `--mic` flag integration properly routes to the voice handler and provides clear error messages when neither a question nor the mic flag is provided.
1998-2027: LGTM! VoiceInputError handling properly implemented.
The install command's `--mic` integration now correctly imports and catches `VoiceInputError` (lines 2001, 2014-2016), addressing the past review comment about missing error handling. The implementation provides clear error messages and proper fallback behavior.
1728-1753: LGTM! Parser configuration is correct.
The new voice command and `--mic` flag integrations are properly configured with clear help text and appropriate argument handling. Making `software` optional (line 1741) correctly supports the `--mic` workflow.
1987-1988: LGTM! Voice command routing is correct.
The routing logic correctly maps the `--single` flag to the `continuous` parameter by negating it, so that by default (no flag) continuous mode is active, and with `--single` it switches to single-shot mode.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @cortex/cli.py:
- Around line 2656-2674: VoiceInputHandler is instantiated in the --mic install
path but handler.stop() isn't guaranteed to run if an exception occurs; wrap
creation and usage of VoiceInputHandler (the handler = VoiceInputHandler();
software = handler.record_single(); cx_print(...) flow) in a try/finally (or use
a context manager if available) and call handler.stop() in the finally block so
resources are always cleaned up even on exceptions (ensure you still catch
ImportError and VoiceInputError as before and only return after stop() is
called).
In @requirements.txt:
- Around line 22-24: Remove the duplicated dependency and stray comment by
deleting the repeated "# Configuration" comment and the duplicate
"PyYAML>=6.0.0" entry (the second occurrence) so only the original comment and
single "PyYAML>=6.0.0" remain in requirements.txt; ensure no other duplicate
package lines exist to avoid being listed twice by setup.py.
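The `cortex/cli.py` item above asks that `handler.stop()` always run. A minimal sketch of that try/finally shape follows; `FakeVoiceHandler` and `capture_software_name` are hypothetical stand-ins for illustration, not the real `VoiceInputHandler` or CLI code.

```python
from __future__ import annotations


class VoiceInputError(Exception):
    """Stand-in for cortex.voice.VoiceInputError (assumed to be a plain Exception subclass)."""


class FakeVoiceHandler:
    """Hypothetical stand-in for VoiceInputHandler that records whether stop() ran."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.stopped = False

    def record_single(self) -> str:
        if self.fail:
            raise VoiceInputError("microphone unavailable")
        return "nginx"

    def stop(self) -> None:
        self.stopped = True


def capture_software_name(handler: FakeVoiceHandler) -> str | None:
    """Return the transcribed name, or None on a voice error; always stop the handler."""
    try:
        return handler.record_single()
    except VoiceInputError as exc:
        print(f"Voice input error: {exc}")
        return None
    finally:
        # Runs on success, on the handled error, and on any unexpected exception.
        handler.stop()
```

Because the `return` statements sit inside the `try`, the `finally` clause still executes before the function actually returns, which is the "only return after stop() is called" behaviour the comment requests.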
🧹 Nitpick comments (3)
cortex/voice.py (2)
347-360: Consider terminal width for line clearing. The hardcoded 70 spaces (line 360) may not fully clear the line on wider terminals or may wrap on narrower ones. This is a minor cosmetic issue.
♻️ Optional improvement
  # Clear the line
- console.print(" " * 70, end="\r")
+ import shutil
+ cols = shutil.get_terminal_size().columns
+ console.print(" " * min(cols - 1, 70), end="\r")
362-388: Consider adding return type hint. The `_get_hotkey_key` method could benefit from a return type annotation for clarity, though it's a private method.
♻️ Optional improvement
- def _get_hotkey_key(self):
+ def _get_hotkey_key(self) -> Any:
      """Get the pynput key object for the configured hotkey."""
Or more precisely with the actual type when pynput is available.
tests/test_cli_extended.py (1)
43-52: LGTM! The added `Path.cwd` patch aligns with Ollama local key resolution. The change correctly patches both `Path.home` and `Path.cwd` for this specific test case where environment variables are cleared and detection falls through to interactive mode. Other tests (`test_get_api_key_openai`, `test_get_api_key_claude`) don't need the `Path.cwd` patch since they set environment variables that short-circuit before reaching that code path.
Optional: Consider using `contextlib.ExitStack` or combining patches to reduce nesting depth (4 levels here), though this is consistent with the file's existing style.
♻️ Optional: Flatten nested context managers
  def test_get_api_key_not_found(self) -> None:
      # When no API key is set and user selects Ollama, falls back to Ollama local mode
      from cortex.api_key_detector import PROVIDER_MENU_CHOICES

-     with patch.dict(os.environ, {}, clear=True):
-         with patch("pathlib.Path.home", return_value=self._temp_home):
-             with patch("pathlib.Path.cwd", return_value=self._temp_home):
-                 with patch("builtins.input", return_value=PROVIDER_MENU_CHOICES["ollama"]):
-                     api_key = self.cli._get_api_key()
-                     self.assertEqual(api_key, "ollama-local")
+     with (
+         patch.dict(os.environ, {}, clear=True),
+         patch("pathlib.Path.home", return_value=self._temp_home),
+         patch("pathlib.Path.cwd", return_value=self._temp_home),
+         patch("builtins.input", return_value=PROVIDER_MENU_CHOICES["ollama"]),
+     ):
+         api_key = self.cli._get_api_key()
+         self.assertEqual(api_key, "ollama-local")
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- .gitignore
- cortex/cli.py
- cortex/voice.py
- requirements.txt
- setup.py
- tests/integration/test_end_to_end.py
- tests/test_api_key_detector.py
- tests/test_cli.py
- tests/test_cli_extended.py
- tests/test_ollama_integration.py
- tests/test_voice.py
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/test_voice.py
- .gitignore
- tests/test_ollama_integration.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
- tests/test_api_key_detector.py
- tests/test_cli_extended.py
- cortex/cli.py
- tests/test_cli.py
- setup.py
- tests/integration/test_end_to_end.py
- cortex/voice.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
- tests/test_api_key_detector.py
- tests/test_cli_extended.py
- tests/test_cli.py
- tests/integration/test_end_to_end.py
{setup.py,setup.cfg,pyproject.toml,**/__init__.py}
📄 CodeRabbit inference engine (AGENTS.md)
Use Python 3.10 or higher as the minimum supported version
Files:
setup.py
🧬 Code graph analysis (4)
tests/test_api_key_detector.py (1)
cortex/api_key_detector.py (1)
detect(77-123)
tests/test_cli_extended.py (1)
cortex/cli.py (1)
_get_api_key(130-152)
tests/test_cli.py (1)
cortex/cli.py (1)
_get_api_key(130-152)
cortex/voice.py (1)
cortex/branding.py (1)
cx_print(52-82)
🔇 Additional comments (14)
requirements.txt (1)
26-40: Informational comment block looks good. The documentation for the optional voice feature dependencies and system requirements is helpful for developers who want to enable voice input.
tests/integration/test_end_to_end.py (1)
20-22: LGTM! The `PIP_BOOTSTRAP_DEV` constant is well-defined and appropriately used in `test_project_tests_run_inside_container` to install dev dependencies needed for pytest.
cortex/cli.py (3)
637-743: Voice method implementation is well-structured. The `voice()` method properly handles:
- Dependency import failures with helpful error messages
- API key validation
- Continuous vs single-shot modes
- Input validation for install commands (length, dangerous characters)
- Resource cleanup in the `finally` block
744-901: Refactored install helpers improve maintainability. The extraction of `_handle_parallel_execution`, `_handle_sequential_execution`, and related utilities from the `install()` method reduces complexity and improves readability. Error handling is consistent throughout.
2253-2286: Voice CLI integration is well-designed. The `voice` subcommand with `--single` flag and `--mic` flags on `ask`/`install` provide flexible voice input options. Help text clearly describes the functionality.
cortex/voice.py (6)
19-34: Exception hierarchy is well-designed. The exception classes provide clear, specific error types for voice input failures with appropriate inheritance from `VoiceInputError`.
49-83: Initialization is thorough with sensible defaults. Environment variable support for `CORTEX_WHISPER_MODEL` provides flexibility. Threading primitives are properly initialized.
206-273: Recording implementation handles resources correctly. Good practices observed:
- Memory protection with ~60 second buffer limit (line 221)
- Proper stream cleanup in `_stop_recording_stream`
- Setting `_is_recording = False` before stopping stream prevents race conditions
- Buffer cleared in `finally` block to prevent memory leaks
275-321: Transcription method is well-configured. The VAD (Voice Activity Detection) parameters and `condition_on_previous_text=False` setting are appropriate for command-style speech. Lazy model loading improves startup time.
390-434: Hotkey handling uses proper synchronization. The `recording_lock` correctly serializes access to the recording state, preventing race conditions when the hotkey is pressed rapidly. The daemon threads ensure cleanup on program exit.
532-550: Cleanup method is robust. The `stop()` method properly handles multiple cleanup scenarios with appropriate exception handling, preventing errors from leaking if resources are already cleaned up.
tests/test_cli.py (1)
46-49: Test isolation improvement is correct. Adding the `pathlib.Path.cwd` patch ensures the test doesn't accidentally find `.env` files in the actual current working directory, improving test reliability and isolation.
tests/test_api_key_detector.py (1)
162-166: Test isolation fix is consistent with other test changes. Patching `pathlib.Path.cwd` alongside `Path.home` ensures complete isolation from the real filesystem, preventing false positives if the test runner's directory contains `.env` files.
setup.py (1)
48-55: Voice extras definition is correct and properly synced across configuration files. The optional `voice` dependency group is properly structured with appropriate version constraints in both setup.py and pyproject.toml, and is correctly included in the `all` extra for comprehensive dependency installation.
Actionable comments posted: 0
🧹 Nitpick comments (4)
cortex/cli.py (4)
679-683: Consider expanding dangerous character validation. The current list covers common shell metacharacters but misses some injection vectors like newlines (`\n`, `\r`), backslashes (`\`), and quotes (`'`, `"`). For defense-in-depth, consider using an allowlist pattern instead.
♻️ Suggested improvement
- # Check for dangerous characters that shouldn't be in package names
- dangerous_chars = [";", "&", "|", "`", "$", "(", ")"]
- if any(char in software for char in dangerous_chars):
-     cx_print("Invalid characters detected in software name", "error")
-     return
+ # Check for dangerous characters that shouldn't be in package names
+ dangerous_chars = [";", "&", "|", "`", "$", "(", ")", "\n", "\r", "\\", "'", '"']
+ if any(char in software for char in dangerous_chars):
+     cx_print("Invalid characters detected in software name", "error")
+     return
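The comment mentions an allowlist as the stronger alternative, while the suggested diff only extends the blocklist. A rough sketch of the allowlist idea follows; the permitted character set, the 200-character cap, and the name `is_safe_software_name` are assumptions for illustration, not values taken from the PR.

```python
import re

# Assumed allowlist: starts with a letter or digit, then letters, digits,
# spaces, and a few punctuation marks common in package names.
# Anything else (shell metacharacters, quotes, newlines) is rejected by default.
_ALLOWED_SOFTWARE_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9 .+_\-]{0,199}")


def is_safe_software_name(software: str) -> bool:
    """Return True only if the whole string consists of allowlisted characters."""
    return _ALLOWED_SOFTWARE_RE.fullmatch(software) is not None
```

With an allowlist, newly discovered dangerous characters are rejected without having to be enumerated, which is why it tends to age better than a growing blocklist.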
695-706: Consider using `console.input` for consistent styling. The prompt uses `input()` while other parts of the CLI use `console.input()` from Rich for styled prompts. This inconsistency may result in visual differences.
♻️ Optional: Use Rich console for consistent styling
  try:
-     choice = input("Enter choice [1/2/3]: ").strip()
+     choice = console.input("[bold cyan]Enter choice [1/2/3]: [/bold cyan]").strip()
744-753: Hardcoded special case is a code smell. This normalization logic embeds a specific package combination directly in code. Consider moving such mappings to a configuration file or the existing `stacks.json` for maintainability.
846-849: Fragile status comparison using string value. Using `getattr(t.status, "value", "") == "failed"` is fragile. If `t.status` is an enum, compare directly against the enum member for type safety.
♻️ Suggested improvement
  def _get_parallel_error_msg(self, parallel_tasks: list) -> str:
      """Extract error message from failed parallel tasks."""
-     failed_tasks = [t for t in parallel_tasks if getattr(t.status, "value", "") == "failed"]
+     from cortex.coordinator import StepStatus
+     failed_tasks = [t for t in parallel_tasks if t.status == StepStatus.FAILED]
      return failed_tasks[0].error if failed_tasks else self.INSTALL_FAIL_MSG
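As a self-contained illustration of the enum comparison, here is a small sketch. The `StepStatus` members, the `Task` dataclass, and the text of `INSTALL_FAIL_MSG` are hypothetical stand-ins; the real definitions live in the Cortex codebase and may differ.

```python
from dataclasses import dataclass
from enum import Enum


class StepStatus(Enum):
    """Hypothetical stand-in for cortex.coordinator.StepStatus."""

    SUCCESS = "success"
    FAILED = "failed"


@dataclass
class Task:
    """Hypothetical task record carrying a status and an error message."""

    name: str
    status: StepStatus
    error: str = ""


INSTALL_FAIL_MSG = "Installation failed"  # assumed wording


def get_parallel_error_msg(tasks: list[Task]) -> str:
    """Return the first failed task's error, or the generic failure message."""
    # Comparing against the enum member keeps working if the member's
    # string value is ever renamed.
    failed = [t for t in tasks if t.status is StepStatus.FAILED]
    return failed[0].error if failed else INSTALL_FAIL_MSG
```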
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- cortex/cli.py
- requirements.txt
🚧 Files skipped from review as they are similar to previous changes (1)
- requirements.txt
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/cli.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test (Python 3.10)
- GitHub Check: Test (Python 3.12)
- GitHub Check: Test (Python 3.11)
- GitHub Check: test (3.11)
- GitHub Check: test (3.10)
- GitHub Check: test (3.12)
🔇 Additional comments (11)
cortex/cli.py (11)
10-10: LGTM! Import is correctly placed with other cortex module imports.
39-41: LGTM! Good extraction of a repeated string literal into a class constant for maintainability.
154-175: LGTM! Comment clarifications improve code readability without changing functionality.
712-742: LGTM! Proper resource cleanup pattern with `handler = None` initialization before try block and cleanup in finally. The nested try/except in the finally block correctly prevents cleanup errors from masking the original exception.
765-827: LGTM! Good error handling with specific exception types (ValueError, OSError) and a catch-all for unexpected errors. The parallel execution logic correctly propagates errors to history.
851-901: LGTM! Sequential execution handler is well-structured with proper progress callbacks and comprehensive error recording.
903-978: LGTM! Good refactoring of the install method. The extraction of parallel and sequential execution into helper methods improves readability and maintainability. Return type annotation adds clarity.
2146-2152: LGTM! Help table appropriately documents the new voice input features with clear descriptions.
2253-2270: LGTM! Voice command parser is well-designed with `--single` flag providing flexibility between continuous and single-shot modes. The optional question argument properly supports the `--mic` flag workflow.
2656-2694: LGTM! The `--mic` flag handling for install is correctly structured:
- Proper try/finally for resource cleanup
- Early return on voice detection failure
- Variable `software` is correctly set in both branches before use at line 2689
2078-2078: LGTM! Consistent use of the `INSTALL_FAIL_MSG` class constant across error handlers.
|
Added a model selection mode so users can choose the model they want for their system, and clearly stated that a smaller model means lower speech-to-text accuracy.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @cortex/cli.py:
- Around line 761-770: The method _normalize_software_name currently returns a
raw shell command for the exact normalized input "pytorch-cpu jupyter numpy
pandas", which bypasses LLM-based interpretation and can inject unsafe operators
like &&; remove this hardcoded branch so _normalize_software_name always returns
a natural-language software description (or a canonical token), and instead
implement any predefined install stacks inside the LLM/system that generates
commands (not in this normalizer). Update callers (e.g., where
interpreter.parse() is invoked) to feed the natural-language description and, if
you need a safe predefined stack, add a separate mapping function or
config-driven preset that produces validated, per-command entries which then
pass through validate_install_request() and _validate_commands() before
execution.
In @README.md:
- Line 67: Update the broken documentation link in the README: replace the
reference to docs/VOICE_MODELS.md with the correct target docs/VOICE_INPUT.md
for the "Voice Input" entry (the table row containing "**Voice Input** |
Hands-free mode with Whisper speech recognition ([F9 to speak](...))") so the
link points to the existing voice documentation file.
🧹 Nitpick comments (6)
docs/COMMANDS.md (1)
11-11: Missing detailed documentation section for `cortex voice` command. The Quick Reference entry is added, but unlike other commands (`cortex install`, `cortex stack`, etc.), there's no corresponding detailed section in the Commands documentation below. Consider adding a section with:
- Usage examples (`cortex voice`, `cortex voice --single`, `--model` options)
- Options table (`--single`, `--model`, `--mic` flag for install/ask)
- Environment variables (`CORTEX_WHISPER_MODEL`, `CORTEX_VOICE_HOTKEY`)
- Notes about dependencies (`pip install cortex-linux[voice]`)
This aligns with the existing documentation pattern and the detailed docs in `docs/VOICE_INPUT.md`.
cortex/voice.py (4)
162-174: Progress bar provides misleading feedback. The progress bar is created with `total=None` (indeterminate) but the actual model download happens inside the `WhisperModel()` constructor without progress callbacks. The bar doesn't reflect real download progress.
Consider either:
- Removing the progress bar wrapper since it doesn't track actual progress
- Using faster-whisper's download callbacks if available
- Simply showing a spinner or "Downloading..." message instead
♻️ Simpler approach without misleading progress
- # Show download progress with progress bar
- from rich.progress import Progress
-
- with Progress() as progress:
-     task = progress.add_task(
-         f"[cyan]Downloading {self.model_name}...",
-         total=None,
-     )
-
-     self._model = WhisperModel(
-         self.model_name,
-         device="cpu",
-         compute_type="int8",
-         download_root=self.model_dir,
-     )
-     progress.update(task, completed=True)
+ from rich.status import Status
+
+ with Status(f"[cyan]Downloading {self.model_name}...[/cyan]"):
+     self._model = WhisperModel(
+         self.model_name,
+         device="cpu",
+         compute_type="int8",
+         download_root=self.model_dir,
+     )
224-232: Consider thread safety for audio buffer access. The `audio_callback` runs in a separate thread (sounddevice's audio thread) and modifies `_audio_buffer` without synchronization. While CPython's GIL makes `list.append()` atomic, this is an implementation detail that shouldn't be relied upon.
For robustness, consider using `queue.Queue` which is explicitly thread-safe:
♻️ Thread-safe alternative
+ import queue
+
  class VoiceInputHandler:
      def __init__(self, ...):
          ...
-         self._audio_buffer: list[Any] = []
+         self._audio_buffer: queue.Queue = queue.Queue(maxsize=937)  # ~60 sec
          ...

      def audio_callback(indata, frames, time_info, status):
          if status:
              logging.debug("Audio status: %s", status)
          if self._is_recording:
-             if len(self._audio_buffer) < 60 * self.sample_rate // 1024:
-                 self._audio_buffer.append(indata.copy())
-             else:
+             try:
+                 self._audio_buffer.put_nowait(indata.copy())
+             except queue.Full:
                  self._stop_recording.set()
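The `maxsize=937` in the suggestion corresponds to 60 seconds of 16 kHz audio delivered in 1024-frame blocks (60 × 16000 // 1024 = 937). A minimal runnable sketch of the bounded-queue pattern follows; `BoundedAudioBuffer` and the 1024-frame block size are illustrative assumptions, and plain `bytes` stand in for the NumPy blocks that sounddevice would deliver.

```python
from __future__ import annotations

import queue

SAMPLE_RATE = 16_000   # Hz, the handler's default sample rate
BLOCK_SIZE = 1024      # frames per callback (assumed)
MAX_SECONDS = 60
MAX_BLOCKS = MAX_SECONDS * SAMPLE_RATE // BLOCK_SIZE  # 937


class BoundedAudioBuffer:
    """Hypothetical thread-safe buffer that refuses blocks once the cap is reached."""

    def __init__(self, max_blocks: int = MAX_BLOCKS) -> None:
        self._queue: queue.Queue[bytes] = queue.Queue(maxsize=max_blocks)
        self.overflowed = False

    def push(self, block: bytes) -> bool:
        """Called from the audio thread; never blocks."""
        try:
            self._queue.put_nowait(block)
            return True
        except queue.Full:
            # The real handler would signal the recorder to stop here.
            self.overflowed = True
            return False

    def drain(self) -> list[bytes]:
        """Called from the consumer thread; returns all queued blocks in order."""
        blocks: list[bytes] = []
        while True:
            try:
                blocks.append(self._queue.get_nowait())
            except queue.Empty:
                return blocks
```

Note that code which later concatenates the buffer would need to drain the queue first, since a `queue.Queue` cannot be iterated like a list.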
311-322: Language is hardcoded to English. The `language="en"` parameter is hardcoded, but the module supports multilingual models (tiny, base, small, medium, large without `.en` suffix). Users selecting a multilingual model might expect language auto-detection or the ability to specify a different language.
Consider making this configurable:
♻️ Add language configuration
  def __init__(
      self,
      model_name: str | None = None,
      sample_rate: int = 16000,
      hotkey: str = "f9",
      model_dir: str | None = None,
+     language: str | None = "en",
  ):
+     self.language = language  # None for auto-detection
      ...

  # In transcribe():
  segments, _ = self._model.transcribe(
      audio_data,
      beam_size=5,
-     language="en",
+     language=self.language,
      ...
  )
540-558: `stop()` doesn't wait for recording thread to terminate. The method signals stop and cleans up resources, but doesn't `join()` the `_recording_thread`. This could cause issues if the caller immediately tries to start a new recording or if cleanup timing matters.
♻️ Wait for thread completion
  def stop(self) -> None:
      """Stop the voice input handler and clean up resources."""
      self._is_recording = False
      self._stop_recording.set()

+     # Wait for recording thread to finish
+     if self._recording_thread is not None:
+         try:
+             self._recording_thread.join(timeout=2.0)
+         except Exception as e:
+             logging.debug("Error joining recording thread: %s", e)
+         self._recording_thread = None
+
      if self._hotkey_listener:
          ...
cortex/cli.py (1)
863-866: String comparison for task status may be fragile. The code compares `getattr(t.status, "value", "") == "failed"`. This assumes the status enum's value is the string "failed". Consider using the enum directly if `StepStatus.FAILED` is available:
failed_tasks = [t for t in parallel_tasks if t.status == StepStatus.FAILED]
This would be more robust against enum value changes.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- README.md
- cortex/cli.py
- cortex/voice.py
- docs/COMMANDS.md
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
- cortex/voice.py
- cortex/cli.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Documentation required for new features
Applied to files:
README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test (Python 3.11)
- GitHub Check: test (3.11)
- GitHub Check: test (3.10)
- GitHub Check: test (3.12)
🔇 Additional comments (10)
cortex/voice.py (5)
19-34: LGTM! Good exception hierarchy with clear, descriptive names. The base `VoiceInputError` allows callers to catch all voice-related errors uniformly.
49-84: LGTM! Well-structured initialization with proper type hints, environment variable support for configuration, and clear separation of recording state variables. The lazy loading pattern for the Whisper model is appropriate.
355-368: LGTM! The recording indicator provides good visual feedback. The animated dots give clear indication that recording is active.
415-438: LGTM! Good thread synchronization with `recording_lock`. The pattern of setting `_is_recording = True` before spawning threads, with cleanup in the finally block, correctly handles edge cases.
561-580: LGTM! Clean factory function with proper type hints and documentation. Provides a simple entry point for most use cases.
cortex/cli.py (5)
653-655: API key check happens before voice capture begins. The API key is validated before capturing any voice input. If the user doesn't have an API key configured, they'll only find out after invoking the command, not after speaking. Consider deferring the check or providing a clearer message that voice mode requires API configuration for processing commands.
However, this is arguably correct behavior - failing fast before the user records anything if the subsequent action will fail anyway.
692-700: Good input validation, consider expanding dangerous character list. The validation catches common shell metacharacters. Consider also blocking:
- Newlines (`\n`, `\r`) - could inject additional commands
- Backslash (`\`) - escape sequences
- Redirection (`<`, `>`) - file operations
However, since `validate_install_request` in the install path provides additional validation, this is defense-in-depth.
2691-2728: LGTM - Voice input handling for install command. The inline VoiceInputHandler usage for `--mic` flag is appropriate since it only needs to capture the software name, not process the full voice command flow. The try/except/finally pattern ensures proper cleanup.
Minor duplication of import error handling with `cli.voice()` is acceptable given the different use cases.
2682-2688: LGTM! Good reuse of the `voice()` method for `ask --mic`. The voice command's internal routing will correctly treat non-install phrases as questions.
2279-2304: LGTM! Comprehensive argument parser for the voice command with appropriate model choices matching the VoiceInputHandler's supported models.
Anshgrover23
left a comment
@Sahilbhatane A demonstration video and the AI usage details are missing; also, kindly resolve all CodeRabbit comments.
|
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
README.md (1)
64-75: README hotkey text should reflect configurability (default F9). The feature row says “F9 to speak”, but the PR supports a configurable hotkey (via env var). Consider wording like “Default F9 (configurable)” to avoid stale docs. Based on learnings, documentation is required for new features.
Proposed tweak
-| **Voice Input** | Hands-free mode with Whisper speech recognition ([F9 to speak](docs/VOICE_INPUT.md)) |
+| **Voice Input** | Hands-free mode with Whisper speech recognition (default F9; configurable — see [Voice Input](docs/VOICE_INPUT.md)) |
cortex/api_key_detector.py (1)
125-147: Prevent “explicit provider” from returning a mismatched key (can break auth). With `CORTEX_PROVIDER` explicitly set, these functions can still fall back to the other provider’s key. In the current CLI flow, that can pair an OpenAI provider with an Anthropic key (or vice versa), causing confusing runtime failures.
Proposed fix (make explicit provider strict in these checks)
  def _check_environment_api_keys(self) -> tuple[bool, str, str, str] | None:
@@
      explicit_provider = os.environ.get("CORTEX_PROVIDER", "").lower()
@@
      if explicit_provider in ["openai", "claude"]:
@@
          value = os.environ.get(target_env_var)
          if value:
              return (True, value, target_provider, "environment")
+         # Explicit provider set but corresponding key missing: do not fall back to other providers.
+         return None
@@
      # Otherwise check all providers in default order
      for env_var, provider in ENV_VAR_PROVIDERS.items():
          value = os.environ.get(env_var)
          if value:
              return (True, value, provider, "environment")
      return None

  def _check_encrypted_storage(self) -> tuple[bool, str, str, str] | None:
@@
      explicit_provider = os.environ.get("CORTEX_PROVIDER", "").lower()
      if explicit_provider in ["openai", "claude"]:
@@
          value = env_mgr.get_variable(app="cortex", key=target_env_var, decrypt=True)
          if value:
@@
              return (
                  True,
                  value,
                  target_provider,
                  "encrypted storage (~/.cortex/environments/)",
              )
+         # Explicit provider set but corresponding key missing: do not fall back to other providers.
+         return None
@@
      # Check for API keys in encrypted storage (default order)
      for env_var, provider in ENV_VAR_PROVIDERS.items():
          value = env_mgr.get_variable(app="cortex", key=env_var, decrypt=True)
          if value:
              os.environ[env_var] = value
              logger.debug(f"Loaded {env_var} from encrypted storage")
              return (True, value, provider, "encrypted storage (~/.cortex/environments/)")
Also applies to: 160-186
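The strict-lookup behaviour can be illustrated in isolation. The sketch below assumes a simplified provider-to-variable mapping and a hypothetical `find_environment_key` helper; the real detector returns a four-element tuple and uses `ENV_VAR_PROVIDERS`, whose exact contents are not shown in this review.

```python
from __future__ import annotations

from collections.abc import Mapping

# Assumed mapping from provider name to its API key variable, in default search order.
PROVIDER_ENV_VARS: dict[str, str] = {
    "claude": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def find_environment_key(env: Mapping[str, str]) -> tuple[str, str] | None:
    """Return (provider, key), or None if no suitable key is present."""
    explicit = env.get("CORTEX_PROVIDER", "").lower()
    if explicit in PROVIDER_ENV_VARS:
        value = env.get(PROVIDER_ENV_VARS[explicit])
        # Strict: an explicit provider never falls back to another provider's key.
        return (explicit, value) if value else None
    # No explicit provider: take the first key found in the default order.
    for provider, env_var in PROVIDER_ENV_VARS.items():
        value = env.get(env_var)
        if value:
            return (provider, value)
    return None
```

Passing the environment in as a mapping, rather than reading `os.environ` directly, is only a convenience here so the behaviour can be exercised without mutating the process environment.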
🤖 Fix all issues with AI agents
In @cortex/cli.py:
- Around line 637-760: voice() currently fetches api_key once but
process_voice_command() calls self.install() and self.ask(), which internally
call _get_api_key() again so continuous mode can re-prompt; fix by capturing the
session api_key (and provider if applicable) after _get_api_key() and pass it
into install/ask via new optional parameters or thin wrappers: add
_install_with_session_key(software, api_key, provider, execute=False,
dry_run=False) and _ask_with_session_key(prompt, api_key, provider) that call
internal logic without calling _get_api_key(), update process_voice_command() to
call these wrappers (or call install/ask with the new api_key/provider args),
and ensure VoiceInputHandler usage in voice() remains unchanged so cleanup still
calls handler.stop().
🧹 Nitpick comments (1)
cortex/cli.py (1)
2161-2170: CLI surface additions for voice/mic look consistent; consider consolidating voice handler lifecycle.
- Help + argparse wiring for `voice` and `--mic` is coherent (Line 2163-2169, 2269-2320) and routing matches the flags (Line 2678-2685, 2690-2728).
- Small maintainability win: the `install --mic` route duplicates the voice-dependency import + handler lifecycle that `voice()` already owns (Line 2692-2718). Consider extracting a shared `_record_single_utterance()` helper to keep behavior consistent (model selection, error messaging, cleanup).
Also applies to: 2269-2320, 2678-2729
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- README.md
- cortex/api_key_detector.py
- cortex/cli.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
- cortex/cli.py
- cortex/api_key_detector.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Documentation required for new features
Applied to files:
README.md
🧬 Code graph analysis (2)
cortex/cli.py (4)
cortex/validators.py (1)
validate_install_request(117-144)
cortex/voice.py (4)
VoiceInputHandler(37-558)
start_voice_mode(460-495)
record_single(497-538)
stop(540-558)
cortex/branding.py (1)
cx_print(52-82)
cortex/install_parallel.py (1)
run_parallel_install(127-279)
cortex/api_key_detector.py (1)
cortex/env_manager.py (2)
get_variable(740-767)
decrypt(516-534)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build Package
- GitHub Check: test (3.11)
- GitHub Check: test (3.10)
- GitHub Check: test (3.12)
🔇 Additional comments (1)
cortex/cli.py (1)
771-917: Install flow improvements look solid; asyncio.run() usage is safe in standalone CLI context.
- Good: `validate_install_request()` gating prevents injection risk and enforces input constraints (empty/length/shell patterns) before LLM parsing.
- Good: centralized `INSTALL_FAIL_MSG` improves consistency across error paths.
- Safe: `_handle_parallel_execution()` uses `asyncio.run()` (line 802) correctly. All call sites to `install()` are strictly synchronous: main entry point is `main()` (line 2201), which is not async, and no async context invokes it. CLI is always a standalone process.
  def _get_provider(self) -> str:
-     # Check environment variable for explicit provider choice
+     # 1. Check explicit provider override FIRST (highest priority)
      explicit_provider = os.environ.get("CORTEX_PROVIDER", "").lower()
      if explicit_provider in ["ollama", "openai", "claude", "fake"]:
          self._debug(f"Using explicit CORTEX_PROVIDER={explicit_provider}")
          return explicit_provider

-     # Use provider from auto-detection (set by _get_api_key)
+     # 2. Use provider from auto-detection (set by _get_api_key)
      detected = getattr(self, "_detected_provider", None)
      if detected == "anthropic":
          return "claude"
      elif detected == "openai":
          return "openai"

-     # Check env vars (may have been set by auto-detect)
+     # 3. Check env vars (may have been set by auto-detect)
      if os.environ.get("ANTHROPIC_API_KEY"):
          return "claude"
      elif os.environ.get("OPENAI_API_KEY"):
          return "openai"

-     # Fallback to Ollama for offline mode
+     # 4. Fallback to Ollama for offline mode
      return "ollama"
Provider selection can diverge from the detected key when CORTEX_PROVIDER is set.
_get_provider() always returns explicit CORTEX_PROVIDER (Line 155-159), even if setup_api_key() ended up finding/returning the other provider’s key. This becomes a hard-to-debug auth failure (wrong SDK + wrong key).
Suggested direction
- # 1. Check explicit provider override FIRST (highest priority)
+ # 1. Check explicit provider override FIRST (highest priority),
+ # but ensure it is consistent with the detected key/provider from setup_api_key().
explicit_provider = os.environ.get("CORTEX_PROVIDER", "").lower()
if explicit_provider in ["ollama", "openai", "claude", "fake"]:
self._debug(f"Using explicit CORTEX_PROVIDER={explicit_provider}")
return explicit_provider
I’d fix this primarily in cortex/api_key_detector.py by making explicit-provider lookup strict (so it can’t return the “other” provider’s key), and add a defensive assertion/warning here if a mismatch still occurs.
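A rough sketch of the defensive check described above, written as a standalone function rather than the actual method. resolve_provider and KEY_VARS are hypothetical names, and the mapping from provider to environment variable is an assumption based on the variables read in _get_provider(). The explicit override still wins, but a mismatch with the detected key is surfaced as a warning instead of failing later as an auth error.

```python
from __future__ import annotations

import warnings
from collections.abc import Mapping

# Assumed mapping from provider name to the env var that holds its key.
KEY_VARS = {"claude": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}
VALID_PROVIDERS = {"ollama", "openai", "claude", "fake"}


def resolve_provider(env: Mapping[str, str], detected: str | None = None) -> str:
    """Pick a provider, warning when the override disagrees with the detected key."""
    # Auto-detection reports "anthropic"; the provider name used elsewhere is "claude".
    normalized = "claude" if detected == "anthropic" else detected

    # 1. Explicit override has highest priority, but flag inconsistencies.
    explicit = env.get("CORTEX_PROVIDER", "").lower()
    if explicit in VALID_PROVIDERS:
        if normalized and explicit in KEY_VARS and normalized != explicit:
            warnings.warn(
                f"CORTEX_PROVIDER={explicit} but the detected key belongs to "
                f"{normalized}; requests are likely to fail authentication.",
                stacklevel=2,
            )
        return explicit

    # 2. Provider found by auto-detection.
    if normalized in KEY_VARS:
        return normalized

    # 3. Keys present in the environment (Anthropic checked first, as in the diff).
    for provider, var in KEY_VARS.items():
        if env.get(var):
            return provider

    # 4. Offline fallback.
    return "ollama"
```

Passing the environment in as a mapping keeps the function easy to test; the real method would presumably pass os.environ and self._detected_provider.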
    def voice(self, continuous: bool = False, model: str | None = None) -> int:
        """Handle voice input mode.

        Args:
            continuous: If True, stay in voice mode until Ctrl+C.
                If False, record single input and exit.
            model: Whisper model name (e.g., 'base.en', 'small.en').
                If None, uses CORTEX_WHISPER_MODEL env var or 'base.en'.
        """
        try:
            from cortex.voice import VoiceInputError, VoiceInputHandler
        except ImportError:
            self._print_error("Voice dependencies not installed.")
            cx_print("Install with: pip install cortex-linux[voice]", "info")
            return 1

        api_key = self._get_api_key()
        if not api_key:
            return 1

        # Display model information if specified
        if model:
            model_info = {
                "tiny.en": "(39 MB, fastest, good for clear speech)",
                "base.en": "(140 MB, balanced speed/accuracy)",
                "small.en": "(466 MB, better accuracy)",
                "medium.en": "(1.5 GB, high accuracy)",
                "tiny": "(39 MB, multilingual)",
                "base": "(290 MB, multilingual)",
                "small": "(968 MB, multilingual)",
                "medium": "(3 GB, multilingual)",
                "large": "(6 GB, best accuracy, multilingual)",
            }
            cx_print(f"Using Whisper model: {model} {model_info.get(model, '')}", "info")

        def process_voice_command(text: str) -> None:
            """Process transcribed voice command."""
            if not text:
                return

            # Determine if this is an install command or a question
            text_lower = text.lower().strip()
            is_install = any(
                text_lower.startswith(word) for word in ["install", "setup", "add", "get", "put"]
            )

            if is_install:
                # Remove the command verb for install
                software = text
                for verb in ["install", "setup", "add", "get", "put"]:
                    if text_lower.startswith(verb):
                        software = text[len(verb) :].strip()
                        break

                # Validate software name
                if not software or len(software) > 200:
                    cx_print("Invalid software name", "error")
                    return

                # Check for dangerous characters that shouldn't be in package names
                dangerous_chars = [";", "&", "|", "`", "$", "(", ")"]
                if any(char in software for char in dangerous_chars):
                    cx_print("Invalid characters detected in software name", "error")
                    return

                cx_print(f"Installing: {software}", "info")

                # Ask user for confirmation
                console.print()
                console.print("[bold cyan]Choose an action:[/bold cyan]")
                console.print(" [1] Dry run (preview commands)")
                console.print(" [2] Execute (run commands)")
                console.print(" [3] Cancel")
                console.print()

                try:
                    choice = input("Enter choice [1/2/3]: ").strip()

                    if choice == "1":
                        self.install(software, execute=False, dry_run=True)
                    elif choice == "2":
                        cx_print("Executing installation...", "info")
                        self.install(software, execute=True, dry_run=False)
                    else:
                        cx_print("Cancelled.", "info")
                except (KeyboardInterrupt, EOFError):
                    cx_print("\nCancelled.", "info")
            else:
                # Treat as a question
                cx_print(f"Question: {text}", "info")
                self.ask(text)

        handler = None
        try:
            handler = VoiceInputHandler(model_name=model)

            if continuous:
                # Continuous voice mode
                handler.start_voice_mode(process_voice_command)
            else:
                # Single recording mode
                text = handler.record_single()
                if text:
                    process_voice_command(text)
                else:
                    cx_print("No speech detected.", "warning")

            return 0

        except VoiceInputError as e:
            self._print_error(str(e))
            return 1
        except KeyboardInterrupt:
            cx_print("\nVoice mode exited.", "info")
            return 0
        finally:
            # Ensure cleanup even if exceptions occur
            if handler is not None:
                try:
                    handler.stop()
                except Exception as e:
                    # Log cleanup errors but don't raise
                    logging.debug("Error during voice handler cleanup: %s", e)
Avoid re-running API key setup on every voice command (continuous mode UX + correctness).
voice() gets an API key once (Line 653-656), but process_voice_command() then calls self.install() / self.ask() (Line 716-720, 727-728), both of which call _get_api_key() again. In continuous mode this can repeatedly prompt, and can also re-trigger provider detection in the middle of a session.
Suggested refactor (keep session key/provider and reuse)
- api_key = self._get_api_key()
+ api_key = self._get_api_key()
if not api_key:
return 1
+ provider = self._get_provider()
@@
- self.install(software, execute=False, dry_run=True)
+ self._install_with_session_key(
+ software, api_key=api_key, provider=provider, execute=False, dry_run=True
+ )
elif choice == "2":
cx_print("Executing installation...", "info")
- self.install(software, execute=True, dry_run=False)
+ self._install_with_session_key(
+ software, api_key=api_key, provider=provider, execute=True, dry_run=False
+ )
@@
- self.ask(text)
+ self._ask_with_session_key(text, api_key=api_key, provider=provider)
(Where _install_with_session_key / _ask_with_session_key are thin wrappers that bypass _get_api_key().)
🤖 Prompt for AI Agents
In @cortex/cli.py around lines 637 - 760, voice() currently fetches api_key once
but process_voice_command() calls self.install() and self.ask(), which
internally call _get_api_key() again so continuous mode can re-prompt; fix by
capturing the session api_key (and provider if applicable) after _get_api_key()
and pass it into install/ask via new optional parameters or thin wrappers: add
_install_with_session_key(software, api_key, provider, execute=False,
dry_run=False) and _ask_with_session_key(prompt, api_key, provider) that call
internal logic without calling _get_api_key(), update process_voice_command() to
call these wrappers (or call install/ask with the new api_key/provider args),
and ensure VoiceInputHandler usage in voice() remains unchanged so cleanup still
calls handler.stop().
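As a rough illustration of the refactor suggested above, the sketch below uses a toy class in place of CortexCLI. SessionCLI, voice_session, the lookups counter and the hard-coded "openai" provider are all hypothetical; only the wrapper name _ask_with_session_key comes from the suggestion. The point is that the key is resolved once per session and then passed down explicitly.

```python
from __future__ import annotations

from collections.abc import Callable


class SessionCLI:
    """Toy stand-in for the CLI class; not the real CortexCLI."""

    def __init__(self, key_lookup: Callable[[], str | None]) -> None:
        self._key_lookup = key_lookup  # stands in for prompting / env detection
        self.lookups = 0  # counts how often key setup runs

    def _get_api_key(self) -> str | None:
        self.lookups += 1
        return self._key_lookup()

    def _ask_with_session_key(self, prompt: str, api_key: str, provider: str) -> str:
        # Thin wrapper: does the work without calling _get_api_key() again.
        return f"[{provider}] {prompt}"

    def ask(self, prompt: str) -> str | None:
        # Public entry point keeps its old behaviour for one-off CLI calls.
        api_key = self._get_api_key()
        if not api_key:
            return None
        return self._ask_with_session_key(prompt, api_key=api_key, provider="openai")

    def voice_session(self, commands: list[str], provider: str = "openai") -> list[str]:
        # Resolve the key once, then reuse it for every transcribed command.
        api_key = self._get_api_key()
        if not api_key:
            return []
        return [
            self._ask_with_session_key(c, api_key=api_key, provider=provider)
            for c in commands
        ]
```

With this shape, three voice commands in continuous mode trigger a single key lookup, whereas routing each one through ask() would trigger three.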
Anshgrover23
left a comment
@Sahilbhatane Kindly resolve conflicts and address coderabbitai comments.




Related Issue
Closes #325
Summary
Adds voice input capability to Cortex Linux, enabling users to speak commands instead of typing them. This implementation uses local speech-to-text processing with Whisper for privacy and low latency.
Features Added
- cortex voice for continuous voice input
- cortex voice --single for one-shot recording
- cortex install --mic and cortex ask --mic
- CORTEX_VOICE_HOTKEY)
- ● Recording...) during speech capture
Technical Details
- faster-whisper (optimized Whisper) for accurate, local STT
- base.en (~150MB, good accuracy/speed balance)
- pip install cortex-linux[voice]
Files Changed
- cortex/voice.py - New voice input handler module
- cortex/cli.py - Added voice commands and --mic flags
- cortex/branding.py - Windows ASCII fallback for console output
- pyproject.toml - Added [voice] optional dependencies
- docs/VOICE_INPUT.md - User documentation
- tests/test_voice.py - Unit tests (20 tests)
- tests/test_ollama_integration.py - Fixed Windows compatibility
Checklist
- pytest tests/)
- ruff check, black --check)
- bandit -r cortex/)
- docs/VOICE_INPUT.md)
Additionally, for Wayland-based Ubuntu users (if the hotkey doesn't work) -
Summary by CodeRabbit
New Features
UX
Documentation
Dependencies
Tests