[feat] Add Ollama integration with setup script, LLM router support, and documentation. #377
Conversation
…prehensive documentation Resolves cortexlinux#357
Warning: Rate limit exceeded. @sujay-d07 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 17 minutes and 35 seconds before requesting another review.
⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered using the review command. We recommend that you space out your commits to avoid hitting the rate limit.
🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.
📒 Files selected for processing (2)
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
📝 Walkthrough
Adds Ollama as a local LLM provider to Cortex: configuration, router integration (sync/async), interpreter adjustments for Ollama calls and parsing, setup automation, tests, and comprehensive docs. No changes to exported public APIs beyond added enum/methods and environment keys.
Sequence Diagram(s)
```mermaid
sequenceDiagram
participant Client
participant LLMRouter
participant Ollama as Ollama Service
participant Fallback as Fallback Provider
participant Stats
Client->>LLMRouter: complete(messages, task_type)
LLMRouter->>LLMRouter: Check Ollama availability
alt Ollama available
LLMRouter->>Ollama: POST /v1/chat/completions (OpenAI-compatible)
Ollama-->>LLMRouter: Response + token usage
LLMRouter->>Stats: Record requests/tokens (cost $0)
LLMRouter-->>Client: Return LLMResponse (Ollama)
else Ollama unavailable & fallback configured
LLMRouter->>Fallback: Route completion request
Fallback-->>LLMRouter: Response + tokens + cost
LLMRouter->>Stats: Record fallback metrics
LLMRouter-->>Client: Return LLMResponse (Fallback)
else No fallback
LLMRouter-->>Client: Raise RuntimeError with guidance
end
```
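To make the branches in this diagram concrete, here is a minimal Python sketch of the availability-check-then-fallback pattern it describes. The method and attribute names (`_ollama_available`, `_complete_ollama`, `_complete_fallback`, `stats.record`) are illustrative assumptions, not the actual `LLMRouter` internals.

```python
# Method sketch (would live on LLMRouter); names are hypothetical.
def complete(self, messages, task_type):
    if self._ollama_available():
        # Local inference: record usage with zero cost.
        response = self._complete_ollama(messages)
        self.stats.record(provider="ollama", tokens=response.tokens, cost=0.0)
        return response
    if self.fallback_client is not None:
        response = self._complete_fallback(messages, task_type)
        self.stats.record(
            provider=response.provider, tokens=response.tokens, cost=response.cost
        )
        return response
    raise RuntimeError(
        "Ollama is not reachable and no fallback provider is configured. "
        "Run scripts/setup_ollama.py or configure a cloud API key."
    )
```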
```mermaid
sequenceDiagram
participant User
participant SetupScript as setup_ollama.py
participant System
participant OllamaService as Ollama Service
participant CortexConfig as ~/.cortex/config.json
User->>SetupScript: run script
SetupScript->>System: which ollama / check binary
alt Not installed
SetupScript->>System: run installer (curl | bash)
System-->>SetupScript: install result
end
SetupScript->>OllamaService: health check
alt Service not running
SetupScript->>OllamaService: start service
OllamaService-->>SetupScript: ready
end
SetupScript->>OllamaService: list available models
SetupScript->>User: prompt select model
User-->>SetupScript: chosen model
SetupScript->>OllamaService: pull model
OllamaService-->>SetupScript: download complete
SetupScript->>OllamaService: test prompt
OllamaService-->>SetupScript: response
SetupScript->>CortexConfig: save {api_provider, ollama_model, ollama_base_url}
CortexConfig-->>SetupScript: saved
SetupScript-->>User: setup complete
```
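The configuration written in the final step is small. Based on the keys named in the diagram, the saved `~/.cortex/config.json` would look roughly like this (the values shown are illustrative defaults, not pulled from the diff):

```json
{
  "api_provider": "ollama",
  "ollama_model": "llama3.2",
  "ollama_base_url": "http://localhost:11434"
}
```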
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Pre-merge checks and finishing touches: ✅ Passed checks (5 passed)
Actionable comments posted: 5
🧹 Nitpick comments (9)
OLLAMA_QUICKSTART.md (1)
124-132: Consider adding error handling to the "Quick Win" script.
The one-liner chains commands with `&&`, which will stop on first failure. However, if `cortex install` fails, the user won't know which step failed. Consider adding explicit error checks or suggesting users run steps individually for troubleshooting.
Alternative with better error visibility:
```bash
# Complete setup step-by-step with error checking
python scripts/setup_ollama.py || { echo "Setup failed"; exit 1; }
export CORTEX_PROVIDER=ollama
cortex install nginx --dry-run || { echo "Cortex test failed"; exit 1; }
echo "✅ Ollama is working!"
```
cortex/llm/interpreter.py (2)
80-91: Bare `except Exception` silently swallows all errors.
While ignoring config read errors is reasonable for graceful fallback, the broad `except Exception` with a silent `pass` can hide unexpected issues (e.g., permission errors, malformed JSON). Consider logging at debug level or narrowing the exception type. Also, Line 85 opens the file without an explicit encoding, which can cause issues on systems with non-UTF-8 defaults.
Proposed fix:
```diff
 try:
     from pathlib import Path

     config_file = Path.home() / ".cortex" / "config.json"
     if config_file.exists():
-        with open(config_file) as f:
+        with open(config_file, encoding="utf-8") as f:
             config = json.load(f)
             model = config.get("ollama_model")
             if model:
                 return model
-except Exception:
-    pass  # Ignore errors reading config
+except (OSError, json.JSONDecodeError) as e:
+    # Log at debug level for troubleshooting, but continue with default
+    import logging
+    logging.getLogger(__name__).debug(f"Could not read Ollama config: {e}")
```
297-301: Use `logging` instead of printing directly to stderr.
Printing directly to stderr is not configurable and may clutter output in production. The rest of the codebase uses structured error handling; this should use the logging module for consistency.
Proposed fix:
```diff
 except (json.JSONDecodeError, ValueError) as e:
     # Log the problematic content for debugging
-    import sys
-
-    print(f"\nDebug: Failed to parse JSON. Raw content:\n{content[:500]}", file=sys.stderr)
+    import logging
+    logging.getLogger(__name__).debug(
+        f"Failed to parse JSON. Raw content: {content[:500]}"
+    )
     raise ValueError(f"Failed to parse LLM response: {str(e)}")
```
tests/test_ollama_integration.py (3)
130-131: Remove meaningless `assert True`.
`assert True` never fails and provides no value. The actual verification is the absence of exceptions and the print statement.
```diff
 print(" ✓ Routing logic works")
-assert True  # Test passed
+# Test passed if no exception was raised
```
21-22: Consider using a pytest conftest.py instead of sys.path manipulation.
While this works, manipulating `sys.path` is fragile. A `conftest.py` at the repo root or proper package installation (`pip install -e .`) is more robust.
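A minimal sketch of the suggested alternative, assuming the in-repo `cortex` package lives at the repository root and that no root-level `conftest.py` exists yet:

```python
# conftest.py (repo root) -- hypothetical sketch, not part of this PR
import sys
from pathlib import Path

# Make the in-repo packages importable without installation, so individual
# test files no longer need their own sys.path manipulation.
REPO_ROOT = Path(__file__).resolve().parent
if str(REPO_ROOT) not in sys.path:
    sys.path.insert(0, str(REPO_ROOT))
```

Installing the package with `pip install -e .` would make even this shim unnecessary.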
66-106: Tests require live Ollama service — document or mark appropriately.
These integration tests require a running Ollama instance with a specific model. Consider adding a pytest marker (e.g., `@pytest.mark.integration` or `@pytest.mark.requires_ollama`) so they can be skipped in CI environments without Ollama.
```python
import pytest

@pytest.mark.integration
@pytest.mark.requires_ollama
def test_llm_router():
    # ...
```
Also applies to: 138-166
scripts/setup_ollama.py (2)
354-371: Add explicit encoding to file operations.
The `open()` calls should specify `encoding="utf-8"` for cross-platform compatibility. As per coding guidelines, this is a best practice for Python file I/O.
Proposed fix:
```diff
 if config_file.exists():
     try:
-        with open(config_file) as f:
+        with open(config_file, encoding="utf-8") as f:
             config = json.load(f)
     except Exception:
         pass

 # ... config updates ...

 try:
-    with open(config_file, "w") as f:
+    with open(config_file, "w", encoding="utf-8") as f:
         json.dump(config, f, indent=2)
```
145-169: Popen process may become orphaned.
If the script exits unexpectedly between starting `ollama serve` and the polling loop completing, the subprocess may be left running. This is likely acceptable behavior (the service should keep running), but consider documenting this.
cortex/llm_router.py (1)
179-191: Consider narrowing exception handling during Ollama client initialization.
Catching all exceptions means issues like `ImportError` (if the OpenAI package is missing) are silently logged as warnings rather than surfaced clearly. This could confuse users who have OpenAI installed but whose Ollama client fails for a different reason.
Consider more specific exception handling:
```diff
 try:
     self.ollama_client = OpenAI(
         api_key="ollama",  # Ollama doesn't need a real key
         base_url=f"{self.ollama_base_url}/v1",
     )
     self.ollama_client_async = AsyncOpenAI(
         api_key="ollama",
         base_url=f"{self.ollama_base_url}/v1",
     )
     logger.info(f"✅ Ollama client initialized ({self.ollama_model})")
-except Exception as e:
+except (ImportError, ValueError, TypeError) as e:
     logger.warning(f"⚠️ Could not initialize Ollama client: {e}")
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- .env.example
- OLLAMA_QUICKSTART.md
- README.md
- cortex/env_loader.py
- cortex/llm/interpreter.py
- cortex/llm_router.py
- docs/LLM_INTEGRATION.md
- docs/OLLAMA_FIX.md
- docs/OLLAMA_INTEGRATION_SUMMARY.md
- docs/OLLAMA_SETUP.md
- docs/TROUBLESHOOTING.md
- examples/sample-config.yaml
- scripts/setup_ollama.py
- tests/test_ollama_integration.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/env_loader.py, cortex/llm/interpreter.py, tests/test_ollama_integration.py, scripts/setup_ollama.py, cortex/llm_router.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/test_ollama_integration.py
🧬 Code graph analysis (2)
tests/test_ollama_integration.py (1)
cortex/llm_router.py (6)
LLMProvider (45-50), LLMRouter (76-846), TaskType (32-42), complete (261-327), route_task (206-259), get_stats (497-525)
scripts/setup_ollama.py (1)
tests/test_ollama_integration.py (3)
check_ollama_installed (27-37), check_ollama_running (40-63), main (169-222)
🪛 GitHub Actions: CI
tests/test_ollama_integration.py
[error] 1-1: Black formatting check failed. 1 file would be reformatted. Run 'black --check .' again after formatting, or run 'black .' to auto-format.
🪛 LanguageTool
docs/OLLAMA_INTEGRATION_SUMMARY.md
[style] ~7-~7: Some style guides suggest that commas should set off the year in a month-day-year date.
Context: ...thout API keys. Date: December 26, 2025 Status: ✅ Complete **Related Is...
(MISSING_COMMA_AFTER_YEAR)
[style] ~27-~27: This phrase is redundant (‘I’ stands for ‘interface’). Use simply “API”.
Context: ... Key Features: - OpenAI-compatible API interface - Automatic GPU detection (when availab...
(ACRONYM_TAUTOLOGY)
🪛 markdownlint-cli2 (0.18.1)
docs/OLLAMA_SETUP.md
173-173: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Agent
- GitHub Check: test (3.12)
- GitHub Check: test (3.11)
- GitHub Check: test (3.10)
🔇 Additional comments (29)
cortex/env_loader.py (1)
133-134: LGTM - Ollama configuration keys added correctly.
The addition of `OLLAMA_BASE_URL` and `OLLAMA_MODEL` follows the existing pattern and enables environment/config source tracking for Ollama. Note that these are configuration parameters rather than authentication keys, but the function handles them appropriately.
examples/sample-config.yaml (1)
70-78: LGTM - Ollama configuration is well-structured.
The API provider configuration clearly documents the available options and sets sensible defaults for local Ollama usage. The configuration values (base URL and model) align with the Ollama integration documented throughout the PR.
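For illustration only, an Ollama section consistent with the defaults described in this PR might look like the sketch below; the exact key names in `examples/sample-config.yaml` are assumptions, not copied from the diff.

```yaml
# Hypothetical sketch of the Ollama section in examples/sample-config.yaml
api_provider: ollama              # options include ollama, claude, openai
ollama_base_url: http://localhost:11434
ollama_model: llama3.2            # default model used by the setup script
```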
OLLAMA_QUICKSTART.md (2)
31-70: LGTM - Configuration and command reference is clear and accurate.
The configuration examples and command reference provide practical guidance that aligns with the Ollama integration. The examples cover both environment variables and config file approaches, giving users flexibility.
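As a quick illustration of the environment-variable approach (variable names are taken from elsewhere in this PR; the quickstart's exact wording may differ):

```bash
# Point Cortex at a local Ollama instance via environment variables
export CORTEX_PROVIDER=ollama
export OLLAMA_BASE_URL=http://localhost:11434
export OLLAMA_MODEL=llama3.2

cortex install nginx --dry-run   # quick smoke test
```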
21-28: No issue found.
The test file `tests/test_ollama_integration.py` is explicitly designed to execute directly with `python tests/test_ollama_integration.py`, as documented in its usage docstring. The file contains both a `__main__` block for direct execution and pytest imports, supporting both execution methods. The command in OLLAMA_QUICKSTART.md is correct as-is.
README.md (2)
82-82: LGTM - Clear communication of optional API key.
The updated prerequisite text correctly indicates that API keys are optional when using Ollama, making it clear to users that they have a free, local alternative.
98-108: LGTM - Excellent multi-provider setup flow.
The restructured configuration step clearly presents provider options and makes Ollama the default choice. This aligns well with the PR objectives of providing a free, local-first approach while maintaining cloud API support for users who need it.
docs/OLLAMA_FIX.md (2)
1-84: LGTM - Clear documentation of the Ollama integration approach.
The document effectively explains the issues encountered and the fixes applied. The code examples illustrate the key integration points: model selection, OpenAI-compatible API usage, simplified prompts, and flexible response parsing. This provides valuable context for understanding the implementation.
103-112: Performance improvements are impressive and well-documented.
The before/after comparison clearly demonstrates the effectiveness of the fixes. The 60s+ timeout reduced to 3-5s response time represents a significant improvement that makes Ollama viable for practical use.
.env.example (1)
1-61: LGTM - Excellent environment configuration template.
The template is well-organized with clear sections for each provider. The default Ollama configuration enables immediate local usage, while commented placeholders for cloud APIs prevent accidental key commits. The usage notes provide helpful guidance for getting started with each provider option.
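A compressed sketch of what such a template typically contains; this is an assumed layout with assumed cloud key names, not the literal file contents:

```bash
# .env.example (illustrative excerpt)

# --- Local provider (default, no API key required) ---
CORTEX_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.2

# --- Cloud providers (optional; uncomment and fill in to use) ---
# ANTHROPIC_API_KEY=your-key-here
# OPENAI_API_KEY=your-key-here
```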
docs/OLLAMA_SETUP.md (4)
17-104: LGTM - Comprehensive and clear setup instructions.
The guide provides both quick (automated) and manual setup paths, accommodating different user preferences. The step-by-step manual instructions are thorough and include configuration examples that match the implementation across the PR.
106-134: LGTM - Practical model recommendations with clear hardware guidance.
The model recommendations are well-organized and provide the information users need to choose appropriate models based on their hardware. The hardware requirements table helps users avoid out-of-memory issues.
149-230: LGTM - Comprehensive troubleshooting guidance.
The troubleshooting section covers common issues users might encounter with clear, actionable solutions. The performance optimization subsections provide additional value for users wanting to maximize Ollama performance.
Note: Static analysis flagged a minor markdown linting issue (fenced code block without language specifier), but this doesn't affect the content quality.
232-333: LGTM - Excellent coverage of advanced topics and resources.
The comparison table provides balanced information about trade-offs between local and cloud options. The advanced configuration section and API compatibility examples demonstrate the flexibility of the integration. The quick reference at the end is a helpful addition for users.
docs/TROUBLESHOOTING.md (2)
29-61: LGTM - Excellent prioritization of Ollama as the easiest solution.
The restructured solutions section effectively guides users to the free, local option first while preserving cloud API setup instructions. The clear numbering and command examples make it easy for users to follow.
212-325: LGTM - Comprehensive Ollama troubleshooting coverage.
The expanded LLM Provider Issues section provides thorough troubleshooting for the most common Ollama issues users will encounter. The solutions are well-organized, progressively detailed, and include practical commands for each scenario.
cortex/llm/interpreter.py (4)
64-66: LGTM on Ollama model configuration approach.
The layered configuration (environment → config file → default) provides good flexibility. The default of `llama3.2` aligns with the setup script and documentation.
Also applies to: 72-94
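The layered lookup can be summarized in a few lines. This is a simplified sketch of the order described (environment, then config file, then default), not the interpreter's actual code:

```python
import json
import os
from pathlib import Path


def resolve_ollama_model() -> str:
    """Simplified sketch: env var wins, then ~/.cortex/config.json, then default."""
    model = os.environ.get("OLLAMA_MODEL")
    if model:
        return model
    config_file = Path.home() / ".cortex" / "config.json"
    if config_file.exists():
        try:
            with open(config_file, encoding="utf-8") as f:
                model = json.load(f).get("ollama_model")
            if model:
                return model
        except (OSError, json.JSONDecodeError):
            pass  # fall through to the default
    return "llama3.2"
```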
111-121: LGTM on Ollama client initialization using OpenAI-compatible API.
Using the OpenAI SDK with a dummy key and custom base URL is the correct approach for Ollama's OpenAI-compatible endpoint.
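For reference, the approach boils down to pointing the standard OpenAI SDK at Ollama's `/v1` endpoint with a placeholder key, as in this self-contained sketch (model name and prompt are examples):

```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint; the API key is a placeholder.
client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1")

resp = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(resp.choices[0].message.content)
```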
194-222: LGTM on enhanced Ollama error handling.
The error message now includes the base URL and model name, which significantly improves debuggability when Ollama is unavailable.
251-295: LGTM on robust JSON parsing with multiple format support.
The enhanced parsing handles markdown code blocks, JSON repair, and both string arrays and object arrays — addressing common LLM response variations effectively.
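A minimal sketch of the kind of tolerant parsing described (markdown-fence stripping plus normalizing string arrays and object arrays); it is an approximation, not the interpreter's actual implementation:

```python
import json


def parse_llm_json(content: str) -> list[dict]:
    """Approximate sketch: strip markdown fences, then normalize list items."""
    text = content.strip()
    if text.startswith("```"):
        # Drop the opening fence (with optional "json" tag) and the closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit("```", 1)[0].strip()
    data = json.loads(text)
    if isinstance(data, list):
        # Accept both ["apt install nginx", ...] and [{"command": ...}, ...]
        return [{"command": item} if isinstance(item, str) else item for item in data]
    raise ValueError(f"Unexpected JSON structure: {type(data).__name__}")
```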
docs/LLM_INTEGRATION.md (2)
14-22: LGTM on the provider comparison table.
Clear, useful summary of provider characteristics including cost, privacy, and offline capabilities.
56-75: LGTM on Ollama usage example.
The example correctly demonstrates `LLMRouter` initialization and `complete()` usage with Ollama.
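Based on the symbols referenced in this review (`LLMRouter`, `TaskType`, `complete`), the documented example presumably looks something like the sketch below. The constructor argument and the specific enum member are assumptions, not the exact documented API:

```python
from cortex.llm_router import LLMRouter, TaskType

# Assumed constructor argument; the real signature may differ.
router = LLMRouter(provider="ollama")

response = router.complete(
    messages=[{"role": "user", "content": "Install nginx with sensible defaults"}],
    task_type=TaskType.SIMPLE,  # hypothetical enum member
)
print(response.content)  # attribute name assumed
```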
docs/OLLAMA_INTEGRATION_SUMMARY.md (1)
1-6: LGTM on comprehensive integration documentation.
Excellent coverage of what was implemented, usage examples, architecture diagrams, and migration guidance. This will be very helpful for users adopting Ollama.
Also applies to: 169-200
scripts/setup_ollama.py (3)
99-106: Security note: `shell=True` with piped commands.
While this mirrors the official Ollama installation instructions, `shell=True` with external content is a security risk. The script is downloading and executing code from the internet. This is acceptable for a user-initiated setup script, but ensure users understand what they're running.
Consider adding a confirmation prompt or documentation note about what this command does.
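One way to do that, sketched below, is to display the command and require explicit confirmation before piping the installer to a shell. The function name and prompt text are made up for illustration; the installer command shown is the one commonly documented by Ollama:

```python
import subprocess

INSTALL_CMD = "curl -fsSL https://ollama.com/install.sh | sh"


def install_ollama_with_confirmation() -> bool:
    """Hypothetical sketch: confirm before running the official installer."""
    print("This will download and execute the official Ollama install script:")
    print(f"    {INSTALL_CMD}")
    if input("Proceed? [y/N] ").strip().lower() != "y":
        print("Installation skipped.")
        return False
    result = subprocess.run(INSTALL_CMD, shell=True)
    return result.returncode == 0
```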
181-220: LGTM on model list with metadata.
Clear model information including size, description, and recommended flag helps users make informed choices.
383-501: LGTM on main setup flow.
Well-structured interactive flow with proper argument handling, validation, and user feedback. Non-interactive mode for CI is a nice touch.
cortex/llm_router.py (4)
245-253: LGTM on Ollama fallback logic.
The cascading fallback (Ollama → Claude → Kimi K2) with proper error raising when no fallback is configured is well-implemented.
428-475: LGTM on synchronous Ollama completion implementation.
Proper token extraction with `getattr` fallback, zero cost tracking, and helpful error messages when Ollama is unavailable.
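The `getattr`-with-fallback pattern mentioned here is roughly the following; a simplified sketch rather than the actual method body:

```python
def extract_usage(response) -> tuple[int, int, float]:
    """Sketch: pull token counts defensively; local inference is billed at $0."""
    usage = getattr(response, "usage", None)
    prompt_tokens = getattr(usage, "prompt_tokens", 0) if usage else 0
    completion_tokens = getattr(usage, "completion_tokens", 0) if usage else 0
    return prompt_tokens, completion_tokens, 0.0  # cost is always zero for Ollama
```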
714-763: LGTM on async Ollama completion implementation.
Mirrors the sync implementation correctly with proper async client usage and consistent error handling.
99-102: LGTM on Ollama cost tracking (zero cost for local inference).
Correctly sets Ollama costs to 0.0 and includes Ollama in stats reporting.
Also applies to: 519-523
Pull request overview
This PR adds comprehensive Ollama integration to Cortex Linux, enabling users to run the package manager with free, local LLM inference without requiring cloud API keys. The implementation includes an interactive setup script, LLM router support with OpenAI-compatible API, extensive documentation, and test coverage.
Key Changes:
- Added Ollama as a new LLM provider alongside Claude and OpenAI
- Created interactive setup wizard for easy installation and model selection
- Implemented JSON parsing improvements and flexible response handling for local models
- Added comprehensive documentation with setup guides, troubleshooting, and usage examples
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_ollama_integration.py | New test suite for Ollama integration with prerequisite checks and router tests |
| scripts/setup_ollama.py | Interactive setup script for Ollama installation, model selection, and configuration |
| cortex/llm_router.py | Core integration adding Ollama provider with sync/async completion methods |
| cortex/llm/interpreter.py | Enhanced JSON parsing, model configuration loading, and OpenAI-compatible API usage |
| cortex/env_loader.py | Added tracking for OLLAMA_BASE_URL and OLLAMA_MODEL environment variables |
| docs/OLLAMA_SETUP.md | Complete setup guide with model recommendations and troubleshooting (333 lines) |
| docs/OLLAMA_INTEGRATION_SUMMARY.md | Implementation summary documenting all changes and architecture |
| docs/OLLAMA_FIX.md | Fix documentation for Ollama integration issues |
| docs/LLM_INTEGRATION.md | Updated LLM integration docs to include Ollama provider |
| docs/TROUBLESHOOTING.md | Added Ollama-specific troubleshooting sections |
| README.md | Updated installation instructions to include Ollama as a free option |
| OLLAMA_QUICKSTART.md | Quick reference guide for Ollama setup and common commands |
| .env.example | Comprehensive environment configuration template with Ollama settings |
| examples/sample-config.yaml | Added Ollama configuration example |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>




Related Issue
Closes #357
Summary
Implemented complete Ollama integration for Cortex Linux:
Ollama Setup Commands
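Drawing on commands referenced elsewhere in this PR, the end-to-end setup presumably reduces to something like the following; the model name and flags are examples rather than a verbatim copy of this section:

```bash
# Guided setup: installs Ollama if needed, pulls a model, writes ~/.cortex/config.json
python scripts/setup_ollama.py

# Or configure manually
ollama pull llama3.2
export CORTEX_PROVIDER=ollama

# Smoke test without touching the system
cortex install nginx --dry-run
```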
Summary by CodeRabbit
New Features
Documentation
Chores
Tests