Skip to content
Closed
82 changes: 82 additions & 0 deletions benchmarks/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
\# Cortex JIT Benchmark Suite



This directory contains benchmarks to evaluate the impact of

Python 3.13's experimental JIT compiler on Cortex operations.



\## Benchmarks Included



1\. CLI Startup Time

  Measures cold start time of the `cortex` CLI.



2\. Command Parsing

  Benchmarks argparse-based command parsing overhead.



3\. Cache-like Operations

  Simulates dictionary-heavy workloads similar to internal caching.



4\. Streaming

  Measures generator and iteration performance.



\## How to Run



From this directory:



PYTHON\_JIT=0 python run\_benchmarks.py

PYTHON\_JIT=1 python run\_benchmarks.py



Or simply:



python run\_benchmarks.py



\## Findings



Python 3.13 JIT shows measurable improvements in:

\- Command parsing

\- Cache-like workloads



Streaming and startup times show minimal change, which is expected.



These results suggest Python JIT provides benefits for hot-path

operations used by Cortex.



23 changes: 23 additions & 0 deletions benchmarks/benchmark_cache_ops.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
import time


def benchmark() -> float:
"""
Benchmark cache-like dictionary operations.

This simulates a hot-path workload similar to internal caching
mechanisms used by Cortex, measuring insert and lookup performance.
"""
cache: dict[str, str] = {}

start = time.perf_counter()
for i in range(100_000):
key = f"prompt_{i}"
cache[key] = f"response_{i}"
_ = cache.get(key)
return time.perf_counter() - start


if __name__ == "__main__":
duration = benchmark()
print(f"Cache-like Operations Time: {duration:.4f} seconds")
16 changes: 16 additions & 0 deletions benchmarks/benchmark_cli_startup.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
import subprocess
import time


def benchmark():
start = time.perf_counter()
subprocess.run(
["python", "-m", "cortex", "--help"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
)
return time.perf_counter() - start


if __name__ == "__main__":
runs = 5
times = [benchmark() for _ in range(runs)]
print(f"CLI Startup Avg: {sum(times)/runs:.4f} seconds")
19 changes: 19 additions & 0 deletions benchmarks/benchmark_command_parsing.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
import time
import argparse
from cortex.cli import CortexCLI


def benchmark():
cli = CortexCLI(verbose=False)

start = time.perf_counter()
for _ in range(3000):
parser = argparse.ArgumentParser()
parser.add_argument("command", nargs="?")
parser.parse_args(["status"])
return time.perf_counter() - start


if __name__ == "__main__":
duration = benchmark()
print(f"Command Parsing Time: {duration:.4f} seconds")
17 changes: 17 additions & 0 deletions benchmarks/benchmark_streaming.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
import time


def fake_stream():
yield from range(2000)


def benchmark():
start = time.perf_counter()
for _ in fake_stream():
pass
return time.perf_counter() - start


if __name__ == "__main__":
duration = benchmark()
print(f"Streaming Time: {duration:.4f} seconds")
33 changes: 33 additions & 0 deletions benchmarks/run_benchmarks.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
import subprocess
import os
import sys
from pathlib import Path

PROJECT_ROOT = Path(__file__).resolve().parents[1]

benchmarks = [
"benchmark_cli_startup.py",
"benchmark_command_parsing.py",
"benchmark_cache_ops.py",
"benchmark_streaming.py",
]


def run(jit_enabled):
env = os.environ.copy()
env["PYTHON_JIT"] = "1" if jit_enabled else "0"

# Add project root to PYTHONPATH
env["PYTHONPATH"] = str(PROJECT_ROOT)

print("\n==============================")
print("JIT ENABLED:" if jit_enabled else "JIT DISABLED:")
print("==============================")

for bench in benchmarks:
subprocess.run([sys.executable, bench], env=env)
Comment on lines +16 to +28
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Add type hints and docstring.

The function lacks type hints and a docstring, which are required by the coding guidelines.

🔎 Proposed fix
-def run(jit_enabled):
+def run(jit_enabled: bool) -> None:
+    """Run all benchmarks with the specified JIT configuration.
+    
+    Args:
+        jit_enabled: Whether to enable Python 3.13+ JIT compilation.
+    """
     env = os.environ.copy()
     env["PYTHON_JIT"] = "1" if jit_enabled else "0"

As per coding guidelines, type hints and docstrings are required for all public APIs.

🤖 Prompt for AI Agents
In benchmarks/run_benchmarks.py around lines 15 to 27, the public function run
is missing a docstring and type hints; add a concise docstring describing what
the function does, its parameter, and side effects (environment mutation and
running subprocesses), and annotate the signature with types (jit_enabled: bool)
and return type (-> None). Keep the docstring in triple-quote format immediately
below the def, document the parameter and that it sets PYTHON_JIT and PYTHONPATH
and invokes subprocess.run for each benchmark, and ensure imports/types used are
consistent with these annotations.

⚠️ Potential issue | 🔴 Critical

Critical: Benchmark execution will fail when run from outside the benchmarks directory.

Line 27 uses relative filenames (bench from the benchmarks list) without setting the working directory. If this script is executed from any directory other than benchmarks/, the subprocess calls will fail because the benchmark scripts won't be found.

🔎 Proposed fix
 def run(jit_enabled):
     env = os.environ.copy()
     env["PYTHON_JIT"] = "1" if jit_enabled else "0"

     # Add project root to PYTHONPATH
     env["PYTHONPATH"] = str(PROJECT_ROOT)

     print("\n==============================")
     print("JIT ENABLED:" if jit_enabled else "JIT DISABLED:")
     print("==============================")

+    # Set working directory to benchmarks folder
+    benchmarks_dir = Path(__file__).resolve().parent
+
     for bench in benchmarks:
-        subprocess.run([sys.executable, bench], env=env)
+        subprocess.run([sys.executable, bench], env=env, cwd=benchmarks_dir)

Alternatively, construct absolute paths:

+    benchmarks_dir = Path(__file__).resolve().parent
+
     for bench in benchmarks:
-        subprocess.run([sys.executable, bench], env=env)
+        bench_path = benchmarks_dir / bench
+        subprocess.run([sys.executable, str(bench_path)], env=env)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In benchmarks/run_benchmarks.py around lines 15 to 27, the subprocess.run call
uses relative benchmark filenames so running the script from outside the
benchmarks directory will fail; update the code to pass an explicit cwd or to
convert each bench to an absolute path (e.g., join PROJECT_ROOT / "benchmarks"
with bench) before invoking subprocess.run, and keep env as-is so PYTHONPATH and
PYTHON_JIT are preserved.



if __name__ == "__main__":
run(False)
run(True)
46 changes: 44 additions & 2 deletions cortex/cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -646,6 +646,30 @@ def install(
# which fail on modern Python. For the "pytorch-cpu jupyter numpy pandas"
# combo, force a supported CPU-only PyTorch recipe instead.
normalized = " ".join(software.split()).lower()
# 🔍 Early check: suggest alternatives before LLM call
from cortex.suggestions.package_suggester import (
show_suggestions,
suggest_alternatives,
)

# If user input looks like a single package name and not a full sentence
if " " not in normalized:
alternatives = suggest_alternatives(normalized)

# Heuristic: no obvious known package match
if alternatives and normalized not in [p["name"] for p in alternatives]:
self._print_error(f"Package '{software}' not found")
show_suggestions(alternatives)

choice = input("\nInstall the first recommended option instead? [Y/n]: ")
if choice.lower() in ("", "y", "yes"):
return self.install(
alternatives[0]["name"],
execute=execute,
dry_run=dry_run,
parallel=parallel,
)
return 1

if normalized == "pytorch-cpu jupyter numpy pandas":
software = (
Expand Down Expand Up @@ -681,9 +705,27 @@ def install(
commands = interpreter.parse(f"install {software}")

if not commands:
self._print_error(
"No commands generated. Please try again with a different request."
from cortex.suggestions.package_suggester import (
show_suggestions,
suggest_alternatives,
)

self._print_error(f"Package '{software}' not found")

alternatives = suggest_alternatives(software)

if alternatives:
show_suggestions(alternatives)

choice = input("\nInstall the first recommended option instead? [Y/n]: ")
if choice.lower() in ("", "y", "yes"):
return self.install(
alternatives[0]["name"],
execute=execute,
dry_run=dry_run,
parallel=parallel,
)

return 1

# Extract packages from commands for tracking
Expand Down
Empty file.
57 changes: 57 additions & 0 deletions cortex/suggestions/package_suggester.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
try:
from rapidfuzz import process
except ImportError:
process = None

from cortex.branding import console, cx_print

# Temporary known package data (can be expanded later)
KNOWN_PACKAGES = [
{
"name": "apache2",
"description": "Popular HTTP web server",
"downloads": 50000000,
"rating": 4.7,
"tags": ["web server", "http", "apache"],
},
{
"name": "nginx",
"description": "High-performance event-driven web server",
"downloads": 70000000,
"rating": 4.9,
"tags": ["web server", "reverse proxy"],
},
{
"name": "docker",
"description": "Container runtime",
"downloads": 100000000,
"rating": 4.8,
"tags": ["containers", "devops"],
},
]


def suggest_alternatives(query: str, limit: int = 3):
names = [pkg["name"] for pkg in KNOWN_PACKAGES]
if process is None:
return []
matches = process.extract(query, names, limit=limit)

results = []
for name, score, _ in matches:
pkg = next(p for p in KNOWN_PACKAGES if p["name"] == name)
results.append(pkg)

return results


def show_suggestions(packages):
cx_print("💡 Did you mean:", "info")

for i, pkg in enumerate(packages, 1):
console.print(
f"\n{i}. [bold]{pkg['name']}[/bold] (recommended)\n"
f" - {pkg['description']}\n"
f" - {pkg['downloads']:,} downloads\n"
f" - Rating: {pkg['rating']}/5"
)
Comment on lines +48 to +57
Copy link

Copilot AI Dec 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The show_suggestions function lacks documentation. Add a docstring explaining the function's purpose, parameters, and expected behavior. This is especially important for a user-facing feature that displays suggestions to the user.

Copilot uses AI. Check for mistakes.
1 change: 1 addition & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -112,6 +112,7 @@ exclude = '''
'''

[tool.ruff]
extend-exclude = ["benchmarks"]
line-length = 100
target-version = "py310"
exclude = [
Expand Down
17 changes: 17 additions & 0 deletions test/test_package_suggester.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
from cortex.suggestions.package_suggester import suggest_alternatives


def test_suggests_apache_for_apache_server():
results = suggest_alternatives("apache-server")
names = [pkg["name"] for pkg in results]
assert "apache2" in names


def test_suggest_returns_list():
results = suggest_alternatives("randompkg")
assert isinstance(results, list)


def test_suggest_with_exact_match():
results = suggest_alternatives("apache2")
assert results[0]["name"] == "apache2"
Copy link

Copilot AI Dec 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test assumes that the first result will always be an exact match when searching for 'apache2', but the function uses fuzzy matching which may not guarantee exact matches are returned first. This test could fail if the fuzzy matching algorithm's scoring changes or if KNOWN_PACKAGES is modified. Consider either checking if 'apache2' exists anywhere in the results, or ensuring the exact match logic is explicit in the implementation.

Suggested change
assert results[0]["name"] == "apache2"
names = [pkg["name"] for pkg in results]
assert "apache2" in names

Copilot uses AI. Check for mistakes.
Loading