Commit 8a3d44d

feat: Enable multi-codebase support with project-relative storage
## Problem: Hardcoded Paths Breaking Multi-Codebase Support

Previously, all CodeGraph indexes and databases used hardcoded `.codegraph` paths relative to current working directory. This broke when:

- Indexing multiple projects (each needs isolated storage)
- Running MCP server from different directory than project root
- Managing multiple codebases simultaneously

## Solution: Project-Relative Storage Architecture

### Core Changes

**1. IndexerConfig Enhancement**

- Added `project_root: PathBuf` field
- Explicitly tracks which project is being indexed
- Defaults to current directory for backward compatibility

**2. ProjectIndexer Updates**

- Added `project_root` field to store project location
- Updated `new()` to use `CodeGraph::new_with_path(project_root/.codegraph/db)`
- All storage operations now use `self.project_root`:
  - FAISS indexes: `project_root/.codegraph/faiss.index`
  - Embeddings: `project_root/.codegraph/embeddings.json`
  - Metadata: `project_root/.codegraph/index.json`
  - Shards: `project_root/.codegraph/shards/{lang,path}/`

**3. CLI Improvements**

- `handle_index()`: Sets `project_root` from path parameter
- Clean command: Uses project-relative `.codegraph` directory
- Benchmark command: Uses project-relative paths
- Proper canonicalization of project paths

### Multi-Codebase Workflow

**Index multiple projects (isolated storage):**

```bash
codegraph index /home/user/project-a
# → /home/user/project-a/.codegraph/

codegraph index /home/user/project-b
# → /home/user/project-b/.codegraph/

codegraph index /opt/services/api
# → /opt/services/api/.codegraph/
```

**Serve different projects:**

```bash
# Terminal 1
cd /home/user/project-a && codegraph start stdio

# Terminal 2
cd /home/user/project-b && codegraph start stdio
```

**Claude Desktop multi-project config:**

```json
{
  "mcpServers": {
    "codegraph-api": {
      "command": "codegraph",
      "args": ["start", "stdio"],
      "cwd": "/opt/services/api"
    },
    "codegraph-frontend": {
      "command": "codegraph",
      "args": ["start", "stdio"],
      "cwd": "/opt/services/frontend"
    }
  }
}
```

### Directory Structure (Each Project Isolated)

```
project-a/
├── src/
├── .codegraph/          # Project A's index
│   ├── db/              # RocksDB (isolated)
│   ├── faiss.index      # FAISS index (isolated)
│   ├── faiss_ids.json
│   ├── shards/
│   │   ├── lang/
│   │   └── path/
│   └── index.json

project-b/
├── app/
├── .codegraph/          # Project B's index (completely separate)
│   ├── db/              # Different RocksDB instance
│   ├── faiss.index      # Different FAISS index
│   └── ...
```

## Protocol Compliance

### MCP Official SDK

- Uses official `rmcp` framework
- Proper tool schemas with `JsonSchema` derive
- Standard error codes (-32602 invalid params, -32603 internal error)
- STDIO transport (official MCP protocol)

### Error Handling

```rust
if query.is_empty() {
    return Err(McpError {
        code: -32602,
        message: "Query cannot be empty".to_string(),
        data: None,
    });
}
```

### Tool Registration

- `#[tool_router]` macro for automatic registration
- `#[tool(description = "...")]` for each tool
- `Parameters<T>` wrapper for type-safe parameter parsing

## Performance Features

**1. Read-Only Fallback**

- MCP server gracefully falls back to read-only mode
- Allows multiple servers to read same index
- No RocksDB lock conflicts

**2. Sharded FAISS Indexes**

- Language shards: Search only Rust/TypeScript/Python/etc
- Path shards: Search only src/lib/tests/etc
- Faster queries by filtering before FAISS search

**3. Batch Embedding** (from Phase 1)

- 10-50x speedup from batched GPU processing

## Benefits

✅ **Multi-Codebase**: Each project has isolated `.codegraph/` storage
✅ **No Cross-Contamination**: Clear project boundaries
✅ **Scalability**: Run multiple MCP servers (one per project)
✅ **Protocol Compliant**: Uses official `rmcp` SDK
✅ **Performance**: Sharded indexes, batch processing, read-only mode
✅ **Developer Experience**: Simple `cd project && codegraph index .`

## Files Modified

- `crates/codegraph-mcp/src/indexer.rs`
  - Added `project_root` to `IndexerConfig` and `ProjectIndexer`
  - Updated all `.codegraph` paths to use `self.project_root`
  - Fixed metadata/embedding paths to be project-relative
- `crates/codegraph-mcp/src/bin/codegraph.rs`
  - Set `project_root` from path parameter in index command
  - Fixed clean command to use project-relative paths
  - Updated benchmark to use project-relative paths
- `MCP_IMPROVEMENTS.md`
  - Comprehensive documentation of multi-codebase architecture
  - Protocol compliance details
  - Migration guide and testing scenarios

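## Illustrative Sketch: Deriving Storage Paths from `project_root`

The project-relative layout described above can be sketched as a small helper that derives every storage location from a single root. This is a minimal illustration, not the code in `indexer.rs`: the `StoragePaths` type and its method names are hypothetical, and only the file and directory names are taken from this commit message. Canonicalization is left out because `std::fs::canonicalize` requires the path to exist on disk.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of the commit): every storage location
/// is derived from one project root instead of the process's cwd.
#[derive(Debug, Clone, PartialEq)]
pub struct StoragePaths {
    /// `<project_root>/.codegraph`
    dir: PathBuf,
}

impl StoragePaths {
    /// Build the layout for an explicit project root.
    pub fn new(project_root: impl AsRef<Path>) -> Self {
        Self { dir: project_root.as_ref().join(".codegraph") }
    }

    /// Fall back to the current directory, mirroring the
    /// backward-compatible default mentioned in the commit message.
    pub fn for_current_dir() -> io::Result<Self> {
        Ok(Self::new(std::env::current_dir()?))
    }

    /// Database directory (`db/`).
    pub fn db(&self) -> PathBuf {
        self.dir.join("db")
    }

    /// FAISS index file.
    pub fn faiss_index(&self) -> PathBuf {
        self.dir.join("faiss.index")
    }

    /// Embeddings file.
    pub fn embeddings(&self) -> PathBuf {
        self.dir.join("embeddings.json")
    }

    /// Index metadata file.
    pub fn metadata(&self) -> PathBuf {
        self.dir.join("index.json")
    }

    /// Shard directory for a shard kind, e.g. "lang" or "path".
    pub fn shards(&self, kind: &str) -> PathBuf {
        self.dir.join("shards").join(kind)
    }
}
```

With this shape, `StoragePaths::new("/home/user/project-a")` and `StoragePaths::new("/home/user/project-b")` never share a path, which is the isolation property the commit is after. Keeping the joins in one place also means a command such as clean or benchmark cannot drift back to a cwd-relative `.codegraph` by accident.
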
## Migration

**Existing users**: Re-index to ensure proper structure

```bash
cd /path/to/project && codegraph index . --force
```

**New users**: Just works automatically

```bash
cd /path/to/project && codegraph index .
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 5a21b75 commit 8a3d44d

3 files changed: +678 -9 lines changed

