Commit 8a3d44d
feat: Enable multi-codebase support with project-relative storage
## Problem: Hardcoded Paths Breaking Multi-Codebase Support
Previously, all CodeGraph indexes and databases used hardcoded `.codegraph`
paths relative to the current working directory. This broke when:
- Indexing multiple projects (each needs isolated storage)
- Running the MCP server from a directory other than the project root
- Managing multiple codebases simultaneously
## Solution: Project-Relative Storage Architecture
### Core Changes
**1. IndexerConfig Enhancement**
- Added `project_root: PathBuf` field
- Explicitly tracks which project is being indexed
- Defaults to current directory for backward compatibility
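
The config change above might look roughly like the sketch below. This is a trimmed-down, hypothetical stand-in for `IndexerConfig`: only the `project_root` field comes from this commit, and resolving the default through `std::env::current_dir()` (rather than a literal `.`) is an assumption.

```rust
use std::path::PathBuf;

/// Hypothetical, trimmed-down stand-in for the real `IndexerConfig`.
/// Only `project_root` is taken from the commit message.
#[derive(Debug, Clone)]
pub struct IndexerConfig {
    /// Root of the project being indexed.
    pub project_root: PathBuf,
}

impl Default for IndexerConfig {
    fn default() -> Self {
        // Assumed default: the current directory, which matches the
        // pre-change behavior of resolving `.codegraph` against the cwd.
        Self {
            project_root: std::env::current_dir().unwrap_or_else(|_| PathBuf::from(".")),
        }
    }
}
```

Callers that never set `project_root` keep the old behavior, while the CLI can override it explicitly.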
**2. ProjectIndexer Updates**
- Added `project_root` field to store project location
- Updated `new()` to use `CodeGraph::new_with_path(project_root/.codegraph/db)`
- All storage operations now use `self.project_root`:
  - FAISS indexes: `project_root/.codegraph/faiss.index`
  - Embeddings: `project_root/.codegraph/embeddings.json`
  - Metadata: `project_root/.codegraph/index.json`
  - Shards: `project_root/.codegraph/shards/{lang,path}/`
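
As a minimal sketch of the layout listed above, the storage locations can all be derived from one root. `StoragePaths` and `for_project` are hypothetical names used for illustration; the actual indexer may build these paths inline.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper that collects the storage locations named in this commit.
#[derive(Debug, PartialEq)]
pub struct StoragePaths {
    pub db: PathBuf,
    pub faiss_index: PathBuf,
    pub embeddings: PathBuf,
    pub metadata: PathBuf,
    pub lang_shards: PathBuf,
    pub path_shards: PathBuf,
}

impl StoragePaths {
    /// Derive every path from `project_root` instead of the working directory.
    pub fn for_project(project_root: &Path) -> Self {
        let base = project_root.join(".codegraph");
        Self {
            db: base.join("db"),
            faiss_index: base.join("faiss.index"),
            embeddings: base.join("embeddings.json"),
            metadata: base.join("index.json"),
            lang_shards: base.join("shards").join("lang"),
            path_shards: base.join("shards").join("path"),
        }
    }
}
```

Keeping the derivation in one place means a second project only differs by the root passed in.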
**3. CLI Improvements**
- `handle_index()`: Sets `project_root` from path parameter
- Clean command: Uses project-relative `.codegraph` directory
- Benchmark command: Uses project-relative paths
- Proper canonicalization of project paths
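
Path canonicalization in the CLI could be sketched as follows. `resolve_project_root` is a hypothetical helper, not a function from this commit; it relies only on `std::fs::canonicalize` from the standard library, and the directory check is an added assumption.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: turn the CLI path argument into an absolute project root.
pub fn resolve_project_root(path_arg: &Path) -> io::Result<PathBuf> {
    // `canonicalize` resolves `.`, `..`, and symlinks, and returns an error
    // if the path does not exist.
    let root = std::fs::canonicalize(path_arg)?;
    if !root.is_dir() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "project path is not a directory",
        ));
    }
    Ok(root)
}
```

With this, `codegraph index .` and `codegraph index /abs/path` would resolve to the same root when run from inside the project.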
### Multi-Codebase Workflow
**Index multiple projects (isolated storage):**
```bash
codegraph index /home/user/project-a # → /home/user/project-a/.codegraph/
codegraph index /home/user/project-b # → /home/user/project-b/.codegraph/
codegraph index /opt/services/api # → /opt/services/api/.codegraph/
```
**Serve different projects:**
```bash
# Terminal 1
cd /home/user/project-a && codegraph start stdio
# Terminal 2
cd /home/user/project-b && codegraph start stdio
```
**Claude Desktop multi-project config:**
```json
{
  "mcpServers": {
    "codegraph-api": {
      "command": "codegraph",
      "args": ["start", "stdio"],
      "cwd": "/opt/services/api"
    },
    "codegraph-frontend": {
      "command": "codegraph",
      "args": ["start", "stdio"],
      "cwd": "/opt/services/frontend"
    }
  }
}
```
### Directory Structure (Each Project Isolated)
```
project-a/
├── src/
├── .codegraph/          # Project A's index
│   ├── db/              # RocksDB (isolated)
│   ├── faiss.index      # FAISS index (isolated)
│   ├── faiss_ids.json
│   ├── shards/
│   │   ├── lang/
│   │   └── path/
│   └── index.json
project-b/
├── app/
├── .codegraph/          # Project B's index (completely separate)
│   ├── db/              # Different RocksDB instance
│   ├── faiss.index      # Different FAISS index
│   └── ...
```
## Protocol Compliance
### MCP Official SDK
- Uses official `rmcp` framework
- Proper tool schemas with `JsonSchema` derive
- Standard error codes (-32602 invalid params, -32603 internal error)
- STDIO transport (official MCP protocol)
### Error Handling
```rust
if query.is_empty() {
    return Err(McpError {
        code: -32602,
        message: "Query cannot be empty".to_string(),
        data: None,
    });
}
```
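
For readers who want to try the validation pattern in isolation, here is a self-contained sketch. The `McpError` struct below is a local stand-in that mirrors the fields in the snippet above, not the SDK's own type. `validate_query` and the `INVALID_PARAMS` constant are hypothetical names, and trimming whitespace before the emptiness check is an extension beyond what the commit shows.

```rust
/// Local stand-in mirroring the fields used in the snippet above.
/// This is NOT the `rmcp` error type.
#[derive(Debug, PartialEq)]
pub struct McpError {
    pub code: i32,
    pub message: String,
    pub data: Option<String>,
}

/// JSON-RPC 2.0 "Invalid params" error code.
pub const INVALID_PARAMS: i32 = -32602;

/// Hypothetical validator: rejects empty (or whitespace-only) queries.
pub fn validate_query(query: &str) -> Result<&str, McpError> {
    let trimmed = query.trim();
    if trimmed.is_empty() {
        return Err(McpError {
            code: INVALID_PARAMS,
            message: "Query cannot be empty".to_string(),
            data: None,
        });
    }
    Ok(trimmed)
}
```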
### Tool Registration
- `#[tool_router]` macro for automatic registration
- `#[tool(description = "...")]` for each tool
- `Parameters<T>` wrapper for type-safe parameter parsing
## Performance Features
**1. Read-Only Fallback**
- MCP server gracefully falls back to read-only mode
- Allows multiple servers to read the same index
- Avoids RocksDB lock conflicts
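
The fallback can be pictured as "try read-write, then retry read-only". The sketch below is generic over the store type, so it makes no claims about the RocksDB API. `open_with_fallback` and `OpenMode` are hypothetical names, and the assumption that the fallback is triggered by any read-write open failure (such as a held lock) is not confirmed by this commit.

```rust
/// How the store ended up being opened.
#[derive(Debug, PartialEq)]
pub enum OpenMode {
    ReadWrite,
    ReadOnly,
}

/// Try the read-write opener first; if it fails (for example because another
/// process holds the lock), fall back to the read-only opener.
pub fn open_with_fallback<T, E>(
    open_rw: impl FnOnce() -> Result<T, E>,
    open_ro: impl FnOnce() -> Result<T, E>,
) -> Result<(T, OpenMode), E> {
    match open_rw() {
        Ok(store) => Ok((store, OpenMode::ReadWrite)),
        // The read-write error is discarded here; a real server would likely log it.
        Err(_) => open_ro().map(|store| (store, OpenMode::ReadOnly)),
    }
}
```

Returning the mode alongside the store lets the server refuse write operations when it only holds a read-only handle.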
**2. Sharded FAISS Indexes**
- Language shards: Search only Rust/TypeScript/Python/etc
- Path shards: Search only src/lib/tests/etc
- Faster queries by filtering before FAISS search
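
One way to read "filtering before FAISS search" is as a routing step that picks the narrowest index file for a query. The sketch below is illustrative only: `SearchFilter`, `select_index`, the `<name>.index` file naming inside `shards/lang/` and `shards/path/`, and the rule that a language filter takes priority over a path filter are all assumptions, since the commit only names the two shard directories.

```rust
use std::path::{Path, PathBuf};

/// Optional filters a query may carry (hypothetical type).
#[derive(Debug, Default, Clone, Copy)]
pub struct SearchFilter<'a> {
    pub lang: Option<&'a str>,
    pub path_prefix: Option<&'a str>,
}

/// Pick the narrowest index file for a query.
/// Shard file names (`<name>.index`) are assumed, not taken from the commit.
pub fn select_index(project_root: &Path, filter: SearchFilter<'_>) -> PathBuf {
    let base = project_root.join(".codegraph");
    match (filter.lang, filter.path_prefix) {
        // Assumed priority: a language filter wins over a path filter.
        (Some(lang), _) => base.join("shards").join("lang").join(format!("{lang}.index")),
        (None, Some(prefix)) => base.join("shards").join("path").join(format!("{prefix}.index")),
        // No filter: fall back to the full project index.
        (None, None) => base.join("faiss.index"),
    }
}
```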
**3. Batch Embedding** (from Phase 1)
- 10-50x speedup from batched GPU processing
## Benefits
✅ **Multi-Codebase**: Each project has isolated `.codegraph/` storage
✅ **No Cross-Contamination**: Clear project boundaries
✅ **Scalability**: Run multiple MCP servers (one per project)
✅ **Protocol Compliant**: Uses official `rmcp` SDK
✅ **Performance**: Sharded indexes, batch processing, read-only mode
✅ **Developer Experience**: Simple `cd project && codegraph index .`
## Files Modified
- `crates/codegraph-mcp/src/indexer.rs`
  - Added `project_root` to `IndexerConfig` and `ProjectIndexer`
  - Updated all `.codegraph` paths to use `self.project_root`
  - Fixed metadata/embedding paths to be project-relative
- `crates/codegraph-mcp/src/bin/codegraph.rs`
  - Set `project_root` from path parameter in index command
  - Fixed clean command to use project-relative paths
  - Updated benchmark to use project-relative paths
- `MCP_IMPROVEMENTS.md`
  - Comprehensive documentation of multi-codebase architecture
  - Protocol compliance details
  - Migration guide and testing scenarios
## Migration
**Existing users**: Re-index to ensure proper structure
```bash
cd /path/to/project && codegraph index . --force
```
**New users**: Just works automatically
```bash
cd /path/to/project && codegraph index .
```
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

1 parent 5a21b75 commit 8a3d44d
3 files changed: +678 -9 lines changed