Commit 45ede99

Merge pull request #46 from Jakedismo/claude/add-jina-embeddings-provider-011CUrSj3ypAwj1HJ88ibMcC
Add Jina as selectable embeddings provider
2 parents 4e92c73 + df09235 commit 45ede99

18 files changed: +3438 -12 lines changed

README.md

Lines changed: 11 additions & 3 deletions
@@ -323,10 +323,18 @@ cargo build --release --bin codegraph-setup --features all-cloud-providers
 
 ### Manual Configuration
 
+**Configuration directory: `~/.codegraph/`**
+
+All configuration files are stored in `~/.codegraph/` in TOML format.
+
 Configuration is loaded from (in order):
-1. `./.codegraph.toml` (project-specific)
-2. `~/.codegraph/config.toml` (global)
-3. Environment variables
+1. `~/.codegraph/default.toml` (base configuration)
+2. `~/.codegraph/{environment}.toml` (e.g., development.toml, production.toml)
+3. `~/.codegraph/local.toml` (local overrides, machine-specific)
+4. `./config/` (fallback for backward compatibility)
+5. Environment variables (CODEGRAPH__* prefix)
+
+**See [Configuration Guide](docs/CONFIGURATION_GUIDE.md) for complete documentation.**
 
 **Full configuration example:**
 ```toml

config/README.md

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
+# CodeGraph Configuration Files
+
+## Important: Configuration Directory Migration
+
+**As of the latest version, CodeGraph uses `~/.codegraph` as the primary configuration directory.**
+
+### New Location: `~/.codegraph`
+
+All user-level configuration files should now be placed in:
+
+```
+~/.codegraph/
+```
+
+This provides a centralized, uniform location for all CodeGraph configuration across your system.
+
+### Why the Change?
+
+- **Centralized**: All CodeGraph configs in one place, regardless of project
+- **User-level**: Configurations follow you across different projects
+- **Standard practice**: Follows Unix/Linux convention for user configuration
+- **Cleaner projects**: Keeps project directories focused on code
+
+### Migration
+
+To migrate your existing configurations:
+
+```bash
+# Create the directory
+mkdir -p ~/.codegraph
+
+# Copy existing configs
+cp config/*.toml ~/.codegraph/
+
+# Or symlink instead (only if ~/.codegraph does not exist yet; keeps backward compatibility)
+ln -s "$(pwd)/config" ~/.codegraph
+```
+
+### Backward Compatibility
+
+CodeGraph maintains backward compatibility by checking directories in this order:
+
+1. **`~/.codegraph/`** (Primary)
+2. **`./config/`** (This directory - fallback)
+3. **Current directory** (last resort)
+
+If `~/.codegraph` exists, it will be used. Otherwise, CodeGraph falls back to `./config/`.
+
+## Configuration Files in This Directory
+
+This directory contains **example configuration files** that can be copied to `~/.codegraph/`:
+
+- `default.toml` - Base configuration example
+- `surrealdb_example.toml` - SurrealDB configuration
+- `example_embedding.toml` - Embedding provider configuration
+- `example_performance.toml` - Performance tuning
+- `production.toml` - Production settings example
+
+## Quick Start
+
+### 1. Initialize User Config
+
+```bash
+# Create ~/.codegraph with default configs
+mkdir -p ~/.codegraph
+cp config/default.toml ~/.codegraph/
+```
+
+### 2. Customize Your Configuration
+
+```bash
+# Edit your user config
+nano ~/.codegraph/default.toml
+
+# Or create environment-specific configs
+cp ~/.codegraph/default.toml ~/.codegraph/development.toml
+cp ~/.codegraph/default.toml ~/.codegraph/production.toml
+```
+
+### 3. Set Environment
+
+```bash
+export APP_ENV=development  # Loads ~/.codegraph/development.toml
+# or
+export APP_ENV=production   # Loads ~/.codegraph/production.toml
+```
+
+## Environment Variables
+
+Override any config value using environment variables:
+
+```bash
+# Database backend
+export CODEGRAPH__DATABASE__BACKEND=surrealdb
+
+# SurrealDB connection
+export CODEGRAPH__DATABASE__SURREALDB__CONNECTION=ws://localhost:8000
+
+# Server port
+export CODEGRAPH__SERVER__PORT=8080
+```
+
+## Further Documentation
+
+- **[Full Configuration Guide](../docs/CONFIGURATION_GUIDE.md)** - Complete configuration documentation
+- **[SurrealDB Guide](../docs/SURREALDB_GUIDE.md)** - SurrealDB-specific configuration
+- **[Environment Variables](../docs/CONFIGURATION_GUIDE.md#environment-variables)** - Full list of env vars
+
+## Development
+
+For development, you can continue using this `./config` directory, but we recommend migrating to `~/.codegraph` for consistency:
+
+```bash
+# Option 1: Copy to ~/.codegraph
+mkdir -p ~/.codegraph && cp config/*.toml ~/.codegraph/
+
+# Option 2: Symlink (for active development; requires that ~/.codegraph does not exist yet)
+ln -s "$(pwd)/config" ~/.codegraph
+
+# Option 3: Keep using ./config (backward compatible)
+# CodeGraph will use ./config if ~/.codegraph doesn't exist
+```
+
+## Need Help?
+
+- Check the [Configuration Guide](../docs/CONFIGURATION_GUIDE.md)
+- See example configs in this directory
+- Use `CODEGRAPH__LOGGING__LEVEL=debug` to see which config directory is being used
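
Review note: the directory fallback and file layering that this README describes can be sketched in a few lines. The following is a minimal illustration, not CodeGraph's actual loader. The function names `resolve_config_dir` and `candidate_files` are hypothetical, and the sketch assumes that later files override earlier ones and that missing layer files are simply skipped, neither of which the diff confirms.

```python
from pathlib import Path


def resolve_config_dir(home: Path, cwd: Path) -> Path:
    """Pick the config directory: ~/.codegraph, then ./config, then the current directory."""
    primary = home / ".codegraph"
    if primary.is_dir():  # also true for a symlink that points at a directory
        return primary
    fallback = cwd / "config"
    if fallback.is_dir():
        return fallback
    return cwd  # "last resort" in the README's list


def candidate_files(config_dir: Path, environment: str) -> list[Path]:
    """Return the layer files that exist, in load order.

    Assumption: later layers override earlier ones and absent files are skipped.
    """
    names = ["default.toml", f"{environment}.toml", "local.toml"]
    return [config_dir / name for name in names if (config_dir / name).is_file()]
```

With `APP_ENV=development`, a caller would pass `"development"` as `environment`. Environment-variable overrides would be applied after these files are merged.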

config/default.toml

Lines changed: 21 additions & 1 deletion
@@ -4,10 +4,30 @@ env = "development"
 host = "0.0.0.0"
 port = 3000
 
-[rocksdb]
+# Database configuration
+[database]
+# Backend options: "rocksdb" (default), "surrealdb"
+backend = "rocksdb"
+
+[database.rocksdb]
 path = "data/graph.db"
 read_only = false
 
+[database.surrealdb]
+# Example SurrealDB configuration (uncomment to use)
+# connection = "ws://localhost:8000"  # Standard SurrealDB WebSocket connection
+# namespace = "codegraph"
+# database = "graph"
+# auto_migrate = true
+# strict_mode = false
+# Alternative: file://data/surrealdb/graph.db for embedded mode
+
+# Deprecated: Legacy rocksdb configuration (use database.rocksdb instead)
+# This is kept for backward compatibility
+# [rocksdb]
+# path = "data/graph.db"
+# read_only = false
+
 [vector]
 dimension = 384
 index = "ivf_flat"
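
Review note: the diff says the legacy top-level `[rocksdb]` table is kept for backward compatibility but does not show how it is reconciled with `[database.rocksdb]`. One possible approach is sketched below on an already-parsed config dictionary. The helper `normalize_legacy_rocksdb` is hypothetical, and the rule that values under `database.rocksdb` win over legacy ones is an assumption, not documented behavior.

```python
import copy


def normalize_legacy_rocksdb(config: dict) -> dict:
    """Return a copy in which a legacy top-level [rocksdb] table lives under [database.rocksdb]."""
    result = copy.deepcopy(config)
    legacy = result.pop("rocksdb", None)
    database = result.setdefault("database", {})
    # "rocksdb" is the default backend according to the comment in default.toml.
    database.setdefault("backend", "rocksdb")
    if legacy is not None:
        # Assumption: keys already present in the new location take precedence.
        merged = dict(legacy)
        merged.update(database.get("rocksdb", {}))
        database["rocksdb"] = merged
    return result
```

The input is left unmodified, so a caller can still log the original file contents when warning about the deprecated layout.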

config/example_embedding.toml

Lines changed: 20 additions & 0 deletions
@@ -14,6 +14,26 @@ api_base = "https://api.openai.com/v1"
 max_retries = 3
 timeout_secs = 30
 
+# Example Jina configuration (uncomment to use)
+# [embedding]
+# provider = "jina"
+# dimension = 1024
+# cache_enabled = true
+# cache_ttl_secs = 3600
+# normalize_embeddings = true
+#
+# [embedding.jina]
+# model = "jina-embeddings-v4"
+# api_key_env = "JINA_API_KEY"
+# api_base = "https://api.jina.ai/v1"
+# max_retries = 3
+# timeout_secs = 30
+# task = "code.query"
+# late_chunking = true
+# enable_reranking = true
+# reranking_model = "jina-reranker-v3"
+# reranking_top_n = 10
+
 [performance]
 mode = "balanced"
 
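
Review note: `api_key_env` stores the name of an environment variable rather than the key itself, so the secret stays out of the TOML file. A minimal sketch of that indirection follows. The helper `resolve_api_key` and its error handling are hypothetical, and the diff does not show how CodeGraph actually reads the key.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(provider_cfg: Mapping[str, str],
                    environ: Optional[Mapping[str, str]] = None) -> str:
    """Look up the API key named by the provider table's `api_key_env` entry."""
    env = os.environ if environ is None else environ
    var_name = provider_cfg.get("api_key_env")
    if not var_name:
        raise KeyError("provider config has no 'api_key_env' entry")
    key = env.get(var_name, "")
    if not key:
        # Fail early with the variable name so the user knows what to export.
        raise RuntimeError(f"environment variable {var_name} is not set or empty")
    return key
```

For the commented-out Jina table above, `provider_cfg` would be the parsed `[embedding.jina]` table, and the user would run `export JINA_API_KEY=...` before starting the server.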

config/surrealdb_example.toml

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+# CodeGraph Configuration with SurrealDB
+
+[database]
+# Select SurrealDB as the database backend
+backend = "surrealdb"
+
+# RocksDB configuration (not used when backend is surrealdb, but kept for backward compatibility)
+[database.rocksdb]
+path = "data/graph.db"
+read_only = false
+
+# SurrealDB configuration
+[database.surrealdb]
+# Connection string options:
+# - WebSocket (default): "ws://localhost:8000"
+# - Local file: "file://data/surrealdb/graph.db"
+# - Memory (testing): "mem://"
+# - Remote HTTP: "http://localhost:8000"
+# - Remote HTTPS: "https://example.com:8000"
+connection = "ws://localhost:8000"
+
+# Namespace for multi-tenancy (default: "codegraph")
+namespace = "codegraph"
+
+# Database name (default: "graph")
+database = "graph"
+
+# Optional: Authentication credentials
+# username = "root"
+# password = "root"  # Can also be set via CODEGRAPH__DATABASE__SURREALDB__PASSWORD env var
+
+# Enable strict schema validation (default: false)
+# When true, SurrealDB will enforce the defined schema strictly
+# When false, allows for schema flexibility and easier migrations
+strict_mode = false
+
+# Auto-apply migrations on startup (default: true)
+# When true, automatically runs pending migrations when connecting
+auto_migrate = true
+
+# Example: Remote SurrealDB server configuration
+# [database.surrealdb]
+# connection = "https://your-surrealdb-server.com:8000"
+# namespace = "production"
+# database = "codegraph"
+# username = "admin"
+# # Password should be set via environment variable:
+# # CODEGRAPH__DATABASE__SURREALDB__PASSWORD=your_password
+# strict_mode = true
+# auto_migrate = false
+
+[server]
+host = "0.0.0.0"
+port = 3000
+
+[vector]
+dimension = 1024
+
+[logging]
+level = "info"
+
+[security]
+require_auth = false
+rate_limit_per_minute = 1200
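
Review note: the `CODEGRAPH__DATABASE__SURREALDB__PASSWORD` override mentioned above appears to use a double underscore as the nesting separator. That reading is inferred from the examples in this commit and is not stated in the diff. Under that assumption, the mapping from variable name to nested key can be sketched as follows. The function `apply_env_overrides` is hypothetical, and values are kept as strings because type coercion is out of scope here.

```python
import copy


def apply_env_overrides(config: dict, environ: dict, prefix: str = "CODEGRAPH__") -> dict:
    """Overlay PREFIX-ed environment variables onto a nested config dictionary.

    CODEGRAPH__DATABASE__SURREALDB__PASSWORD -> config["database"]["surrealdb"]["password"]
    """
    result = copy.deepcopy(config)
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        # Split on the assumed "__" separator and lowercase each segment.
        path = [part.lower() for part in name[len(prefix):].split("__") if part]
        if not path:
            continue
        node = result
        for key in path[:-1]:
            child = node.get(key)
            if not isinstance(child, dict):
                child = {}
                node[key] = child
            node = child
        node[path[-1]] = value  # stays a string in this sketch
    return result
```

Keeping the password out of the TOML file and supplying it this way matches the recommendation in the remote-server example above.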
