Commit 623838b
feat: Migrate to OpenAI Responses API with full reasoning model support
BREAKING CHANGE: OpenAI provider now uses Responses API (/v1/responses) instead
of Chat Completions API. This is required for reasoning models (o1, o3, o4-mini).
## Major Changes
### Responses API Migration
- **OpenAI Provider**: Completely rewritten to use `/v1/responses` endpoint
- **OpenAI-Compatible Provider**: Supports both Responses API and Chat Completions API with automatic fallback
- **Request Format**: Changed from `messages` array to `input` string + `instructions`
- **Response Format**: Changed from `choices[0].message.content` to `output_text` (see the payload sketch below)
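To make the shape change concrete, here is a minimal sketch of the two payloads (prompt strings and token limits are illustrative, not taken from the provider code):

```rust
use serde_json::json;

fn main() {
    // Chat Completions (old): conversation turns in a `messages` array.
    let old_body = json!({
        "model": "gpt-4o",
        "messages": [
            { "role": "system", "content": "You are a code assistant." },
            { "role": "user", "content": "Summarize this function." }
        ],
        "max_tokens": 4096
    });

    // Responses API (new): a flat `input` string plus `instructions`.
    let new_body = json!({
        "model": "gpt-4o",
        "instructions": "You are a code assistant.",
        "input": "Summarize this function.",
        "max_output_tokens": 4096
    });

    println!("{old_body}\n{new_body}");
}
```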
### Reasoning Model Support
Added full support for OpenAI's reasoning models (o1, o3, o4-mini, GPT-5):
1. **Reasoning Effort Parameter**: Control thinking depth with "minimal", "low", "medium", "high"
- `minimal`: Fast, basic reasoning (GPT-5 only)
- `low`: Quick responses with light reasoning
- `medium`: Balanced reasoning (recommended)
- `high`: Deep reasoning for complex problems
2. **max_output_tokens Parameter**: New token limit parameter for Responses API
- Replaces `max_tokens` for reasoning models
- If unset, falls back to `max_tokens` for backward compatibility
3. **Automatic Model Detection**: OpenAI provider detects reasoning models (see the sketch after this list) and:
- Disables temperature/top_p (not supported by reasoning models)
- Enables reasoning_effort parameter
- Uses proper token parameter names
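A minimal sketch of the detection and gating (function names and the prefix list are illustrative; the actual logic lives in `openai_llm_provider.rs`):

```rust
use serde_json::{json, Value};

/// Illustrative check: reasoning model families matched by name prefix.
fn is_reasoning_model(model: &str) -> bool {
    ["o1", "o3", "o4", "gpt-5"]
        .iter()
        .any(|prefix| model.starts_with(prefix))
}

/// Gate sampling parameters by model family.
fn apply_sampling(body: &mut Value, model: &str, temperature: f32, effort: Option<&str>) {
    if is_reasoning_model(model) {
        // Reasoning models reject temperature/top_p; send reasoning_effort instead.
        if let Some(e) = effort {
            body["reasoning_effort"] = e.into();
        }
    } else {
        body["temperature"] = temperature.into();
    }
}

fn main() {
    let mut body = json!({ "model": "o3-mini", "input": "hello" });
    apply_sampling(&mut body, "o3-mini", 0.7, Some("medium"));
    assert_eq!(body["reasoning_effort"], "medium");
    assert!(body.get("temperature").is_none());
}
```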
### Configuration Updates
**GenerationConfig** (crates/codegraph-ai/src/llm_provider.rs):
```rust
pub struct GenerationConfig {
    pub temperature: f32,                  // Not supported by reasoning models
    pub max_tokens: Option<usize>,         // Legacy parameter
    pub max_output_tokens: Option<usize>,  // NEW: for the Responses API
    pub reasoning_effort: Option<String>,  // NEW: for reasoning models
    pub top_p: Option<f32>,                // Not supported by reasoning models
    // ...
}
```
**LLMConfig** (crates/codegraph-core/src/config_manager.rs):
```rust
pub struct LLMConfig {
    pub max_tokens: usize,                 // Legacy
    pub max_output_tokens: Option<usize>,  // NEW
    pub reasoning_effort: Option<String>,  // NEW
    // ...
}
```
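For illustration, a reasoning-model configuration built against the struct above might look like this (hypothetical values; `..Default::default()` assumes `GenerationConfig` derives `Default` for the elided fields):

```rust
let config = GenerationConfig {
    temperature: 0.0,                             // ignored for reasoning models
    max_tokens: None,                             // legacy field left unset
    max_output_tokens: Some(25_000),              // Responses API limit
    reasoning_effort: Some("medium".to_string()),
    top_p: None,
    ..Default::default()                          // assumed Default derive
};
```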
### Provider Implementations
**OpenAI Provider** (crates/codegraph-ai/src/openai_llm_provider.rs):
- Uses `/v1/responses` endpoint exclusively
- Automatic reasoning model detection
- Proper parameter handling based on model type
- Request: `{ model, input, instructions, max_output_tokens, reasoning_effort }`
- Response: `{ output_text, usage: { prompt_tokens, output_tokens, reasoning_tokens } }` (type sketch below)
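A hedged sketch of serde types matching those shapes (type names are illustrative; `reasoning_tokens` is optional since only reasoning models report it):

```rust
use serde::{Deserialize, Serialize};

// Illustrative wire types for the shapes listed above.
#[derive(Serialize)]
struct ResponsesRequest {
    model: String,
    input: String,
    instructions: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    max_output_tokens: Option<usize>,
    #[serde(skip_serializing_if = "Option::is_none")]
    reasoning_effort: Option<String>,
}

#[derive(Deserialize)]
struct Usage {
    prompt_tokens: usize,
    output_tokens: usize,
    reasoning_tokens: Option<usize>, // reported by reasoning models only
}

#[derive(Deserialize)]
struct ResponsesReply {
    output_text: String,
    usage: Usage,
}
```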
**OpenAI-Compatible Provider** (crates/codegraph-ai/src/openai_compatible_provider.rs):
- Defaults to Responses API (`use_responses_api: true`)
- Falls back to the Chat Completions API for compatibility (see the fallback sketch after this list)
- Supports both `max_output_tokens` and `max_completion_tokens`
- Works with LM Studio, Ollama v1 endpoint, and custom APIs
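The fallback can be as simple as retrying the legacy endpoint when the server does not know `/v1/responses`. A hedged sketch with `reqwest` (a real implementation would also translate the payload between the two formats):

```rust
use reqwest::{Client, Response, StatusCode};

// Illustrative fallback: try the Responses API first, retry the legacy
// endpoint on a 404 (endpoint choice only; payload translation omitted).
async fn post_completion(
    client: &Client,
    base: &str,
    body: &serde_json::Value,
) -> Result<Response, reqwest::Error> {
    let resp = client
        .post(format!("{base}/v1/responses"))
        .json(body)
        .send()
        .await?;
    if resp.status() == StatusCode::NOT_FOUND {
        // Server predates the Responses API: fall back to Chat Completions.
        return client
            .post(format!("{base}/v1/chat/completions"))
            .json(body)
            .send()
            .await;
    }
    Ok(resp)
}
```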
### Documentation Updates
**docs/CLOUD_PROVIDERS.md**:
- Added "Responses API & Reasoning Models" section
- Detailed explanation of API format differences
- Configuration examples for reasoning models
- Reasoning effort level descriptions
- Migration guide from Chat Completions API
**.codegraph.toml.example**:
- Added `max_output_tokens` parameter with documentation
- Added `reasoning_effort` parameter with options
- Clarified which parameters apply to reasoning vs standard models
### Backward Compatibility
- OpenAI-compatible provider can fall back to Chat Completions API
- When `max_output_tokens` is unset, the provider falls back to `max_tokens` (sketched below)
- Configuration with only `max_tokens` continues to work
- Standard models (gpt-4o, gpt-4-turbo) work as before
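The token-limit fallback reduces to a one-liner; a sketch (helper name is hypothetical):

```rust
/// Prefer the new Responses API limit, fall back to the legacy one.
fn effective_output_limit(
    max_output_tokens: Option<usize>,
    max_tokens: Option<usize>,
) -> Option<usize> {
    max_output_tokens.or(max_tokens)
}

fn main() {
    // Old configs that only set max_tokens keep working.
    assert_eq!(effective_output_limit(None, Some(4096)), Some(4096));
    // The new field wins when both are present.
    assert_eq!(effective_output_limit(Some(25_000), Some(4096)), Some(25_000));
}
```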
### Testing
Added tests for:
- Reasoning model detection (o1, o3, o4, gpt-5; test sketch below)
- Standard model detection (gpt-4o, gpt-4-turbo)
- OpenAI-compatible provider configuration
- Both API format support
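The detection tests could take this shape (hypothetical sketch, assuming the `is_reasoning_model` helper sketched earlier):

```rust
#[cfg(test)]
mod tests {
    use super::is_reasoning_model; // assumed helper, see sketch above

    #[test]
    fn detects_reasoning_models() {
        for m in ["o1-preview", "o3-mini", "o4-mini", "gpt-5"] {
            assert!(is_reasoning_model(m), "{m} should be a reasoning model");
        }
    }

    #[test]
    fn standard_models_unaffected() {
        for m in ["gpt-4o", "gpt-4-turbo"] {
            assert!(!is_reasoning_model(m), "{m} should be standard");
        }
    }
}
```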
## Migration Guide
### For OpenAI Users
**Before (Chat Completions API)**:
```toml
[llm]
provider = "openai"
model = "gpt-4o"
max_tokens = 4096
```
**After (Responses API)**: the old config still works, but prefer the new parameter:
```toml
[llm]
provider = "openai"
model = "gpt-4o"
max_output_tokens = 4096 # Preferred for Responses API
```
**For Reasoning Models**:
```toml
[llm]
provider = "openai"
model = "o3-mini"
max_output_tokens = 25000
reasoning_effort = "medium" # NEW: Control reasoning depth
# Note: temperature/top_p ignored for reasoning models
```
### For OpenAI-Compatible Users
No changes are required: the provider automatically uses the Responses API when available and falls back to the Chat Completions API otherwise.
To force Chat Completions API (e.g., for older systems):
```rust
let config = OpenAICompatibleConfig {
    use_responses_api: false, // force the legacy Chat Completions API
    ..Default::default()      // remaining fields elided here
};
```
## Why This Change?
1. **Future-Proof**: Responses API is OpenAI's modern standard
2. **Reasoning Models**: Required for o1, o3, o4-mini support
3. **Better Features**: More granular control over model behavior
4. **Token Tracking**: Separate tracking of reasoning tokens
5. **Performance**: Optimized for latest models
## Files Modified
- `crates/codegraph-ai/src/llm_provider.rs`: Added reasoning parameters to GenerationConfig
- `crates/codegraph-ai/src/openai_llm_provider.rs`: Complete rewrite for Responses API
- `crates/codegraph-ai/src/openai_compatible_provider.rs`: Dual API support
- `crates/codegraph-core/src/config_manager.rs`: Added reasoning config fields
- `.codegraph.toml.example`: Documented new parameters
- `docs/CLOUD_PROVIDERS.md`: Comprehensive Responses API documentation
## References
- OpenAI Responses API: https://platform.openai.com/docs/api-reference/responses
- Reasoning Models: https://platform.openai.com/docs/guides/reasoning
- Azure OpenAI Reasoning: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/reasoning
6 files changed: +387, -113 lines