CodeWithKyrian commented Dec 6, 2025

This PR builds on the existing Chroma v2 API migration (from #14) by adding new v2 API endpoints, overhauling error handling, and introducing architectural improvements and new features. The package gains a better developer experience, stronger type safety, and additional functionality while maintaining backward compatibility where possible.

Motivation and Context

Chroma has released version 1.0 with a new v2 API that provides a Rust backend and a more consistent approach to vector database operations. The previous package version (0.4.0) was built against Chroma 0.4.0 and the v1 API, which had inconsistencies in error handling and response formats.

A previous PR already migrated the package to support Chroma v1.0 and the v2 API. This PR extends that work by:

  • Adding support for new v2 API endpoints and features (collection forking, tenant/database management improvements)
  • Completely overhauling error handling to leverage Chroma v2 API's consistent error response format
  • Modernizing the HTTP client layer using PSR-18/PSR-17 standards for better interoperability
  • Simplifying the API surface by removing unnecessary abstraction layers (CollectionResource)
  • Adding support for Chroma Cloud with dedicated connection helpers
  • Introducing type-safe query builders and Record objects for better developer experience
  • Enhancing validation and error messages throughout the codebase

What's Changed

API Client Enhancements

  • Enhanced the existing Api class with new v2 API endpoints (collection forking, improved tenant/database management)
  • Migrated from GuzzleHttp to PSR-18/PSR-17 with HTTP Discovery for automatic client detection
  • Removed hard dependency on GuzzleHttp (now optional dev dependency)
  • Added generic header support via withHeader() and withHeaders() methods
  • Completely overhauled error handling with consistent exception mapping based on Chroma v2 API's standardized error format

Collection Model Simplification

  • Removed CollectionResource wrapper class - Collection is now the direct resource model
  • Collections returned from listCollections() can be used immediately without conversion
  • Added setEmbeddingFunction() method for runtime embedding function assignment
  • Enhanced collection methods with better validation and error messages

New Features

  • Collection Forking: Added support for forking collections (Chroma Cloud only)
  • Record Objects: Introduced Record and ScoredRecord types for structured data handling
    • Support for fluent API: Record::make('id')->withDocument('text')->withMetadata(['key' => 'value'])
    • add(), update(), and upsert() now accept arrays of Record objects
    • get() and query() responses can be converted to records via asRecords()
  • Query Builder: Added Where query builder for type-safe filtering
    • Metadata filtering: Where::field('category')->eq('news')
    • Document filtering: Where::document()->contains('text')
    • Logical operators: Where::all(), Where::any()
  • Includes Enum: Type-safe field inclusion for responses
  • Mixed Embeddings: Support for partial embeddings with automatic batch generation of missing ones
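The mixed-embeddings behaviour described above can be sketched roughly as follows. This is a minimal standalone illustration, not the package's actual code: the function name `fillMissingEmbeddings` and the toy embedder are hypothetical, and the only assumption taken from this PR is that `null` entries are generated in one batch while the original order is preserved.

```php
<?php

declare(strict_types=1);

/**
 * Fill null embedding slots with one batched embedder call, keeping item order.
 * Hypothetical helper for illustration only; not part of the package.
 *
 * @param array<int, list<float>|null> $embeddings One slot per item; null means "missing".
 * @param list<string> $documents Documents aligned with $embeddings.
 * @param callable(list<string>): list<list<float>> $embed Stand-in batch embedding function.
 * @return list<list<float>>
 */
function fillMissingEmbeddings(array $embeddings, array $documents, callable $embed): array
{
    if (count($embeddings) !== count($documents)) {
        throw new InvalidArgumentException('Embeddings and documents must have the same length.');
    }

    // Normalise to zero-based lists so positions line up.
    $embeddings = array_values($embeddings);
    $documents = array_values($documents);

    // Positions whose embedding still has to be generated.
    $missing = array_keys($embeddings, null, true);
    if ($missing === []) {
        return $embeddings;
    }

    // One batched call covering every missing item.
    $texts = array_map(fn (int $i): string => $documents[$i], $missing);
    $generated = $embed($texts);
    if (count($generated) !== count($missing)) {
        throw new RuntimeException('Embedder returned an unexpected number of vectors.');
    }

    // Write each generated vector back into its original position.
    foreach ($missing as $k => $i) {
        $embeddings[$i] = $generated[$k];
    }

    return $embeddings;
}

// Toy embedder: one-dimensional vector holding the string length.
$toyEmbed = fn (array $texts): array => array_map(
    fn (string $t): array => [(float) strlen($t)],
    $texts
);

$result = fillMissingEmbeddings(
    [[0.1], null, [0.3], null],
    ['a', 'bb', 'ccc', 'dddd'],
    $toyEmbed
);
```

Here the embedder is called once with `['bb', 'dddd']`, and the supplied vectors for the first and third items are left untouched.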

Connection Improvements

  • Added ChromaDB::local() helper for local/self-hosted instances
  • Added ChromaDB::cloud() helper for Chroma Cloud connections
  • Deprecated ChromaDB::client() in favor of explicit local() or factory() usage
  • Moved reset() from static method to instance method: $client->reset()
  • Deprecated withAuthToken() in favor of the generic withHeader('X-Chroma-Token', $token)

Exception Handling

  • Moved exceptions from Generated/Exceptions/ to Exceptions/ namespace
  • Removed "Chroma" prefix from exception class names for cleaner API
  • Improved exception creation with ChromaException::create() factory method
  • Better error parsing from Chroma's consistent v2 API error format
  • Added ConnectionException for network-level errors
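As a rough illustration of the status-code mapping behind the factory, the sketch below uses stand-in classes that reuse the exception names from this PR. The `create()` signature, the 404 mapping, and the assumption that the error body carries a `message` field are all hypothetical; only the 409 to `UniqueConstraintException` mapping is stated in this PR.

```php
<?php

declare(strict_types=1);

// Stand-in classes for illustration; the package's real classes may differ.
class ChromaException extends RuntimeException
{
    /**
     * Build the most specific exception for an HTTP error response.
     * Assumed signature: status code plus raw JSON body.
     */
    public static function create(int $status, string $body): self
    {
        // Assumption: the error body is JSON with a string "message" field.
        $payload = json_decode($body, true);
        $message = is_array($payload) && is_string($payload['message'] ?? null)
            ? $payload['message']
            : 'Unexpected error response';

        return match ($status) {
            404 => new NotFoundException($message, $status),         // assumed mapping
            409 => new UniqueConstraintException($message, $status), // mapping named in this PR
            default => new self($message, $status),
        };
    }
}

class NotFoundException extends ChromaException
{
}

class UniqueConstraintException extends ChromaException
{
}

// Usage: a 409 response becomes a UniqueConstraintException.
$error = ChromaException::create(409, '{"message": "Collection already exists"}');
```

Callers can then catch the specific subclass they care about, or `ChromaException` as a catch-all.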

Testing Infrastructure

  • Reorganized tests into Feature/ directory structure
  • Added Chroma CLI integration for local server management in tests
  • Updated GitHub Actions to use Chroma CLI instead of Docker services
  • Expanded test coverage with comprehensive negative test scenarios
  • Added dedicated test fixtures and configuration

Documentation and Examples

  • Completely rewrote README with modern examples and clearer structure
  • Added new document-chunking-cloud example demonstrating real-world usage
  • Reorganized examples into structured folders
  • Enhanced inline documentation with PHPDoc improvements

Code Organization

  • Further organized request/response models in dedicated Requests/ and Responses/ folders
  • Consolidated model classes under Models/ namespace
  • Added Types/ namespace for value objects (Record, ScoredRecord, Includes)
  • Added Query/ namespace for query builder classes

Embedding Functions

  • Refactored all embedding functions to use PSR-18 HTTP client
  • Improved error handling and response parsing
  • Maintained backward compatibility for existing embedding function implementations

Breaking Changes

API Client

  • Removed: ChromaApiClient class - replaced with Api class
  • Changed: HTTP client must implement PSR-18 ClientInterface (GuzzleHttp still works via discovery)
  • Removed: Factory::withHttpClient() method - HTTP client is auto-discovered

Connection Methods

  • Deprecated: ChromaDB::client() - use ChromaDB::local()->connect() or ChromaDB::factory()->connect() instead
  • Removed: ChromaDB::reset() static method - use $client->reset() instance method instead
  • Deprecated: Factory::withAuthToken() - use Factory::withHeader('X-Chroma-Token', $token) instead

Collection Resource

  • Removed: CollectionResource class - Collection is now the direct resource
  • Changed: listCollections() now returns Collection[] instead of CollectionResource[]
  • Changed: Collections no longer require an embedding function at creation; one can be assigned later via setEmbeddingFunction()

Request/Response Models

  • Removed: All classes from Generated/Requests/ and Generated/Responses/ namespaces
  • Changed: Request classes moved to Requests/ namespace with updated method signatures
  • Removed: Image parameters from all item-related requests (they were never fully supported)

Exceptions

  • Changed: Exception class names no longer have "Chroma" prefix
    • ChromaNotFoundException → NotFoundException
    • ChromaConnectionException → ConnectionException
    • ChromaValueException → ValueException
    • And similar for all other exceptions
  • Changed: Exception namespace moved from Generated\Exceptions to Exceptions
  • Changed: Exception creation now uses ChromaException::create() factory instead of throwSpecific()

Dependencies

  • Changed: guzzlehttp/guzzle moved from require to require-dev
  • Added: psr/http-client, psr/http-factory, and php-http/discovery as required dependencies

Closes #12

…ures

- Added methods for tenant management: create, update, and retrieve tenants.
- Introduced new endpoints for collection management: create, update, delete, and query collections.
- Updated existing methods to improve clarity and functionality, including health checks and user identity retrieval.
- Enhanced request classes with detailed parameter documentation for better usability.
- Refactored existing methods to align with new naming conventions for consistency.
- Replaces hard Guzzle dependency with PSR-18 Client and PSR-17 Factory discovery.
- Refactors `Factory` and `Api` to use `ClientInterface` and `RequestFactoryInterface`.
- Replaces specific `authToken` handling with generic `headers` support in `Factory` and `Api`.
- Deprecates `Factory::withAuthToken()` in favor of `withHeader()`.
- Updates all embedding functions to use the discovered PSR-18 client.

BREAKING CHANGE: `Factory::withHttpClient` has been removed.
- Adds `ChromaDB::local()` and `ChromaDB::cloud()` helper methods for easier connection configuration.
- Deprecates `ChromaDB::client()` in favor of `ChromaDB::local()->connect()` or `ChromaDB::factory()->connect()`.

BREAKING CHANGE: `ChromaDB::reset()` has been removed. Use `$client->reset()` instead.
…environment variables from ChromaServer process.
- Simplify `Api::handleErrorResponse` to align with Chroma v2 API consistency, utilizing direct JSON decoding and status code mapping (e.g., mapping 409 to `UniqueConstraintException`).
- Rename exception classes to remove redundant prefixes (e.g., `ChromaConnectionException` → `ConnectionException`).
- Standardize exception instantiation via the `ChromaException::create()` factory.
- Implement robust client-side validation in Collection methods for embeddings, `nResults`, IDs, and metadata.
- Update `ApiTest`, `CollectionTest`, and `QueryFilteringTest` with comprehensive negative test scenarios.
…or parameters optional with null defaults, simplifying test calls.
Note: Collection forking is only supported for Chroma Cloud, not local Chroma instances.
- Extract embedding generation into dedicated prepareEmbeddings method
- Support mixed embeddings arrays where some items have embeddings and
  others are null, generating missing ones in batch while maintaining order
- Separate concerns: prepareEmbeddings handles generation only, validate
  handles all validation logic
…ramsey/composer-install@v3, streamline PHP setup, and remove caching steps
CodeWithKyrian merged commit a27f415 into main Dec 6, 2025
16 checks passed
CodeWithKyrian deleted the feat/chroma-v1-support branch December 6, 2025 20:09

Development

Successfully merging this pull request may close these issues.

The v1 API is deprecated. Please use /v2 apis