fix(vertex_ai): improve passthrough endpoint url parsing and construction (#17402) #17526
Conversation
Force-pushed the branch from cb2c137 to 8a4f674
Hi @krrishdholakia 👋

This PR fixes an issue where LiteLLM does not correctly load `vertex_project` and `vertex_location` for Vertex AI passthrough when using the google-genai Python SDK. If the user does not explicitly provide `vertex_project` and `vertex_location`, the SDK request will not contain these values. In that case, LiteLLM should fall back to the values configured in the YAML:

```yaml
use_in_pass_through: true
vertex_project: ...
vertex_location: ...
```

The current behavior ignores these config values, resulting in malformed passthrough URLs. This PR ensures LiteLLM correctly loads the configured project and location so that Vertex AI passthrough works as documented. We rely on Vertex passthrough in production, so it would be great if this could be reviewed or assigned to another maintainer. Thanks again for the support!
vertex_location=vertex_location,
)

if vertex_project is None or vertex_location is None:
We should make sure the user has access to the model before allowing the request to go through.

Can you add the extraction logic here:

model = get_model_from_request(request_data, route)
Or some version of that logic in your code block? Maybe extract the model and just run can_key_call_model to confirm valid access before proceeding.
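A minimal version of that suggestion might look like the sketch below. The names mirror the helpers mentioned in the review, but the extraction logic and signatures here are assumptions, not LiteLLM's real `get_model_from_request`/`can_key_call_model`:

```python
class ModelAccessError(Exception):
    """Raised when an API key is not permitted to call the requested model."""

def check_model_access(request_data: dict, route: str, key_allowed_models: set) -> str:
    """Extract the model from a passthrough request and verify the key may call it.

    Deliberately simplified: prefer an explicit "model" field in the body,
    otherwise take the last path segment of the route (minus any ":action").
    """
    model = request_data.get("model") or route.rsplit("/", 1)[-1].split(":")[0]
    if model not in key_allowed_models:
        raise ModelAccessError(f"key has no access to model '{model}'")
    return model

allowed = {"gemini-1.5-pro"}
route = "projects/p/locations/l/publishers/google/models/gemini-1.5-pro:generateContent"
print(check_model_access({}, route, allowed))
# -> gemini-1.5-pro
```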
Sorry, I missed this question about whether the key has permission to access the model. I will fix it.
I noticed that the original pass_through_endpoints logic did not check whether the key had permission to call the model. I have fixed this issue in the new PR #17970.
Add a test that verifies `_base_vertex_proxy_route` uses `get_available_deployment` for proper load balancing instead of `get_model_list`. This ensures the correct deployment is selected from the router and vertex credentials are properly fetched. Also refactor the implementation to:
- Use `get_available_deployment` instead of `get_model_list`
- Add error handling for deployment retrieval
- Improve code structure with a try-except block
Add dedicated methods to filter and select deployments for pass-through endpoints:
- Implement `get_available_deployment_for_pass_through()` to ensure only deployments with `use_in_pass_through=True` are considered
- Implement `async_get_available_deployment_for_pass_through()` for async operations
- Add a `_filter_pass_through_deployments()` helper method to filter by the `use_in_pass_through` flag
- Update the vertex pass-through route to use the new dedicated method

This ensures pass-through endpoints respect the `use_in_pass_through` configuration and apply the load balancing strategy only to configured deployments. Add comprehensive tests to verify filtering and load balancing behavior.
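The filter-then-balance flow those commits describe can be sketched over plain deployment dicts. The function names and the rpm-weighted random pick are illustrative; the real router operates on its own model-list structures and routing strategies:

```python
import random
from typing import Dict, List

def filter_pass_through_deployments(deployments: List[Dict]) -> List[Dict]:
    """Keep only deployments that explicitly opted in via use_in_pass_through."""
    return [
        d for d in deployments
        if d.get("litellm_params", {}).get("use_in_pass_through") is True
    ]

def pick_deployment(deployments: List[Dict]) -> Dict:
    """Weighted random choice over the filtered deployments, using the
    configured rpm as the weight (a stand-in for the router's strategy)."""
    eligible = filter_pass_through_deployments(deployments)
    if not eligible:
        raise ValueError("No deployments configured with use_in_pass_through=True")
    weights = [d.get("litellm_params", {}).get("rpm", 1) for d in eligible]
    return random.choices(eligible, weights=weights, k=1)[0]

deployments = [
    {"model_name": "gemini-a", "litellm_params": {"use_in_pass_through": True, "rpm": 90}},
    {"model_name": "gemini-b", "litellm_params": {"use_in_pass_through": True, "rpm": 10}},
    {"model_name": "gemini-c", "litellm_params": {"rpm": 100}},  # not opted in
]
print(sorted(d["model_name"] for d in filter_pass_through_deployments(deployments)))
# -> ['gemini-a', 'gemini-b']
```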
Force-pushed the branch from d80f9ae to 5e7bb0c
Title
fix(vertex_ai): improve passthrough endpoint url parsing and construction and deployment filtering
Relevant issues
Fixes #17402
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- I have added testing in the `tests/litellm/` directory
- I have run `make test-unit`

Type
🐛 Bug Fix
Changes
URL Parsing and Construction Improvements

litellm/llms/vertex_ai/common_utils.py:
- Strip `/v1/` and `/v1beta1/` version prefixes from the requested route to prevent double versioning in the target URL.

litellm/proxy/pass_through_endpoints/llm_passthrough_endpoints.py:
- Update `_base_vertex_proxy_route` to improve project and location resolution. If `vertex_project` or `vertex_location` cannot be parsed from the URL, it now attempts to extract the model ID and look up the corresponding deployment in the `llm_router` to find the configured `vertex_project` and `vertex_location`.

tests/test_litellm/llms/vertex_ai/test_vertex_ai_common_utils.py:
- Add tests for `construct_target_url` verifying correct handling of `/v1/` and `/v1beta1/` prefixes.

Pass-Through Deployment Filtering

litellm/router.py:
- Add `get_available_deployment_for_pass_through()` to ensure only deployments configured with `use_in_pass_through=True` are returned for pass-through endpoint selection.
- Add `async_get_available_deployment_for_pass_through()` for async operations with the same filtering behavior.
- Add a `_filter_pass_through_deployments()` helper method to filter deployments by the `use_in_pass_through` flag.

litellm/proxy/pass_through_endpoints/llm_passthrough_endpoints.py:
- Update `_base_vertex_proxy_route` to use the new `get_available_deployment_for_pass_through()` method instead of `get_available_deployment()` so pass-through filtering is applied consistently.

tests/test_litellm/proxy/pass_through_endpoints/test_vertex_passthrough_load_balancing.py:
- Add tests for `get_available_deployment_for_pass_through()`:
  - `test_get_available_deployment_for_pass_through_filters_correctly()` verifies correct filtering of pass-through deployments.
  - `test_get_available_deployment_for_pass_through_no_deployments()` verifies proper error handling when no pass-through deployments exist.
  - `test_get_available_deployment_for_pass_through_load_balancing()` verifies load balancing respects deployment RPM weights.
  - `test_async_get_available_deployment_for_pass_through()` verifies async functionality.

Summary
This PR improves the Vertex AI pass-through endpoint handling in two main areas:
URL Parsing & Configuration: Properly parses model IDs from URLs and looks up vertex_project and vertex_location from router deployments when not present in the URL.
Deployment Filtering: Implements dedicated pass-through deployment selection methods that ensure only deployments explicitly configured for pass-through are used, while maintaining proper load balancing across them.
These changes ensure pass-through endpoints are more robust and respect deployment configuration, while enabling proper load balancing for pass-through requests.
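As an illustration of the double-versioning fix, stripping an already-present API version from the requested route before joining it to a versioned base URL avoids URLs like `.../v1/v1/projects/...`. This is a sketch under stated assumptions; the real `construct_target_url` in `common_utils.py` handles more cases:

```python
import re

def strip_version_prefix(route: str) -> str:
    """Drop a leading /v1/ or /v1beta1/ from the requested route so the
    version segment is not duplicated when joined to a versioned base URL."""
    return re.sub(r"^/(?:v1|v1beta1)/", "/", route)

base = "https://us-central1-aiplatform.googleapis.com/v1"
route = "/v1/projects/my-proj/locations/us-central1/publishers/google/models/gemini-1.5-pro:generateContent"
print(base + strip_version_prefix(route))
```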