Use cases, pain points, and background
Today, users have to point VLLMModel at an externally hosted model endpoint. We should give users the option to spin up their own local server instead.
Description:
Add an option to VLLMModel to spin up a local vLLM server.
Design:
The change is likely confined to responses_api_models/vllm_model (app.py and requirements.txt): launch vLLM's OpenAI-compatible server as a subprocess, wait for it to become healthy, then route requests to it.
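A minimal sketch of that launch-and-wait step, assuming we shell out to vLLM's OpenAI-compatible entrypoint and poll its `/health` endpoint; the helper name, defaults, and timeout are illustrative assumptions, not the final design:

```python
import atexit
import subprocess
import sys
import time

import requests


def start_local_vllm(model: str, port: int = 8000, timeout: float = 300.0) -> subprocess.Popen:
    """Hypothetical helper: launch a local vLLM OpenAI-compatible server
    as a subprocess and block until it is ready to serve requests."""
    proc = subprocess.Popen(
        [sys.executable, "-m", "vllm.entrypoints.openai.api_server",
         "--model", model, "--port", str(port)],
    )
    atexit.register(proc.terminate)  # best-effort cleanup on interpreter exit

    # Poll the server's health endpoint until it responds or we time out.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            raise RuntimeError(f"vLLM server exited early with code {proc.returncode}")
        try:
            if requests.get(f"http://localhost:{port}/health", timeout=2).ok:
                return proc
        except requests.ConnectionError:
            pass  # server not accepting connections yet
        time.sleep(1)
    proc.terminate()
    raise TimeoutError(f"vLLM server did not become ready within {timeout}s")
```

With something like this in place, VLLMModel could point its client at `http://localhost:{port}/v1` instead of an external endpoint.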
Out of scope:
Acceptance Criteria: