
Add Functionality in LLMBlock to Override Global OpenAI Client Variable #217

Open

npalaska opened this issue Jul 25, 2024 · 2 comments
Labels: enhancement (New feature or request)

@npalaska (Contributor):

Add functionality in LLMBlock within the pipeline to override the global OpenAI client variable. This enhancement will allow us to support running multiple OpenAI clients for different LLMBlock instances if desired. The primary intention is to run LLMBlock inference calls against a model deployment tailored to serve specific inference requests.

Currently, in vLLM, certain LoRA inference calls do not support specific performance-optimization flags. By separating these calls from non-LoRA inference calls, we can deploy multiple vLLM instances, each optimized for a different type of inference call, improving overall performance.

@nathan-weinberg nathan-weinberg added the enhancement New feature or request label Aug 20, 2024
github-actions (bot):

This issue has been automatically marked as stale because it has not had activity within 90 days. It will be automatically closed if no further activity occurs within 30 days.

@github-actions github-actions bot added the stale label Nov 20, 2024
@bbrowning (Contributor):

Is this still an issue? It would not be trivial to wire up multiple functioning OpenAI clients with the ability to select which client per block of a Pipeline. We're redoing the internals of the pipeline config, so if this is something we really need then we should revisit it after the larger pipeline work lands.

@github-actions github-actions bot removed the stale label Nov 21, 2024