Enhance SDG to Support Multiple OpenAI Endpoints for Improved Performance #216
Comments
@njhill, it would be good to have your thoughts on this.
I think it's a good option to have in the toolbox for throughput-maximization experimentation. A wrapper client could be used that simply wraps two different clients configured with different endpoints.
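The wrapper-client idea above could be sketched roughly as follows. This is a hypothetical illustration, not SDG code: the `MultiEndpointClient` class, the `create_completion` method name, and the way underlying clients are constructed are all assumptions for the sake of the example.

```python
# Hypothetical sketch of a wrapper client that round-robins calls
# across several independently configured OpenAI-compatible clients.
import itertools
import threading


class MultiEndpointClient:
    """Dispatches each call to the next underlying client in rotation."""

    def __init__(self, clients):
        self._cycle = itertools.cycle(clients)
        # itertools.cycle is not thread-safe, so guard it with a lock.
        self._lock = threading.Lock()

    def _next_client(self):
        with self._lock:
            return next(self._cycle)

    def create_completion(self, **kwargs):
        # Delegate to whichever underlying client is next in rotation;
        # the caller sees a single-client interface.
        return self._next_client().create_completion(**kwargs)
```

Each underlying client would be configured with a different endpoint URL; the SDG pipeline would only ever see the single wrapper object.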
This seems like a pretty normal load-balancer use case?
My initial reaction is to echo what Russell said: this really seems like the concern of a load balancer. Is there a specific reason we need client-side load balancing and management of a pool of multiple OpenAI endpoints? Perhaps I'm overlooking a reason why a single endpoint behind a standalone load balancer isn't ideal.
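For comparison, the standalone-load-balancer approach suggested above would need no SDG changes at all. A minimal sketch using nginx (replica addresses and ports are illustrative assumptions):

```nginx
# Hypothetical nginx front end for two vLLM replicas.
upstream vllm_backends {
    # Default round-robin across replicas; least_conn is another option.
    server 127.0.0.1:8001;
    server 127.0.0.1:8002;
}

server {
    listen 8000;
    location /v1/ {
        proxy_pass http://vllm_backends;
    }
}
```

SDG would then be pointed at the single front-end address (here, port 8000) as its one OpenAI endpoint, with nginx spreading requests across the replicas.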
Currently, SDG supports only a single OpenAI endpoint. Adding support for multiple OpenAI endpoints could significantly improve overall SDG performance: we observed nearly a 50% improvement in total SDG time by running two replicas of the vLLM server instead of one and load balancing between them internally.
Consider the following scenarios:
Scenario 1
Scenario 2
Running SDG under Scenario 1 showed nearly a 50% improvement over Scenario 2. If SDG can work with multiple vLLM replicas, we can adopt Scenario 1 for better performance.
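To make the multi-replica idea concrete, here is a hypothetical sketch of fanning generation requests out across several endpoints concurrently. The `fan_out` helper and the `generate` callback are illustrative assumptions, not part of SDG; in the ideal case, two replicas serve the same batch of prompts in roughly half the wall-clock time of one, which is consistent with the ~50% improvement reported above.

```python
# Hypothetical fan-out of generation requests across multiple endpoints.
from concurrent.futures import ThreadPoolExecutor


def fan_out(prompts, endpoints, generate):
    """Assign prompts to endpoints round-robin and run them concurrently.

    `generate(endpoint, prompt)` performs the actual request; results
    are returned in the same order as the input prompts.
    """
    assignments = [(endpoints[i % len(endpoints)], p)
                   for i, p in enumerate(prompts)]
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        return list(pool.map(lambda pair: generate(*pair), assignments))
```

With two endpoints, half the prompts go to each replica and both replicas work in parallel, so total time approaches the single-replica time divided by the replica count (network and scheduling overhead aside).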