
Enhance SDG to Support Multiple OpenAI Endpoints for Improved Performance #216

Open

npalaska opened this issue Jul 25, 2024 · 5 comments
Labels: enhancement (New feature or request)

@npalaska (Contributor)

Currently, SDG supports only a single OpenAI endpoint. Adding support for multiple OpenAI endpoints could significantly improve overall SDG performance: we have observed nearly a 50% improvement in total SDG time by running two replicas of the vLLM server instead of one and load balancing between them internally.

Consider the following scenarios:

Scenario 1

- Teacher model sharded across 2 GPUs -> endpoint A
- Teacher model sharded across 2 GPUs -> endpoint B

Scenario 2

- Teacher model sharded across 4 GPUs -> endpoint A

Running SDG with Scenario 1 showed nearly a 50% improvement over Scenario 2. If SDG can work with multiple replicas of vLLM, we can adopt Scenario 1 for better performance.

@shivchander (Member)

@njhill would be good to have your thoughts on this

@njhill (Contributor) commented Jul 26, 2024

I think it's a good option to have in the toolbox for throughput-maximization experimentation. A wrapper client could be used which just wraps two different clients configured with different endpoints.
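A minimal sketch of that wrapper idea (the class and method names here are hypothetical; the wrapped clients could be, e.g., two `openai.OpenAI` instances pointed at endpoint A and endpoint B):

```python
from itertools import cycle


class RoundRobinClient:
    """Hypothetical wrapper that alternates requests across several
    OpenAI-compatible clients, each configured with a different endpoint."""

    def __init__(self, clients):
        # cycle() yields the wrapped clients in round-robin order forever.
        self._clients = cycle(clients)

    def next_client(self):
        """Return the client that should handle the next request."""
        return next(self._clients)

    def create_completion(self, **kwargs):
        # Forward the call to whichever replica is next in rotation;
        # each client is assumed to expose the OpenAI completions API.
        return self.next_client().completions.create(**kwargs)
```

Each SDG request would then hit the next replica in rotation; retries or least-loaded selection could be layered onto the same wrapper later.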

@russellb (Member) commented Aug 5, 2024

This seems like a pretty normal load balancer use case?
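For context, the standard setup alluded to here could look like the following nginx fragment (ports and upstream name are illustrative, not from this issue), which exposes both vLLM replicas behind one OpenAI-compatible base URL:

```nginx
# Round-robin two vLLM replicas behind a single endpoint.
upstream vllm_replicas {
    server 127.0.0.1:8001;  # endpoint A
    server 127.0.0.1:8002;  # endpoint B
}

server {
    listen 8000;
    location /v1/ {
        proxy_pass http://vllm_replicas;
    }
}
```

With this in place, SDG would keep its single-endpoint configuration and point at port 8000.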

@nathan-weinberg nathan-weinberg added the enhancement New feature or request label Aug 20, 2024

This issue has been automatically marked as stale because it has not had activity within 90 days. It will be automatically closed if no further activity occurs within 30 days.

@github-actions github-actions bot added the stale label Nov 20, 2024
@bbrowning (Contributor)

My initial reaction is to echo what Russell said: this is really the concern of a load balancer. However, is there a specific reason we need client-side load balancing and management of a pool of multiple OpenAI endpoints? Perhaps I'm overlooking a reason why a single endpoint backed by a standalone load balancer isn't ideal?

@github-actions github-actions bot removed the stale label Nov 21, 2024
6 participants