Use CUDA graph to fuse kernel launches #1251

Closed
wants to merge 19 commits into from

@goliaro (Collaborator) commented Dec 24, 2023

Description of changes:

current benchmarking: url

Related Issues:

Linked Issues:

  • Issue #

Issues closed by this PR:

  • Closes #

@goliaro marked this pull request as ready for review January 2, 2024 22:01

@jiazhihao (Collaborator) commented:

I think this one can be closed since @chenzhuofu is implementing CUDA graph in the spec_scheduler branch. I will close it after @goliaro and @chenzhuofu confirm.

@jiazhihao added the "inference" label (Features and fixes related to the inference project.) May 31, 2024

@jiazhihao (Collaborator) commented:

Leave this open for now. @chenzhuofu or @zikun-li will cherry-pick the memory fixes and apply them to the specscheduler branch. We will close this PR after that.

@jiazhihao closed this Sep 5, 2024
@goliaro deleted the cuda_graph branch November 4, 2024 20:00