I tried using SageAttention in the prefill phase of vLLM, based on the latest 2.0.0 branch. However, after replacing FlashAttention-2 (FA2), it doesn't seem to have much effect: end-to-end throughput (tokens/s) remains almost the same.
As titled.