Pull requests: mit-han-lab/llm-awq
[Major] Fuse bias+gemm and layernorm+quantization for more efficient ViT
#254 opened Jan 13, 2025 by Louym
Replace FasterTransformers like KV cache layout and kernel with flash attention for better support for longer sequence
#239 opened Nov 16, 2024 by JerryGJX
Suggest: Add Bayesian optimization support for ratio search
#104 opened Oct 26, 2023 by trotsky1997