diff --git a/PUBLICATIONS.md b/PUBLICATIONS.md
index 32b76e5fe4..65d1f08e07 100644
--- a/PUBLICATIONS.md
+++ b/PUBLICATIONS.md
@@ -1,5 +1,9 @@
 # Publications Using Cutlass
 
+## 2024
+
+- ["Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level"](https://arxiv.org/abs/2403.04690). Ali Hassani, Wen-Mei Hwu, Humphrey Shi. _arXiv_, March 2024.
+
 ## 2023
 
 - ["A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library"](https://arxiv.org/abs/2312.11918). Ganesh Bikshandi, Jay Shah. _arXiv_, December 2023.