Add Faster Neighborhood Attention to pubs (#1471)
alihassanijr authored Jul 10, 2024
1 parent d6580c3 · commit c5239d8
Showing 1 changed file with 4 additions and 0 deletions.
PUBLICATIONS.md (4 additions, 0 deletions)
@@ -1,5 +1,9 @@
 # Publications Using Cutlass
 
+## 2024
+
+- ["Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level"](https://arxiv.org/abs/2403.04690). Ali Hassani, Wen-Mei Hwu, Humphrey Shi. _arXiv_, March 2024.
+
 ## 2023
 
 - ["A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library"](https://arxiv.org/abs/2312.11918). Ganesh Bikshandi, Jay Shah. _arXiv_, December 2023.
