About the dimension projection #33

Open
shaonanqinghuaizongshishi opened this issue Mar 24, 2022 · 1 comment
Comments

@shaonanqinghuaizongshishi

shaonanqinghuaizongshishi commented Mar 24, 2022

The linear projection after the self-attention:
bs = self_attention.size(0)
self_attention = self_attention.view(bs, -1)
linear_proj = F.relu(self.linear_projection(self_attention))

The paper says, "We project the self-attended neighbor encodings to a LARGER 4x2d dimensional space". If the last two dimensions of "self_attention" are flattened before the projection, the input has neighbors x 2d features, so the 4x2d output is only larger when the number of neighbors is less than 4. How can you make sure of that?

In my opinion, we should not flatten the last two dimensions before the projection. Instead, we should project the last dimension, whose size is 2d; since 2d < 4x2d, that always projects to a larger space.

Please correct me if I have misunderstood something, or let me know if this was done on purpose.
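To make the comparison concrete, here is a rough sketch of the dimension arithmetic behind the two variants. It assumes the self-attention output has shape (bs, neighbors, 2d), which is inferred from the description above rather than confirmed from the repository; `projection_dims`, `d=64`, and the neighbor counts are hypothetical names and values chosen for illustration.

```python
from typing import Tuple


def projection_dims(neighbors: int, d: int, flatten: bool) -> Tuple[int, int]:
    """Return (in_features, out_features) of the linear projection.

    Assumption: each self-attended neighbor encoding has width 2d and the
    projection target is the "4x2d" space quoted from the paper.
    """
    per_neighbor = 2 * d               # width of one neighbor encoding
    out_features = 4 * per_neighbor    # the 4x2d output space
    if flatten:
        # Flattening (neighbors, 2d) into one vector per sample.
        in_features = neighbors * per_neighbor
    else:
        # Projecting only the last dimension, one neighbor at a time.
        in_features = per_neighbor
    return in_features, out_features


for n in (2, 4, 10):
    flat_in, out = projection_dims(n, d=64, flatten=True)
    last_in, _ = projection_dims(n, d=64, flatten=False)
    # The flattened variant expands the input only when neighbors < 4;
    # the per-neighbor variant always expands (2d < 4x2d).
    print(n, flat_in, last_in, out, flat_in < out, last_in < out)
```

With the flattened variant the output is wider than the input only for fewer than 4 neighbors, while the per-neighbor variant is wider for any neighbor count.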

@Praneet9
Owner

@shaonanqinghuaizongshishi I think you are mistaking 4 for the number of neighbors here. The 4 is an arbitrary factor chosen by the authors, without much explanation. The linear projections are designed so that the shapes work out no matter what number of neighbors and embedding size you use.
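One way to read this reply is sketched below in plain Python, without PyTorch so it runs anywhere. It assumes the projection layer's input width is computed as neighbors x 2d and its output width as 4 x 2d; that sizing rule is an assumption, not taken from the repository, and `make_linear`, `flatten_project_relu`, and `run` are hypothetical helpers.

```python
import random


def make_linear(in_features, out_features, seed=0):
    """Build a small random weight matrix and zero bias (illustrative only)."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_features)]
              for _ in range(out_features)]
    bias = [0.0] * out_features
    return weight, bias


def flatten_project_relu(encodings, weight, bias):
    """Flatten one sample's (neighbors, 2d) encodings, project, apply ReLU."""
    flat = [x for row in encodings for x in row]
    return [max(0.0, sum(w * x for w, x in zip(w_row, flat)) + b)
            for w_row, b in zip(weight, bias)]


def run(neighbors, d):
    width = 2 * d
    # Assumed sizing: input is neighbors * 2d, output is 4 * 2d.
    weight, bias = make_linear(neighbors * width, 4 * width)
    encodings = [[1.0] * width for _ in range(neighbors)]
    return flatten_project_relu(encodings, weight, bias)


# The output width depends only on d, not on the number of neighbors.
print(len(run(neighbors=3, d=4)))  # 32
print(len(run(neighbors=9, d=4)))  # 32
```

Under this sizing the shapes are consistent for any neighbor count, although with more than 4 neighbors the projection reduces the width of the flattened input rather than enlarging it.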
