About the dimension projection #33

shaonanqinghuaizongshishi · 2022-03-24T06:51:16Z

The linear projection after the self-attention is implemented as:
bs = self_attention.size(0)
self_attention = self_attention.view(bs, -1)
linear_proj = F.relu(self.linear_projection(self_attention))

The paper says, "We project the self-attended neighbor encodings to a LARGER 4x2d dimensional space". If you flatten the last two dimensions of `self_attention` before the projection, the input size becomes neighbors x 2d, so the 4x2d output is only larger when the number of neighbors is less than 4. How can you make sure that holds?

In my opinion, we should not flatten the last two dimensions before the projection. Instead, we should project only the last dimension, whose size is 2d. Since 2d < 4x2d, that always projects to a larger space.
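To make the two options concrete, here is a minimal shape-bookkeeping sketch in plain Python. It assumes `self_attention` has shape (batch, neighbors, 2d) and that the projection's output size is 4 x 2d, as quoted from the paper. The function names and dictionary keys are hypothetical and do not come from the repository.

```python
def flattened_projection_shapes(batch, neighbors, d):
    """Shapes when the (neighbors, 2d) axes are flattened before one linear layer."""
    in_features = neighbors * 2 * d   # size of each row after view(bs, -1)
    out_features = 4 * 2 * d          # assumed "4x2d" output size from the paper quote
    return {
        "input": (batch, in_features),
        "output": (batch, out_features),
        "weights": in_features * out_features,
        "expands": out_features > in_features,  # True only when neighbors < 4
    }


def per_neighbor_projection_shapes(batch, neighbors, d):
    """Shapes when the linear layer acts on the last axis (size 2d) only."""
    in_features = 2 * d
    out_features = 4 * 2 * d          # same assumed output size
    return {
        "input": (batch, neighbors, in_features),
        "output": (batch, neighbors, out_features),
        "weights": in_features * out_features,
        "expands": out_features > in_features,  # True for any d > 0
    }


if __name__ == "__main__":
    # Example sizes chosen for illustration only.
    print(flattened_projection_shapes(batch=2, neighbors=10, d=64))
    print(per_neighbor_projection_shapes(batch=2, neighbors=10, d=64))
```

With 10 neighbors, the flattened variant maps 1280 features down to 512, while the per-neighbor variant maps 128 features up to 512 for each neighbor.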

Please let me know if I have misunderstood something, or if you did this on purpose for some reason.

Praneet9 · 2022-03-27T12:49:01Z

@shaonanqinghuaizongshishi I think you are mistaking 4 for the number of neighbors here. The 4 is an arbitrary factor that the authors chose without much explanation. The linear projection is designed so that the shapes work out no matter what number of neighbors and embedding size you use.
