When you tested the ATS and EvoViT pruning methods, how exactly did you incorporate the CLS token?
As you mention, the CLS token is not "natural" for dense tasks, but since you use a DeiT backbone, one should be available from there. Do you simply reuse the DeiT CLS token (even though it is not trained during the ViT-Adapter dense training), or do you initialize a new random token after the dense training?
Hi, we initialize a random CLS token for these methods. Although there is no explicit supervision for the CLS token, we found that it can still effectively serve as the selector, as discussed in Appendix D of the paper.
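For readers following along, here is a minimal, framework-free sketch of the general idea: a randomly initialized CLS token is scored against the patch tokens, and the highest-scoring patches are kept. This is not the authors' implementation. The function names (`init_cls_token`, `cls_attention`, `select_tokens`), the initialization scale `std=0.02`, the single-head attention without learned projections, and the plain top-k rule are all assumptions made for illustration; ATS and EvoViT each use their own scoring and selection procedures.

```python
import math
import random


def init_cls_token(dim, std=0.02, seed=0):
    """Randomly initialize a CLS token (std=0.02 is an assumed scale)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(dim)]


def cls_attention(cls_token, patch_tokens):
    """Softmax attention from the CLS token to every patch token.

    Simplification: single head, no learned query/key projections.
    """
    dim = len(cls_token)
    logits = [
        sum(c * p for c, p in zip(cls_token, tok)) / math.sqrt(dim)
        for tok in patch_tokens
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_tokens(cls_token, patch_tokens, keep):
    """Return the indices (in original order) of the `keep` top-scoring patches."""
    scores = cls_attention(cls_token, patch_tokens)
    ranked = sorted(range(len(patch_tokens)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


# Toy usage: 6 random patch tokens of width 8, keep the 3 with the highest score.
demo_rng = random.Random(1)
patches = [[demo_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(6)]
cls = init_cls_token(8)
kept = select_tokens(cls, patches, keep=3)
```

In a real backbone the CLS token would pass through the transformer layers together with the patch tokens, so its attention weights reflect the learned query/key projections even when the token itself receives no explicit supervision.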