Hello,
Reading your paper and your StableDiffusion/Hidden code, I noticed that you are using the LAMB optimizer…
Why did you choose LAMB instead of AdamW/RAdam?
I tried to replicate your experiment with AdamW/RAdam using the schedule-free optimizer (https://github.com/facebookresearch/schedule_free), but it does not converge as fast as your method with LAMB and a cosine scheduler.
Best regards,
Léo
Hello, I tried a lot of optimizers and different hyperparameters. LAMB was the one that gave the best results; I don't particularly know why, since the optimizer was designed specifically for large batch sizes.
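For reference, here is a minimal sketch of the two setups being compared above: AdamW via the schedule-free package (linked repo) versus LAMB with cosine annealing. The `Lamb` import is an assumption — PyTorch has no built-in LAMB, so this uses the third-party torch-optimizer package; the paper's code may use a different implementation. The model, learning rates, and step counts are placeholders.

```python
import torch
import torch.nn as nn
import schedulefree               # pip install schedulefree (the linked repo)
from torch_optimizer import Lamb  # pip install torch-optimizer (assumed LAMB source)

def make_setup(model, which, steps=1000):
    """Build (optimizer, scheduler) for the two configurations discussed."""
    if which == "schedule_free":
        # Schedule-free AdamW: no LR scheduler by design,
        # but it must be toggled with .train()/.eval().
        return schedulefree.AdamWScheduleFree(model.parameters(), lr=1e-3), None
    # LAMB + cosine annealing, the setup reported to converge faster.
    opt = Lamb(model.parameters(), lr=1e-3, weight_decay=1e-2)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=steps)
    return opt, sched

model = nn.Linear(128, 10)  # placeholder model
opt, sched = make_setup(model, "schedule_free")
if hasattr(opt, "train"):
    opt.train()             # schedule-free requirement before training steps
for _ in range(1000):
    x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if sched is not None:
        sched.step()        # advance the cosine schedule (LAMB setup only)
if hasattr(opt, "eval"):
    opt.eval()              # switch to averaged weights before evaluation
```

Note that forgetting the `.train()`/`.eval()` toggles is a common source of poor schedule-free results, so it is worth ruling out before attributing the gap to the optimizer itself.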