
Add NVIDIA apex support and gradient checkpointing to reduce memory footprint #1090

Open
wants to merge 6 commits into
base: master

Conversation

seovchinnikov

I've added NVIDIA apex support and a gradient checkpointing mechanism (https://pytorch.org/docs/stable/checkpoint.html) to reduce the memory footprint.

You can run it with --checkpointing --opt_level "O2" and an increased input crop size (I was able to run CycleGAN with crop sizes up to 896 on my RTX 2080). Checkpointing is only used for CycleGAN for now (this can be improved further).
Please note that it was tested on a PyTorch 1.7 nightly build, and apex behaves unstably on older versions.
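
For readers unfamiliar with the mechanism, here is a minimal sketch of gradient checkpointing using `torch.utils.checkpoint.checkpoint`. It is not the code from this PR: `ResBlock` and `CheckpointedStack` are hypothetical toy modules, and the `use_reentrant=False` argument assumes a reasonably recent PyTorch release (it is not available on the 1.7 nightly mentioned above).

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class ResBlock(nn.Module):
    """Toy residual block standing in for a generator block (hypothetical)."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)


class CheckpointedStack(nn.Module):
    """Stack of blocks whose inner activations are recomputed in backward."""

    def __init__(self, channels=8, n_blocks=3, use_checkpointing=True):
        super().__init__()
        self.blocks = nn.ModuleList(ResBlock(channels) for _ in range(n_blocks))
        self.use_checkpointing = use_checkpointing

    def forward(self, x):
        for block in self.blocks:
            if self.use_checkpointing:
                # Only the block input is kept; activations inside the block
                # are dropped after forward and recomputed during backward.
                x = checkpoint(block, x, use_reentrant=False)
            else:
                x = block(x)
        return x


model = CheckpointedStack()
x = torch.randn(2, 8, 16, 16, requires_grad=True)
out = model(x)
out.mean().backward()
```

The trade-off is extra compute (roughly one additional forward pass through each checkpointed block) in exchange for lower peak activation memory, which is what makes larger crop sizes feasible.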

@junyanz
Owner

junyanz commented Jul 11, 2020

Great feature! I am wondering if you can get the same results with and without apex and gradient checkpointing.

@seovchinnikov
Author

I think we should run the base tests and check the results against the baselines.
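
As a starting point, a small numerical check along these lines could cover the checkpointing half of the question. This is only a sketch on a toy network (not the CycleGAN generator), and `run` is a hypothetical helper. Checkpointing recomputes the same operations, so outputs and gradients should agree within floating-point tolerance; mixed precision changes the numerics, so for apex only approximate agreement (for example on final image quality) can be expected.

```python
import copy

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

# Toy network; the copy starts from identical weights.
net = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.Tanh(),
    nn.Conv2d(8, 3, 3, padding=1),
)
net_ckpt = copy.deepcopy(net)

x = torch.randn(1, 3, 32, 32)


def run(model, inp, use_checkpointing):
    """Hypothetical helper: one forward/backward pass, returns output and grads."""
    model.zero_grad()
    if use_checkpointing:
        # use_reentrant=False assumes a recent PyTorch release.
        out = checkpoint(model, inp, use_reentrant=False)
    else:
        out = model(inp)
    out.pow(2).mean().backward()
    grads = [p.grad.clone() for p in model.parameters()]
    return out.detach(), grads


out_ref, grads_ref = run(net, x, use_checkpointing=False)
out_ckpt, grads_ckpt = run(net_ckpt, x, use_checkpointing=True)

outputs_match = torch.allclose(out_ref, out_ckpt, atol=1e-6)
grads_match = all(
    torch.allclose(a, b, atol=1e-6) for a, b in zip(grads_ref, grads_ckpt)
)
print("outputs match:", outputs_match, "| gradients match:", grads_match)
```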

@vict0rsch

Note that AMP (automatic mixed precision) has been part of PyTorch since 1.6: https://pytorch.org/docs/stable/notes/amp_examples.html#amp-examples
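
For reference, a minimal sketch of a training step with the native API, following the pattern in the linked notes. The model, data, and hyperparameters are placeholders, not taken from this repository. `torch.cuda.amp.autocast` and `torch.cuda.amp.GradScaler` are the names introduced in 1.6; newer releases also expose them under `torch.amp` and may emit a deprecation warning for the spelling used here.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
# Mixed precision is only enabled on CUDA here; on CPU both autocast and
# GradScaler are disabled and the loop runs in plain float32.
use_amp = device == "cuda"

model = nn.Linear(16, 4).to(device)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

inputs = torch.randn(8, 16, device=device)
targets = torch.randn(8, 4, device=device)

losses = []
for _ in range(3):
    optimizer.zero_grad()
    # Forward pass and loss run under autocast so eligible ops use float16.
    with torch.cuda.amp.autocast(enabled=use_amp):
        loss = nn.functional.mse_loss(model(inputs), targets)
    # Scale the loss to avoid float16 gradient underflow, then step and
    # let the scaler adjust its scale factor.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    losses.append(loss.item())
```

Compared with apex, this needs no separate install and no `opt_level` flag; autocast behaves roughly like apex's O1 mode rather than O2, so memory savings may differ.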
