CUDA Error: Out of Memory #422
Comments
What is the size of your training images?
Hi JunYanz, thanks for the note. :) The images are coming out of the webcam at 1920x1080, and I'm saving them as 480x360 sets (1/4 scale). I'm then joining these together to form 960x360 images with an A/B pair. Brian
It seems that 24GB can fit 480x360 images. Maybe you can further reduce the size of training images (to 256x256).
Thank you.
So would this imply 512x256, given that A and B should be collated in one image?
Or should I have A and B in two separate images?
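For reference, the aligned pix2pix dataset does expect exactly this side-by-side layout (A on the left, B on the right, collated into one image). A minimal sketch of building such a pair with NumPy; the 256x256 sizes are illustrative, not prescribed:

```python
import numpy as np

def make_ab_pair(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Collate input A and target B side by side into one AB image,
    as the aligned pix2pix dataset expects (A left, B right)."""
    assert a.shape == b.shape, "A and B must have the same size"
    return np.concatenate([a, b], axis=1)  # stack along the width axis

# Hypothetical 256x256 RGB pair -> one 512x256 (width x height) AB image
a = np.zeros((256, 256, 3), dtype=np.uint8)
b = np.full((256, 256, 3), 255, dtype=np.uint8)
ab = make_ab_pair(a, b)  # ab.shape == (256, 512, 3)
```

The same layout at any resolution works; only the total pixel count drives memory use.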
…On Wed, Nov 7, 2018 at 9:14 PM Jun-Yan Zhu ***@***.***> wrote:
It seems that 24GB can fit 480x360 images. Maybe you can further reduce
the size of training images (to 256x256).
Yeah,
This was the output from edges-to-image. I only trained it on one image with the base settings, so I would have imagined it would be fairly accurate. Any sense of why BtoA didn't generate the image exactly? Would I see more fidelity in terms of painting in colors? Canny generates a black background with white edges, but I notice edges2cats uses black edges on white. Should I invert from that perspective?
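On the inversion question: since the edges2cats-style data is dark edges on a light background, flipping Canny output to match the training convention is reasonable. For 8-bit edge maps this is a one-line NumPy operation (a sketch, not code from this repo):

```python
import numpy as np

def invert_edges(edge_map: np.ndarray) -> np.ndarray:
    """Turn white-edges-on-black (typical Canny output) into
    black-edges-on-white (the edges2cats-style convention)."""
    assert edge_map.dtype == np.uint8
    return 255 - edge_map

# Tiny example: one white edge pixel on a black background
canny = np.zeros((4, 4), dtype=np.uint8)
canny[1, 2] = 255
flipped = invert_edges(canny)  # background becomes white, the edge black
```

Whichever polarity you pick, use the same one for training and for the live input.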
This is correct. Even on an NVIDIA Tesla V100 32GB, it is hard to work with images larger than 700 by 700. I converted the code to mixed half precision (using NVIDIA Apex), which allows training on 1200x1200 images, and I am working on gradient checkpointing and possibly model parallelism, with the goal of reaching 2000x2000 training (training at small resolutions and generating large images does not seem to work well). When everything is tested and working, I can make a pull request if you think that might be helpful.
That would be excellent Ismail!
Running nvidia-smi shows that while GPU utilization is at 80+%, memory use
hovers around 3/24 GB. Not sure what is reserving the rest.
In any case, I downsampled everything to 256x256 and it worked. Now around
epoch 50, so I will let you know how it goes when done (30 mins/epoch).
…On Mon, Nov 12, 2018 at 10:56 PM Ismail Elezi ***@***.***> wrote:
It seems that 24GB can fit 480x360 images. Maybe you can further reduce
the size of training images (to 256x256).
This is correct. Even in a NVIDIA® Tesla® V100 32GB, it is hard to work
with images which are larger than 700 by 700. I converted the code to half
precision which allows training on 1200 x 1200 images, and am working on
gradient checkpointing and possibly model parallelism, with the goal of
reaching 2000 x 2000 training (training on small resolution and generating
large images seems to not work well).
When everything is tested and working, I can make a pull request if you
think that might be helpful.
Thank you both for the help. I've got this mostly working, and I can test it using the script provided in this repo. Question: is there a way to run this on an image coming from an OpenCV webcam? Currently the test needs to be run from an .sh script with a number of arguments/parameters that are embedded in a variety of different files (test, test_options, base_options, visualizer, etc.), and I'm not quite sure how to pull out what is required to run the .pth model that has been created on a real-time feed. I assume this is possible, just not sure how.
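One possible route, sketched below: do the pre/post-processing yourself and call the generator directly, bypassing test.py. Everything here is illustrative — the 256x256 size and the assumption that you have already rebuilt `netG` and loaded its state_dict are mine (the repo saves state_dicts, not whole modules):

```python
import numpy as np

def frame_to_input(frame: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 OpenCV frame into a 1x3xHxW float32 array
    scaled to [-1, 1], the input range pix2pix generators expect."""
    arr = frame.astype(np.float32) / 127.5 - 1.0
    return arr.transpose(2, 0, 1)[None, ...]

def output_to_frame(out: np.ndarray) -> np.ndarray:
    """Inverse of the above: 1x3xHxW in [-1, 1] back to an HxWx3 uint8 frame."""
    return ((out[0].transpose(1, 2, 0) + 1.0) * 127.5).astype(np.uint8)

def run_webcam(net_g):  # sketch only; needs cv2, torch, and a built/loaded netG
    import cv2
    import torch
    cap = cv2.VideoCapture(0)
    with torch.no_grad():
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            frame = cv2.resize(frame, (256, 256))
            fake_b = net_g(torch.from_numpy(frame_to_input(frame)))
            cv2.imshow("pix2pix", output_to_frame(fake_b.numpy()))
            if cv2.waitKey(1) == 27:  # Esc quits
                break
    cap.release()
```

Note the generator may also expect BGR-to-RGB conversion depending on how the training data was prepared; check that against your own pipeline.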
I think it is possible. I think you need to rewrite the […]
@TheRevanchist: Does it hurt performance when using mixed precision (via NVIDIA Apex)?
@John1231983, not really. I didn't do a quantitative evaluation (Inception Score, for example), but just visually, the results are as good as the images trained with fp32. However, if the images become too big (thousands of pixels in both directions), the results are not that good — but that is a matter of network architecture, not mixed precision. If you want big images, you should consider something like a progressive-GAN type of architecture. Also, I have trained other nets for different problems with mixed precision (always using Apex), and it works like a charm.
Hi! The Apex examples show `model, optimizer = amp.initialize(model, optimizer)` for a single model, so I am unsure how to fit these together with this codebase. Thank you for your time!
@junyanz new a
Could somebody reopen the issue? @junyanz
Apex supports passing a list of torch.nn.Module objects (see its reference docs). So, just like this:
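The snippet that presumably followed is lost in this copy of the thread. Based on the Apex documentation, `amp.initialize` accepts lists of modules and optimizers, so a sketch might look like this (the wrapper function name is mine):

```python
def wrap_with_amp(net_g, net_d, opt_g, opt_d, opt_level="O1"):
    """Wrap both the generator and the discriminator in one amp.initialize
    call by passing them as lists, per the NVIDIA Apex docs."""
    from apex import amp  # deferred import: requires NVIDIA Apex installed
    [net_g, net_d], [opt_g, opt_d] = amp.initialize(
        [net_g, net_d], [opt_g, opt_d], opt_level=opt_level)
    return net_g, net_d, opt_g, opt_d
```

The backward pass then goes through `amp.scale_loss(loss, optimizer)` as in the Apex examples.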
I've added Apex support and a checkpointing mechanism (https://pytorch.org/docs/stable/checkpoint.html) to reduce the memory footprint in my fork: https://github.com/seovchinnikov/pytorch-CycleGAN-and-pix2pix
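For context, the linked checkpointing mechanism trades compute for memory: activations inside a wrapped block are discarded on the forward pass and recomputed during backward. A minimal sketch on recent PyTorch (the helper name is mine; `use_reentrant=False` is the variant recommended on newer releases):

```python
def forward_checkpointed(blocks, x):
    """Run a sequence of nn.Module blocks under activation checkpointing,
    so stored-activation memory no longer grows with network depth."""
    from torch.utils.checkpoint import checkpoint  # deferred: requires PyTorch
    for block in blocks:
        x = checkpoint(block, x, use_reentrant=False)
    return x
```

Checkpointing composes with mixed precision, which is why the fork combines both.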
Good work!
Would you like to send a PR? If you are busy, I can add Apex to the official repo.
@junyanz Thanks, I will send a PR; I just need to test it a little more locally to be sure everything is OK.
Hi Team,
I'm in the process of trying to train a pix2pix model on an AtoB set (edges), where I've already structured the pairs as a montage (A on one side, B on the other, collated into one image). I have roughly 12,000 images in my training set that I'd like to use. batch_size is already 1, so I can't reduce that further. I've turned off the visualizer but still get the error.
From nvidia-smi, I find that GPU utilization spikes just after the networks are initialized (54.414M and 2.769M parameters for networks G and D, respectively).
This is the error:
I'm running Windows 10 with a Quadro M6000 (24 GB of VRAM), Python 3.5.5, CUDA 9.2, and PyTorch 0.4.1 (built for CUDA 9.2).
Any ideas? I'm at a loss...
Brian