
About "conv_in" layer in the U-Net #113

Open
liujin112 opened this issue Dec 4, 2024 · 2 comments

Comments

@liujin112

In the training of cyclegan-turbo, conv_in is set to be trainable, but its full weights do not appear to be saved after training; only the LoRA weights of conv_in are saved. Why is that?
Similarly, in pix2pix-turbo, conv_in was replaced with a TwinConv layer, but the paper does not seem to explain what this part is for.
Thanks
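
For context, the TwinConv mentioned above can be sketched as a layer that blends a frozen copy of the pretrained conv_in with a newly trained one via an interpolation coefficient r. This is a minimal, hedged sketch; the class and attribute names here (TwinConv, conv_pretrained, conv_trained, r) are illustrative and not guaranteed to match the repo's exact code:

```python
import copy

import torch


class TwinConv(torch.nn.Module):
    """Blend a frozen pretrained conv with a trainable conv.

    r = 0.0 -> purely the frozen pretrained conv;
    r = 1.0 -> purely the newly trained conv.
    Intermediate r values interpolate between the two outputs.
    """

    def __init__(self, conv_pretrained: torch.nn.Conv2d,
                 conv_trained: torch.nn.Conv2d) -> None:
        super().__init__()
        # Freeze a deep copy of the pretrained conv so training
        # only updates the second branch.
        self.conv_pretrained = copy.deepcopy(conv_pretrained).requires_grad_(False)
        self.conv_trained = conv_trained
        self.r = 1.0  # interpolation coefficient, set by the caller

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_pre = self.conv_pretrained(x).detach()
        x_cur = self.conv_trained(x)
        return x_pre * (1.0 - self.r) + x_cur * self.r


# Usage sketch: with r = 0.0 the output matches the pretrained conv.
conv_a = torch.nn.Conv2d(4, 8, kernel_size=3, padding=1)
conv_b = torch.nn.Conv2d(4, 8, kernel_size=3, padding=1)
twin = TwinConv(conv_a, conv_b)
twin.r = 0.0
x = torch.randn(1, 4, 16, 16)
out = twin(x)
```

Varying r between 0 and 1 at inference time is what lets the model trade off between the pretrained behavior and the finetuned one, which is how diverse outputs can be sampled.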


23wjj commented Dec 27, 2024

Hello, I have the same question. Did you solve it?


andjoer commented Jan 4, 2025

I think the TwinConv layer is needed to sample diverse outputs, as described on pages 7 and 8 of the paper in the section "Generating diverse outputs". If you only want a deterministic output, you don't need it.
Your point that conv_in is not saved may well be valid: I ran into the problem that my pretrained models did not behave well during inference. I never tracked down the cause, but I ended up saving all parameters and now it works. The downside, of course, is that my checkpoint files are very large.
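
A middle ground between saving everything and saving only LoRA weights would be to filter the state dict so that it keeps the LoRA parameters plus the full conv_in weights. This is a hedged sketch, not the repo's actual saving code; the key-matching rules ("lora" substring, "conv_in." prefix) are assumptions based on typical diffusers/PEFT naming, and TinyUNet is a hypothetical stand-in for the real U-Net:

```python
import os
import tempfile

import torch


def save_unet_extras(unet: torch.nn.Module, path: str) -> None:
    """Save LoRA weights plus the full conv_in weights.

    conv_in is trained directly (not via LoRA adapters), so its
    weights must be saved in full or they are lost at inference.
    """
    state = unet.state_dict()
    keep = {k: v for k, v in state.items()
            if "lora" in k or k.startswith("conv_in.")}
    torch.save(keep, path)


def load_unet_extras(unet: torch.nn.Module, path: str) -> None:
    # strict=False: the checkpoint intentionally covers only a
    # subset of the model's parameters.
    unet.load_state_dict(torch.load(path), strict=False)


# Hypothetical stand-in module to demonstrate the filtering.
class TinyUNet(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv_in = torch.nn.Conv2d(4, 8, kernel_size=3)
        self.mid_block = torch.nn.Conv2d(8, 8, kernel_size=3)


unet = TinyUNet()
with tempfile.TemporaryDirectory() as tmpdir:
    ckpt_path = os.path.join(tmpdir, "unet_extras.pt")
    save_unet_extras(unet, ckpt_path)
    saved = torch.load(ckpt_path)
```

This keeps checkpoints small while still restoring the directly trained conv_in at load time.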
