
About "conv_in" layer in the U-Net #113

Open
liujin112 opened this issue Dec 4, 2024 · 2 comments

Comments

@liujin112

In the training of cyclegan-turbo, conv_in is set to be trainable, but its full weights do not appear to be saved after training; only the LoRA weights of conv_in are saved. Why is that?
Similarly, in pix2pix-turbo, conv_in was replaced with a TwinConv layer, but the paper does not seem to explain what this part is for.
Thanks
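
For context, the TwinConv mentioned above can be sketched as a layer that blends a frozen copy of the pretrained conv_in with a newly trained one via an interpolation coefficient r. This is a minimal, hedged sketch; the class and attribute names here (TwinConv, conv_pretrained, conv_trained, r) are illustrative and not guaranteed to match the repo's exact code:

```python
import copy

import torch


class TwinConv(torch.nn.Module):
    """Blend a frozen pretrained conv with a trainable conv.

    r = 0.0 -> purely the frozen pretrained conv;
    r = 1.0 -> purely the newly trained conv.
    Intermediate r values interpolate between the two outputs.
    """

    def __init__(self, conv_pretrained: torch.nn.Conv2d,
                 conv_trained: torch.nn.Conv2d) -> None:
        super().__init__()
        # Freeze a deep copy of the pretrained conv so training
        # only updates the second branch.
        self.conv_pretrained = copy.deepcopy(conv_pretrained).requires_grad_(False)
        self.conv_trained = conv_trained
        self.r = 1.0  # interpolation coefficient, set by the caller

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_pre = self.conv_pretrained(x).detach()
        x_cur = self.conv_trained(x)
        return x_pre * (1.0 - self.r) + x_cur * self.r


# Usage sketch: with r = 0.0 the output matches the pretrained conv.
conv_a = torch.nn.Conv2d(4, 8, kernel_size=3, padding=1)
conv_b = torch.nn.Conv2d(4, 8, kernel_size=3, padding=1)
twin = TwinConv(conv_a, conv_b)
twin.r = 0.0
x = torch.randn(1, 4, 16, 16)
out = twin(x)
```

Varying r between 0 and 1 at inference time is what lets the model trade off between the pretrained behavior and the finetuned one, which is how diverse outputs can be sampled.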


23wjj commented Dec 27, 2024

Hello, I have the same question. Did you solve it?


andjoer commented Jan 4, 2025

I think the TwinConv layer is needed to sample diverse outputs, as described on pages 7 and 8 of the paper in the section "Generating diverse outputs". If you only want a deterministic output, you don't need it.
Your point that conv_in is not saved may well be valid: I ran into the problem that my pretrained models did not behave well during inference. I never tracked down the cause, but I ended up saving all parameters and now it works. The downside, of course, is that my checkpoint files are very large.
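
A middle ground between saving everything and saving only LoRA weights would be to filter the state dict so that it keeps the LoRA parameters plus the full conv_in weights. This is a hedged sketch, not the repo's actual saving code; the key-matching rules ("lora" substring, "conv_in." prefix) are assumptions based on typical diffusers/PEFT naming, and TinyUNet is a hypothetical stand-in for the real U-Net:

```python
import os
import tempfile

import torch


def save_unet_extras(unet: torch.nn.Module, path: str) -> None:
    """Save LoRA weights plus the full conv_in weights.

    conv_in is trained directly (not via LoRA adapters), so its
    weights must be saved in full or they are lost at inference.
    """
    state = unet.state_dict()
    keep = {k: v for k, v in state.items()
            if "lora" in k or k.startswith("conv_in.")}
    torch.save(keep, path)


def load_unet_extras(unet: torch.nn.Module, path: str) -> None:
    # strict=False: the checkpoint intentionally covers only a
    # subset of the model's parameters.
    unet.load_state_dict(torch.load(path), strict=False)


# Hypothetical stand-in module to demonstrate the filtering.
class TinyUNet(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv_in = torch.nn.Conv2d(4, 8, kernel_size=3)
        self.mid_block = torch.nn.Conv2d(8, 8, kernel_size=3)


unet = TinyUNet()
with tempfile.TemporaryDirectory() as tmpdir:
    ckpt_path = os.path.join(tmpdir, "unet_extras.pt")
    save_unet_extras(unet, ckpt_path)
    saved = torch.load(ckpt_path)
```

This keeps checkpoints small while still restoring the directly trained conv_in at load time.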
