Forward Transfer Issue #13

Open · zuael opened this issue Mar 5, 2023 · 10 comments

Comments

@zuael commented Mar 5, 2023

Hi! We are following your excellent work.

We would like to understand more clearly the details of your experiments on CIFAR-100 for calculating Forward Transfer, such as how the accuracy of the random model on each task is obtained.

If we understand correctly, since the random seed is fixed, the accuracy of the random model should be fixed as well. Would it be possible to provide the accuracy of the random model on the five tasks for reference?
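
For reference, this is a minimal sketch of how we currently compute the metric, assuming the GEM-style definition of forward transfer; acc and rand_acc are our names, not from your code:

# Minimal sketch of the GEM-style forward-transfer metric.
# acc[i][j]: accuracy on task j evaluated after training on task i.
# rand_acc[j]: accuracy of a randomly initialized model on task j.
def forward_transfer(acc, rand_acc):
    T = len(rand_acc)
    # Average, over tasks 2..T, of how much training on the previous tasks
    # helps task j compared to the random baseline.
    return sum(acc[j - 1][j] - rand_acc[j] for j in range(1, T)) / (T - 1)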

Thanks!

@DonkeyShot21 (Owner)

Hi! I have just checked my logs and I don't seem to have saved that experiment. I found the ImageNet-100 run with random features instead; it reached 15.34% after linear evaluation. For CIFAR you can just run a linear evaluation without loading the pre-trained checkpoint. You can do that with solo-learn as well if you prefer: https://github.com/vturrisi/solo-learn. I think the random seed does not matter that much, but you can run multiple times and average if you find the variance is high.
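
A minimal sketch of such a baseline (not the script from this repo or solo-learn; model, dataset wrapper, and hyperparameters are illustrative): freeze a randomly initialized ResNet-18 and train only a linear head on CIFAR-100.

# Minimal sketch: linear evaluation of a frozen, randomly initialized backbone on CIFAR-100.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

device = "cuda" if torch.cuda.is_available() else "cpu"

backbone = models.resnet18(weights=None)      # random initialization, no pre-training
backbone.fc = nn.Identity()                   # expose the 512-d features
backbone.eval().to(device)
for p in backbone.parameters():
    p.requires_grad = False                   # freeze the backbone

classifier = nn.Linear(512, 100).to(device)   # linear head for the 100 CIFAR classes

train_set = datasets.CIFAR100("data", train=True, download=True,
                              transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=256, shuffle=True, num_workers=4)

optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):                       # a few epochs suffice for a frozen backbone
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        with torch.no_grad():
            feats = backbone(x)               # frozen random features
        loss = criterion(classifier(feats), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()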

@zuael (Author) commented Mar 5, 2023

Thanks for the reply.

Since there are no scripts for linear evaluation on CIFAR-100 in this project, if we understand correctly, the results in Table 2 of the paper come from the online linear evaluation in main_pretrain.py.

So we would like to confirm whether the results in Table 2 come from the online or the offline evaluation.

Thank you again!

@DonkeyShot21 (Owner)

If the backbone is randomly initialized and frozen, there is no difference between online and offline evaluation (except for augmentations, which should not matter much).

@zuael (Author) commented May 19, 2023

Thanks for the reply!

@zuael (Author) commented May 19, 2023

Hi!
Thank you for replying to the above question; I have another question I'd like to ask. When you run LUMP on CIFAR-100, how do you set its hyperparameters, such as the size of the replay buffer and the interpolation strength $\alpha$?

@DonkeyShot21 (Owner)

Hello! From the paper:

LUMP uses k-NN evaluation, therefore we adapt the code provided by the authors to run in our code base. For Lin et al., we compare directly with their published results, since they use the same evaluation protocol. We perform hyperparameter tuning for all baselines, searching over 5 values for the distillation loss weights of POD and Less-Forget, 3 values for the weight of the regularization in EWC and 3 replay batch sizes for replay methods. The size of the replay buffer is 500 samples for all replay based methods.

@zuael (Author) commented May 23, 2023

Thanks for your reply, this is very detailed!
In LUMP, there is another hyperparameter, $\alpha$, which controls the distribution of the mixing ratio.
I am currently setting $\alpha$ to 0.1, but the experimental results have not reached the level reported in your paper.
If you could tell me which value you used, it would help me a lot. Thank you again!
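
For context, my understanding of the interpolation is the following minimal sketch, with the mixing ratio drawn from Beta($\alpha$, $\alpha$); variable names are illustrative, not taken from the LUMP code:

import numpy as np

# Minimal sketch of LUMP-style mixup between the current batch and buffer samples.
# x_current, x_buffer: batches of augmented views with the same shape.
def lump_mixup(x_current, x_buffer, alpha=0.1):
    lam = np.random.beta(alpha, alpha)        # mixing ratio, controlled by alpha
    return lam * x_current + (1.0 - lam) * x_buffer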

@DonkeyShot21 (Owner)

IIRC all the other hyperparameters were left at their default values from the respective paper/code.

@zuael (Author) commented May 27, 2023

We ran Barlow Twins + LUMP on CIFAR-100 and only got an accuracy of 55.9%, which is 1.9% lower than the result reported in the paper. We wonder whether we introduced a difference when porting the LUMP code into your project, e.g., in the buffer construction or in the augmentations applied to previous samples.

  • For buffer construction, we use the reservoir sampling algorithm following the code provided by the LUMP authors, and store the original image and one of its augmented views in the memory buffer (a minimal sketch is included at the end of this comment).

  • For the augmentations, after randomly sampling the original image and its augmented view from the memory buffer, we apply the following transform to the original image:

from PIL import Image
from torchvision import transforms

# Augmentation applied to the original image sampled from the buffer.
transform = transforms.Compose(
    [
        transforms.ToPILImage(),
        transforms.RandomResizedCrop(32, scale=(0.08, 1.0), ratio=(3.0 / 4.0, 4.0 / 3.0), interpolation=Image.BICUBIC),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        # CIFAR channel means and standard deviations.
        transforms.Normalize((0.4914, 0.4822, 0.4465), (0.247, 0.243, 0.261)),
    ]
)

We would like to check if this is the same as yours, thanks!
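
This is a minimal sketch of the reservoir sampling we use for the buffer (names are illustrative; the buffer size follows the 500-sample setting mentioned above):

import random

# Minimal sketch of a reservoir-sampling replay buffer.
class ReservoirBuffer:
    def __init__(self, buffer_size=500):
        self.buffer_size = buffer_size
        self.data = []          # stored (image, augmented_view) pairs
        self.num_seen = 0       # total number of examples seen so far

    def add(self, example):
        if len(self.data) < self.buffer_size:
            self.data.append(example)
        else:
            # Replace a slot with probability buffer_size / (num_seen + 1).
            idx = random.randint(0, self.num_seen)
            if idx < self.buffer_size:
                self.data[idx] = example
        self.num_seen += 1

    def sample(self, batch_size):
        return random.sample(self.data, min(batch_size, len(self.data)))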

@DonkeyShot21 (Owner)

Hello,

  • I used reservoir sampling as well
  • I also used color jittering (basically SimCLR augmentations)
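
For reference, a minimal sketch of SimCLR-style augmentations with color jittering for 32x32 CIFAR images (parameter values are the usual SimCLR defaults, not necessarily the exact ones used in this repo):

from PIL import Image
from torchvision import transforms

simclr_transform = transforms.Compose(
    [
        transforms.ToPILImage(),
        transforms.RandomResizedCrop(32, scale=(0.08, 1.0), interpolation=Image.BICUBIC),
        transforms.RandomHorizontalFlip(),
        # Color jittering and grayscale, as in SimCLR.
        transforms.RandomApply([transforms.ColorJitter(0.8, 0.8, 0.8, 0.2)], p=0.8),
        transforms.RandomGrayscale(p=0.2),
        transforms.ToTensor(),
        transforms.Normalize((0.4914, 0.4822, 0.4465), (0.247, 0.243, 0.261)),
    ]
)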
