Implementing Adaptive SIFT #92
Hi Chris! I had not added adaptive SIFT to this library. It is a relatively straightforward post-hoc processing of the returned outputs, based on their "values". But I understand that in its current form, the meaning of the returned "values" is a bit cryptic. The estimated posterior uncertainty is the posterior variance of the surrogate model,

$$\sigma_n^2(x) = k(x, x) - k_n(x)^\top \left( K_n + \lambda I \right)^{-1} k_n(x),$$

where $k(x, x') = \phi(x)^\top \phi(x')$ is the inner product of the embeddings $\phi(\cdot)$, $K_n$ is the kernel matrix of the $n$ selected points, and $k_n(x)$ is the vector of kernel values between the prompt $x$ and the selected points.

Regarding how the uncertainty estimation works: the key thing is that we use the "surrogate model" to estimate uncertainty after fine-tuning. In this way, we can estimate this uncertainty without actually having to fine-tune the model.

Hope this helps
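To make the post-hoc processing concrete, here is a minimal numpy sketch. The helper names (`posterior_variance`, `adaptive_num_points`), the regularizer `lam`, and the stopping rule (stop once the marginal uncertainty reduction falls below `alpha`) are illustrative assumptions, not the library's actual API; it also assumes `values[i]` holds the posterior variance after selecting the first `i + 1` points:

```python
import numpy as np

def posterior_variance(query_emb: np.ndarray, selected_embs: np.ndarray,
                       lam: float = 1e-3) -> float:
    """What a returned "value" corresponds to in this sketch: the posterior
    variance sigma_n^2(x) of the surrogate model after "observing" the
    selected points (illustrative, not the library's implementation)."""
    k_xx = float(query_emb @ query_emb)        # prior variance k(x, x)
    if len(selected_embs) == 0:
        return k_xx
    K = selected_embs @ selected_embs.T        # kernel matrix K_n
    k_x = selected_embs @ query_emb            # kernel vector k_n(x)
    A = K + lam * np.eye(len(selected_embs))   # regularized K_n
    return k_xx - float(k_x @ np.linalg.solve(A, k_x))

def adaptive_num_points(values: np.ndarray, alpha: float) -> int:
    """Hypothetical post-hoc rule: keep adding selected points until the
    marginal reduction in posterior variance drops below alpha."""
    for i in range(1, len(values)):
        if values[i - 1] - values[i] < alpha:
            return i
    return len(values)
```

With the indices and values returned by the selection, one would then keep only `indices[:adaptive_num_points(values, alpha)]` for fine-tuning.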
hey, thanks for the fast implementation. When I set alpha (I tried the values from the paper), the selection doesn't behave as I'd expect.
(P.S. sorry for the double post, posted from a wrong account at first)
Hi Chris, very sorry about this. For the adaptive results in the paper, we ran the experiments post-hoc. I must have made a mistake in the PR where I ported the code into this library. I'll try to get this fixed in the next few weeks!
One thing that might be happening, since you mentioned that you use embeddings from phi3, is that they are on a different scale. Are your embeddings normalized? If not, you might have to set very different (much smaller) alphas.
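A quick way to rule that out is to L2-normalize the embeddings before selection; a minimal sketch (`l2_normalize` is just an illustrative helper, not part of activeft):

```python
import numpy as np

def l2_normalize(embeddings: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Row-wise L2 normalization, so that inner products (and hence the
    uncertainty "values" that alpha is compared against) lie on a
    comparable scale across embedding models."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.maximum(norms, eps)
```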
ok, this is helpful. I just realized you've been using different embedding models. Any chance you can share the experiments/code for GPT2 & Phi that are part of the paper? Is your codebase based on https://github.com/socialfoundations/tttlm?
Congrats on the release of the paper and library.
Could you point out which of the return values of activeft approximates the model's uncertainty?
What's also not clear to me is how "SIFT estimates the uncertainty about the response to a given prompt after having been fine-tuned on some data (§3)". The emphasis is on after the model has been fine-tuned, since this library is used to determine the dataset to use before fine-tuning.