
Implementing Adaptive SIFT #92

Open

chris-aeviator opened this issue Nov 23, 2024 · 5 comments

@chris-aeviator commented Nov 23, 2024

Congrats to the release of the paper and library.

Could you point out which of the return values of activeft hints at an approximation of the model's uncertainty?

What's also not clear to me is how "SIFT estimates the uncertainty about the response to a given prompt after having been fine-tuned on some data (§3)". The emphasis is on after the model has been fine-tuned, since this library is used to determine the dataset to use before fine-tuning.

chris-aeviator changed the title from "Adaptive SIFT" to "Implementing Adaptive SIFT" Nov 23, 2024
@jonhue (Owner) commented Nov 23, 2024

Hi Chris!

Until now, I had not added adaptive SIFT to this library; it is a relatively straightforward post-hoc processing of the returned outputs based on their "values". But I understand that in its current form the meaning of the returned "values" is a bit cryptic: the estimated posterior uncertainty is sqrt(-values).

I have now added direct support for adaptive SIFT in #93. You can simply pass your alpha parameter when instantiating the sift.Retriever object.
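For illustration, here is a minimal sketch of this kind of post-hoc processing (the function name and the exact stopping rule below are illustrative assumptions, not the library's actual API):

```python
import numpy as np

def adaptive_cutoff(indices, values, alpha):
    """Illustrative post-hoc processing of a retriever's output.

    The estimated posterior uncertainty after fine-tuning on the first
    n retrieved points is sqrt(-values[n-1]); here we keep retrieving
    until that uncertainty falls below the threshold alpha (an assumed
    stopping rule, for the sake of illustration).
    """
    uncertainty = np.sqrt(-np.asarray(values, dtype=float))
    for n, sigma in enumerate(uncertainty, start=1):
        if sigma <= alpha:
            return indices[:n]
    return indices  # threshold never reached: keep all retrieved points
```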

Regarding how the uncertainty estimation works: the key point is that we use the "surrogate model" to estimate the uncertainty after fine-tuning. This way, we can estimate this uncertainty without actually having to fine-tune the model.
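As a rough sketch of the principle (assuming a linear surrogate over embeddings with a dot-product kernel; this is an illustration, not the library's implementation), the posterior uncertainty at a prompt after fine-tuning on some data has a closed form that requires no fine-tuning at all:

```python
import numpy as np

def surrogate_uncertainty(prompt_emb, data_embs, lam=1.0):
    """Estimated posterior std at the prompt after (hypothetically)
    fine-tuning on data_embs, under a linear surrogate model.

    Standard closed form for kernel/linear regression:
        sigma^2(x) = k(x, x) - k(x, X) (K + lam * I)^{-1} k(X, x)
    with the dot-product kernel k(a, b) = a @ b (an assumption here).
    """
    K = data_embs @ data_embs.T       # kernel matrix of the selected data
    k_xX = data_embs @ prompt_emb     # covariances between prompt and data
    prior = prompt_emb @ prompt_emb   # prior variance at the prompt
    reduction = k_xX @ np.linalg.solve(K + lam * np.eye(len(data_embs)), k_xX)
    return float(np.sqrt(max(prior - reduction, 0.0)))
```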

Hope this helps
Jonas

jonhue closed this as completed Nov 23, 2024
@chris-aeviator (Author) commented

Hey, thanks for the fast implementation. I tried the alpha values from the paper (0.15, 0.55, 2.0), but whenever I set alpha I don't get any results from the retriever, in contrast to when I set it to None. I'm working with Phi-3, and the embeddings are also generated from Phi-3.

from activeft.sift import Retriever  # assuming the activeft package layout

retriever = Retriever(index, also_query_opposite=False, alpha=0.15)

(P.S. Sorry for the double post; I posted from the wrong account at first.)

@jonhue (Owner) commented Nov 27, 2024

Hi Chris, very sorry about this.

For the adaptive results in the paper, we ran the experiments post-hoc. I must have made a mistake in the PR where I ported the code into this library. I'll try to get this fixed in the next few weeks!

jonhue reopened this Nov 27, 2024
@jonhue (Owner) commented Nov 27, 2024

One thing that might be happening, since you mentioned that you use embeddings from Phi-3, is that they are on a different scale. Are your embeddings normalized? If not, you might have to set very different (much smaller) alphas.
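If that's the case, a quick check is to L2-normalize the embeddings before building the index (a sketch, assuming embeddings is an (n, d) numpy array):

```python
import numpy as np

# L2-normalize each embedding row so that dot products (and hence the
# resulting uncertainties) are on a comparable scale across models.
embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
```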

@chris-aeviator (Author) commented

OK, this is helpful. I just realized you've been using different embedding models. Any chance you can share the experiments/code around GPT-2 & Phi that are part of the paper? Is your codebase based on https://github.com/socialfoundations/tttlm?
