Add support for Ollama, Palm, Claude-2, Cohere, Replicate Llama2, CodeLlama, Hugging Face (100+ LLMs) - using LiteLLM #95
Conversation
We also support tracking max_tokens, cost, and caching. I noticed this repo has some utils for this: https://docs.litellm.ai/docs/token_usage
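For context, a minimal sketch of how that token usage and cost can be read off a LiteLLM response, based on the linked docs (helper names such as litellm.completion_cost are assumptions about the current API and may differ by version):

```python
# Minimal sketch, assuming the helpers described at
# https://docs.litellm.ai/docs/token_usage; names may vary by LiteLLM version.
import litellm

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=64,
)

# LiteLLM returns an OpenAI-style usage block for every provider.
print(response["usage"])  # prompt_tokens, completion_tokens, total_tokens

# Estimated dollar cost of the call, per the token_usage docs.
print(litellm.completion_cost(completion_response=response))
```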
Thank you. I'm looking for reviewers. Are you on Discord?
A substitution with "litellm.completion" may not be adequate in this scenario. We may need additional checks. The OpenAI call is still preferred in most cases.
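To illustrate the kind of check being suggested, here is a hypothetical routing sketch (not the code in this PR): keep the direct OpenAI call for OpenAI models and only send other providers through litellm.completion.

```python
# Hypothetical sketch, not this PR's implementation: prefer the native OpenAI
# call for OpenAI models and route other providers through LiteLLM.
import litellm
import openai  # pre-v1 client, matching the code base at the time

OPENAI_MODELS = {"gpt-3.5-turbo", "gpt-3.5-turbo-16k", "gpt-4"}

def create_completion(model, messages):
    if model in OPENAI_MODELS:
        # Preferred path: call OpenAI directly.
        return openai.ChatCompletion.create(model=model, messages=messages)
    # Everything else goes through LiteLLM's unified interface.
    return litellm.completion(model=model, messages=messages)
```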
Another suggestion: we should integrate LiteLLM natively with OAI_CONFIG_LIST, so that users don't need to worry about the backend.
@sonichi I'm on Discord, my username is 'ishaanberri.ai'. I can DM you once you join.
@derekbar90 what do you mean by this?
@BeibinLi that's a great suggestion, I can tackle that in this PR. I can read the OAI_CONFIG_LIST and pass that to the litellm.completion call.
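A rough sketch of what that could look like, assuming autogen's config_list_from_json helper and a standard OAI_CONFIG_LIST entry (this is not the final implementation):

```python
# Rough sketch, not the PR's final code: read OAI_CONFIG_LIST and forward an
# entry's model/api_key/api_base to litellm.completion.
import autogen
import litellm

config_list = autogen.config_list_from_json("OAI_CONFIG_LIST")
entry = config_list[0]

response = litellm.completion(
    model=entry["model"],
    messages=[{"role": "user", "content": "Hello"}],
    api_key=entry.get("api_key"),
    api_base=entry.get("api_base"),  # e.g. a local or proxy endpoint
)
print(response["choices"][0]["message"]["content"])
```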
I just made it work with litellm, and it's great! But I had to change quite a lot of code in the project...
@inconnu26 You beat me to it, I was just in the middle of making some code changes - can you share the changes you made to this PR? I think @ishaan-jaff will need to give permission to work off his branch.
You are adding/importing "litellm" in "completion.py".
Hey @inconnu26, what's the best way to communicate? If there are any improvements we can make on our end to reduce code changes, I'd love to help with that. Alternatively, if you have a way I can see the code, that'd be helpful!
@sonichi @BeibinLi I would like to make progress on this PR. Can you please confirm the changes required to approve this PR?
Action Items
Next Steps
Upon completion of the action items (stated above), this PR will be merged.
The plan looks good to me. Other reviewers' comments still need to be addressed before merging.
@sonichi what are the remaining unaddressed issues?
There are unresolved conversations in the reviews.
# print("using cached response") | ||
cls._book_keeping(config, response) | ||
return response | ||
except: |
Catch a more specific exception.
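For example, the snippet under review could narrow the bare except to the errors the pre-v1 OpenAI client actually raises. The fragment below is illustrative only and reuses the config, cls, and response names from the code shown above:

```python
# Illustrative fragment mirroring the method body under review: catch the
# specific pre-v1 OpenAI client errors instead of a bare `except:`.
try:
    response = openai.ChatCompletion.create(**config)
    cls._book_keeping(config, response)
    return response
except (
    openai.error.APIError,
    openai.error.Timeout,
    openai.error.RateLimitError,
    openai.error.ServiceUnavailableError,
):
    # ... existing fallback handling for failed calls goes here ...
    ...
```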
Related: LMStudio works great for local models without changing any AutoGen code. I tried Mistral 7B on an M1 Mac.
This PR is now 3 weeks old. @ishaan-jaff and @krrishdholakia please let us know when all issues in the conversation above have been addressed.
It seems that LiteLLM supports Bedrock models?
Is there documentation available on this setup for connecting to AWS Bedrock? I was looking for a "Hello World"-style snippet to set up Titan or Claude.
@ChrisDryden here's how you can use the litellm proxy for this:
litellm --model bedrock/anthropic.claude-v2
Ensure the following vars are in your env for Bedrock: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION_NAME
Use proxy with autogen
from autogen import AssistantAgent, UserProxyAgent, oai
config_list=[
{
"model": "my-fake-model",
"api_base": "http://localhost:8000", #litellm compatible endpoint
"api_type": "open_ai",
"api_key": "NULL", # just a placeholder
}
]
response = oai.Completion.create(config_list=config_list, prompt="Hi")
print(response) # works fine
llm_config={
"config_list": config_list,
}
assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy")
user_proxy.initiate_chat(assistant, message="Plot a chart of META and TESLA stock price change YTD.", config_list=config_list)
This is incredible!
As a result of the openai v1 breaking changes, I expect a new PR will be needed, based on the latest main.
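(For reference, the breaking change in question is the move from the module-level openai.ChatCompletion.create call to the client-based API in openai>=1.0; a minimal before/after sketch:)

```python
# Minimal sketch of the openai v1 breaking change referenced above.
# Before (openai < 1.0):
#   import openai
#   response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=msgs)
# After (openai >= 1.0):
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
msgs = [{"role": "user", "content": "Hello"}]
response = client.chat.completions.create(model="gpt-3.5-turbo", messages=msgs)
print(response.choices[0].message.content)
```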
@ishaan-jaff
Hi, any updates on this?
@ishaan-jaff I love this PR. Currently AutoGen already supports the litellm proxy, but would you be interested in writing a blog post on the integration between LiteLLM and AutoGen? There are some YouTube videos on this topic: https://www.youtube.com/watch?v=y7wMTwJN7rA
Came across this comment. I am able to use the litellm proxy to get autogen working with AWS Bedrock. However, it won't work with a RAG-based setup (tutorial). It reports an error about an input parameter. Any thoughts on a workaround?
@Neo9061 thanks for pointing this out. I looked at the notebook and I think you just need to change this:
for caller in [pm, coder, reviewer]:
d_retrieve_content = caller.register_for_llm(
description="retrieve content for code generation and question answering.", api_style="function"
)(retrieve_content)
to remove. If this works, could you create a PR to fix this notebook?
I am closing this PR as it is stalled. We have LiteLLM support documented here: https://microsoft.github.io/autogen/docs/topics/non-openai-models/local-litellm-ollama
@ishaan-jaff let me know if you have any more questions.
Why are these changes needed?
This PR adds support for the above-mentioned LLMs using LiteLLM: https://github.com/BerriAI/litellm/
LiteLLM is a lightweight package that simplifies LLM API calls: use any LLM as a drop-in replacement for gpt-3.5-turbo.
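As a quick illustration of the drop-in idea (a sketch; the provider-prefixed model strings follow LiteLLM's documented convention and credentials are read from environment variables):

```python
# Sketch of LiteLLM's unified interface: the same call shape works across
# providers, with the provider encoded in the model string.
import litellm

messages = [{"role": "user", "content": "Say hello"}]

# OpenAI (reads OPENAI_API_KEY from the environment)
print(litellm.completion(model="gpt-3.5-turbo", messages=messages))

# Anthropic Claude 2 (reads ANTHROPIC_API_KEY from the environment)
print(litellm.completion(model="claude-2", messages=messages))

# Local Ollama server (assumes `ollama serve` is running)
print(litellm.completion(model="ollama/llama2", messages=messages,
                         api_base="http://localhost:11434"))
```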
Example
Related issue number
Checks