
Add support for Ollama, Palm, Claude-2, Cohere, Replicate Llama2, CodeLlama, Hugging Face (100+LLMs) - using LiteLLM #95

Closed
ishaan-jaff wants to merge 34 commits

Conversation

ishaan-jaff

Why are these changes needed?

This PR adds support for the above-mentioned LLMs using LiteLLM: https://github.com/BerriAI/litellm/
LiteLLM is a lightweight package that simplifies LLM API calls - use any LLM as a drop-in replacement for gpt-3.5-turbo.

Example

import os
from litellm import completion

## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)

# anthropic call
response = completion(model="claude-instant-1", messages=messages)

Related issue number

Checks

@ishaan-jaff ishaan-jaff changed the title Add support for Ollama, Palm, Claude-2, Cohere, Replicate Llama2 CodeLlama (100+LLMs) - using LiteLLM Add support for Ollama, Palm, Claude-2, Cohere, Replicate Llama2, CodeLlama, Hugging Face (100+LLMs) - using LiteLLM Oct 4, 2023
@ishaan-jaff
Author

Addressing:
#44
#45
#34
#46

@ishaan-jaff
Author

We also support tracking max_tokens, cost, and caching. I noticed this repo has some utils for this: https://docs.litellm.ai/docs/token_usage
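For context, a small sketch of the utilities linked above, using the token_counter and completion_cost helpers from LiteLLM's token-usage docs (treat the exact signatures as an assumption as of this PR's timeframe):

import os
from litellm import completion, completion_cost, token_counter

os.environ["OPENAI_API_KEY"] = "openai key"  # placeholder

messages = [{"role": "user", "content": "Hello, how are you?"}]

# count prompt tokens using the model's tokenizer
print(token_counter(model="gpt-3.5-turbo", messages=messages))

# estimate the dollar cost of a completed call
response = completion(model="gpt-3.5-turbo", messages=messages)
print(completion_cost(completion_response=response))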

@ishaan-jaff
Author

@sonichi @thinkall can I get a review on this PR?

@sonichi
Contributor

sonichi commented Oct 4, 2023

Thank you. I'm looking for reviewers. Are you on discord?

@derekbar90

derekbar90 commented Oct 4, 2023

Model names don’t always relate to litellm config names. Not sure if using the model name is the perfect variable fit here

@BeibinLi
Collaborator

BeibinLi commented Oct 4, 2023

A substitution with "litellm.completion" may not be adequate in this scenario. We may need additional checks. The OpenAI call is still preferred in most cases.
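To illustrate the kind of additional check being suggested, here is a rough sketch; the model allow-list and the route_completion helper are illustrative, not the PR's actual implementation, and the OpenAI call uses the pre-v1 API matching this PR's timeframe:

import openai   # pre-v1 openai client, matching this PR's timeframe
import litellm

OPENAI_MODELS = {"gpt-3.5-turbo", "gpt-4"}  # illustrative allow-list

def route_completion(model, messages, **kwargs):
    # keep the native OpenAI call as the preferred path
    if model in OPENAI_MODELS:
        return openai.ChatCompletion.create(model=model, messages=messages, **kwargs)
    # only fall back to litellm.completion for non-OpenAI providers
    return litellm.completion(model=model, messages=messages, **kwargs)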

@BeibinLi
Collaborator

BeibinLi commented Oct 4, 2023

Another suggestion is: we should integrate LiteLLM natively with OAI_CONFIG_LIST, so that users don't need to worry about the backend.

@ishaan-jaff
Author

@sonichi I'm on discord, my username is 'ishaanberri.ai'.
If you can't find that, the LiteLLM discord is here: https://discord.com/invite/wuPM9dRgDw

I can DM you once you join

@ishaan-jaff
Author

Model names don’t always relate to litellm config names. Not sure if using the model name is the perfect variable fit here

@derekbar90 what do you mean by this?

@ishaan-jaff
Author

Another suggestion is: we should integrate LiteLLM natively with OAI_CONFIG_LIST, so that users don't need to worry about the backend.

@BeibinLi that's a great suggestion; I can tackle it in this PR. I can read the OAI_CONFIG_LIST and pass it to the litellm.completion call.
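A minimal sketch of that idea, assuming the usual OAI_CONFIG_LIST JSON format and autogen's config_list_from_json helper; the straight pass-through below is illustrative rather than what this PR ultimately ships:

import autogen
import litellm

# read the standard config: a JSON list of {"model": ..., "api_key": ...} entries
config_list = autogen.config_list_from_json("OAI_CONFIG_LIST")

entry = config_list[0]  # pick the first configured backend for illustration
response = litellm.completion(
    model=entry["model"],
    api_key=entry.get("api_key"),
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)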

@AaronWard AaronWard self-requested a review October 5, 2023 11:08
@inconnu26

I just made it work with litellm, and it's great! But I had to change quite a lot of the project's code...

@AaronWard
Contributor

I just made it work with litellm, and it's great! But I had to change quite a lot of the project's code...

@inconnu26 You beat me to it; I was just in the middle of making some code changes. Can you share the changes you made to this PR? I think @ishaan-jaff will need to give permission to work off his branch.

@UweR70

UweR70 commented Oct 5, 2023

You are adding/importing "litellm" in "completion.py", but you did not adjust the requirements files (the equivalent of adding "pip install litellm")?

@krrishdholakia

Hey @inconnu26, what's the best way to communicate? If there are any improvements we can make on our end to reduce code changes, I'd love to help with that.

Alternatively if you have a way I can see the code - that'd be helpful!

@krrishdholakia

krrishdholakia commented Oct 11, 2023

@sonichi @BeibinLi we would like to make progress on this PR. Could you please sign off on the changes required to approve it?

Action Items

Next Steps

Upon completion of action items (stated above), this PR will be merged.

@sonichi
Contributor

sonichi commented Oct 11, 2023

@sonichi @BeibinLi we would like to make progress on this PR. Could you please sign off on the changes required to approve it?

Action Items

Next Steps

Upon completion of action items (stated above), this PR will be merged.

The plan looks good to me. Other reviewers' comments still need to be addressed before merging.

@krrishdholakia

krrishdholakia commented Oct 11, 2023

@sonichi what are the remaining unaddressed issues?

@sonichi
Contributor

sonichi commented Oct 12, 2023

@sonichi what are the remaining unaddressed issues?

There are unresolved conversations in the reviews.

# print("using cached response")
cls._book_keeping(config, response)
return response
except:

Catch a more specific exception.
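A minimal sketch of the suggestion, written against diskcache (which the repo's caching uses) rather than the PR's exact code; the helper name here is hypothetical:

import diskcache

def get_cached_response(cache: diskcache.Cache, key: str):
    """Sketch: catch only the errors a cache read can raise, not everything."""
    try:
        return cache[key]                  # raises KeyError if the key is not cached
    except (KeyError, diskcache.Timeout):  # instead of a bare `except:`
        return None                        # caller falls back to a live API call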

@sonichi sonichi requested a review from a team October 15, 2023 15:04
@gagb
Collaborator

gagb commented Oct 17, 2023

Related: LMStudio works great for local models without changing any AutoGen code. I tried Mistral 7B on an M1 Mac.

https://www.youtube.com/watch?v=10FCv-gCKug
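For anyone reproducing this, a hypothetical config sketch of the LM Studio setup described above, assuming LM Studio's OpenAI-compatible local server on its default port and the pre-v0.2 config keys used elsewhere in this thread:

config_list = [
    {
        "model": "mistral-7b",                   # whatever model LM Studio has loaded
        "api_base": "http://localhost:1234/v1",  # LM Studio local server endpoint (default port)
        "api_type": "open_ai",
        "api_key": "NULL",                       # placeholder; not validated locally
    }
]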

@gagb
Collaborator

gagb commented Oct 24, 2023

This PR is now 3 weeks old. @ishaan-jaff and @krrishdholakia please let us know when all issues in the conversation above have been addressed.

@gagb
Collaborator

gagb commented Oct 25, 2023

It seems that LiteLLM supports Bedrock models?

#339

@ChrisDryden

Is there documentation available on this setup for connecting to AWS Bedrock? I was looking for a "Hello World"-style snippet to set up Titan or Claude.

@ishaan-jaff
Author

ishaan-jaff commented Nov 3, 2023

@ChrisDryden here's how you can use litellm proxy for this
docs: https://docs.litellm.ai/docs/proxy_server

litellm --model bedrock/anthropic.claude-v2

ensure the following vars are in your env for bedrock: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION_NAME

Use proxy with autogen

from autogen import AssistantAgent, UserProxyAgent, oai
config_list=[
    {
        "model": "my-fake-model",
        "api_base": "http://localhost:8000",  #litellm compatible endpoint
        "api_type": "open_ai",
        "api_key": "NULL", # just a placeholder
    }
]

response = oai.Completion.create(config_list=config_list, prompt="Hi")
print(response) # works fine

llm_config={
    "config_list": config_list,
}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy")
user_proxy.initiate_chat(assistant, message="Plot a chart of META and TESLA stock price change YTD.", config_list=config_list)

@xsa-dev

xsa-dev commented Nov 12, 2023

  1. How can we boost this PR?
  2. Why hasn't it been completed yet?

This is incredible!

@sonichi
Contributor

sonichi commented Nov 12, 2023

As a result of the openai v1 breaking changes, I expect a new PR based on the latest main will be needed.

@xsa-dev

xsa-dev commented Nov 14, 2023

As a result of the openai v1 breaking changes, I expect a new PR based on the latest main will be needed.

@ishaan-jaff
Could you please provide an updated PR with the latest OpenAI endpoints version?

@austinmw

Hi, any updates on this?

[autogen.oai.completion: 11-23 15:22:06] {785} WARNING - Completion.create is deprecated in pyautogen v0.2 and openai>=1. The new openai requires initiating a client for inference. Please refer to https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference#api-unification
Traceback (most recent call last):
File "/Users/austinwelch/Desktop/autogen/test_bedrock.py", line 11, in
response = oai.Completion.create(config_list=config_list, prompt="Hi")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/austinwelch/Desktop/autogen/autogen/oai/completion.py", line 791, in create
raise ERROR
AssertionError: (Deprecated) The autogen.Completion class requires openai<1 and diskcache.
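The warning above points at autogen's v0.2 client-based API; a minimal sketch of that path against a LiteLLM-compatible endpoint like the one quoted earlier (assuming pyautogen v0.2, where the config key is base_url rather than api_base and api_type is no longer needed):

from autogen import OpenAIWrapper

config_list = [
    {
        "model": "my-fake-model",
        "base_url": "http://localhost:8000",  # litellm-compatible endpoint
        "api_key": "NULL",                    # placeholder
    }
]

client = OpenAIWrapper(config_list=config_list)
response = client.create(messages=[{"role": "user", "content": "Hi"}])
print(client.extract_text_or_completion_object(response))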

@ekzhu
Collaborator

ekzhu commented Jan 14, 2024

@ishaan-jaff I love this PR. Currently AutoGen already supports the LiteLLM proxy, but would you be interested in writing a blog post on the integration between LiteLLM and AutoGen? There are some YouTube videos on this topic: https://www.youtube.com/watch?v=y7wMTwJN7rA

@Neo9061

Neo9061 commented May 5, 2024

@ChrisDryden here's how you can use litellm proxy for this docs: https://docs.litellm.ai/docs/proxy_server

litellm --model bedrock/anthropic.claude-v2

ensure the following vars are in your env for bedrock: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION_NAME

Use proxy with autogen

from autogen import AssistantAgent, UserProxyAgent, oai
config_list=[
    {
        "model": "my-fake-model",
        "api_base": "http://localhost:8000",  #litellm compatible endpoint
        "api_type": "open_ai",
        "api_key": "NULL", # just a placeholder
    }
]

response = oai.Completion.create(config_list=config_list, prompt="Hi")
print(response) # works fine

llm_config={
    "config_list": config_list,
}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy")
user_proxy.initiate_chat(assistant, message="Plot a chart of META and TESLA stock price change YTD.", config_list=config_list)

Came across this comment. I am able to use the litellm proxy to get autogen working with AWS Bedrock. However, it won't work with the RAG-based setup (tutorial). It says the input parameter functions is not supported by Bedrock.

Any thoughts on a workaround?

@ekzhu
Collaborator

ekzhu commented May 6, 2024

I am able to use the litellm proxy to get autogen working with AWS Bedrock. However, it won't work with the RAG-based setup (tutorial). It says the input parameter functions is not supported by Bedrock.

@Neo9061 thanks for pointing this out. I looked at the notebook and I think you just need to change this:

for caller in [pm, coder, reviewer]:
    d_retrieve_content = caller.register_for_llm(
        description="retrieve content for code generation and question answering.", api_style="function"
    )(retrieve_content)

to remove api_style="function". It will then use the tool call API instead. cc @thinkall for awareness
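A minimal sketch of the suggested edit, assuming the notebook's pm, coder, and reviewer agents and its retrieve_content function:

for caller in [pm, coder, reviewer]:
    d_retrieve_content = caller.register_for_llm(
        description="retrieve content for code generation and question answering."
    )(retrieve_content)  # no api_style="function" -> uses the tool call API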

If this works could you create a PR to fix this notebook?

@ekzhu
Collaborator

ekzhu commented May 6, 2024

I am closing this PR as it is stalled. We have LiteLLM support documented here: https://microsoft.github.io/autogen/docs/topics/non-openai-models/local-litellm-ollama

@ishaan-jaff let me know if you have any more questions.

@ekzhu ekzhu closed this May 6, 2024