
Add support for Ollama, Palm, Claude-2, Cohere, Replicate Llama2, CodeLlama, Hugging Face (100+LLMs) - using LiteLLM #95

Closed
ishaan-jaff wants to merge 34 commits

Conversation

ishaan-jaff

Why are these changes needed?

This PR adds support for the above-mentioned LLMs using LiteLLM: https://github.com/BerriAI/litellm/
LiteLLM is a lightweight package that simplifies LLM API calls - use any LLM as a drop-in replacement for gpt-3.5-turbo.

Example

import os
from litellm import completion

## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)

# anthropic call
response = completion(model="claude-instant-1", messages=messages)

Related issue number

Checks

@ishaan-jaff ishaan-jaff changed the title Add support for Ollama, Palm, Claude-2, Cohere, Replicate Llama2 CodeLlama (100+LLMs) - using LiteLLM Add support for Ollama, Palm, Claude-2, Cohere, Replicate Llama2, CodeLlama, Hugging Face (100+LLMs) - using LiteLLM Oct 4, 2023
@ishaan-jaff
Author

Addressing:
#44
#45
#34
#46

@ishaan-jaff
Author

We also support tracking max_tokens, cost, and caching. I noticed this repo has some utils for this: https://docs.litellm.ai/docs/token_usage
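For context, a small sketch of the utilities linked above, using the token_counter and completion_cost helpers from LiteLLM's token-usage docs (treat the exact signatures as an assumption as of this PR's timeframe):

import os
from litellm import completion, completion_cost, token_counter

os.environ["OPENAI_API_KEY"] = "openai key"  # placeholder

messages = [{"role": "user", "content": "Hello, how are you?"}]

# count prompt tokens using the model's tokenizer
print(token_counter(model="gpt-3.5-turbo", messages=messages))

# estimate the dollar cost of a completed call
response = completion(model="gpt-3.5-turbo", messages=messages)
print(completion_cost(completion_response=response))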

@ishaan-jaff
Author

@sonichi @thinkall can I get a review on this PR?

@sonichi
Contributor

sonichi commented Oct 4, 2023

Thank you. I'm looking for reviewers. Are you on discord?

@derekbar90

derekbar90 commented Oct 4, 2023

Model names don’t always relate to litellm config names. Not sure if using the model name is the perfect variable fit here

@BeibinLi
Collaborator

BeibinLi commented Oct 4, 2023

A substitution with "litellm.completion" may not be adequate in this scenario. We may need additional checks. The OpenAI call is still preferred in most cases.
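To illustrate the kind of additional check being suggested, here is a rough sketch; the model allow-list and the route_completion helper are illustrative, not the PR's actual implementation, and the OpenAI call uses the pre-v1 API matching this PR's timeframe:

import openai   # pre-v1 openai client, matching this PR's timeframe
import litellm

OPENAI_MODELS = {"gpt-3.5-turbo", "gpt-4"}  # illustrative allow-list

def route_completion(model, messages, **kwargs):
    # keep the native OpenAI call as the preferred path
    if model in OPENAI_MODELS:
        return openai.ChatCompletion.create(model=model, messages=messages, **kwargs)
    # only fall back to litellm.completion for non-OpenAI providers
    return litellm.completion(model=model, messages=messages, **kwargs)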

@BeibinLi
Collaborator

BeibinLi commented Oct 4, 2023

Another suggestion is: we should integrate LiteLLM natively with OAI_CONFIG_LIST, so that users don't need to worry about the backend.

@ishaan-jaff
Author

@sonichi I'm on discord, my username is 'ishaanberri.ai'.
If you can't find that, the LiteLLM discord is here: https://discord.com/invite/wuPM9dRgDw

I can DM you once you join

@ishaan-jaff
Author

Model names don’t always relate to litellm config names. Not sure if using the model name is the perfect variable fit here

@derekbar90 what do you mean by this?

@ishaan-jaff
Author

Another suggestion is: we should integrate LiteLLM natively with OAI_CONFIG_LIST, so that users don't need to worry about the backend.

@BeibinLi that's a great suggestion; I can tackle it in this PR. I can read the OAI_CONFIG_LIST and pass it to the litellm.completion call.
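A minimal sketch of that idea, assuming the usual OAI_CONFIG_LIST JSON format and autogen's config_list_from_json helper; the straight pass-through below is illustrative rather than what this PR ultimately ships:

import autogen
import litellm

# read the standard config: a JSON list of {"model": ..., "api_key": ...} entries
config_list = autogen.config_list_from_json("OAI_CONFIG_LIST")

entry = config_list[0]  # pick the first configured backend for illustration
response = litellm.completion(
    model=entry["model"],
    api_key=entry.get("api_key"),
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)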

@AaronWard AaronWard self-requested a review October 5, 2023 11:08
@inconnu26

I just made it work with litellm, and it's great! But I had to change quite a lot of the project's code...

@AaronWard
Contributor

I just made it work with litellm, and it's great! But I had to change quite a lot of the project's code...

@inconnu26 You beat me to it; I was just in the middle of making some code changes. Can you share the changes you made to this PR? I think @ishaan-jaff will need to give permission to work off his branch.

@UweR70

UweR70 commented Oct 5, 2023

You are adding/importing "litellm" in "completion.py", but you did not adjust the requirements files (the equivalent of adding "pip install litellm")?

@krrishdholakia

Hey @inconnu26, what's the best way to communicate? If there are any improvements we can make on our end to reduce code changes, I'd love to help with that.

Alternatively if you have a way I can see the code - that'd be helpful!

@krrishdholakia

krrishdholakia commented Oct 11, 2023

@sonichi @BeibinLi we would like to make progress on this PR. Could you please sign off on the changes required to approve it?

Action Items

Next Steps

Upon completion of action items (stated above), this PR will be merged.

@sonichi
Contributor

sonichi commented Oct 11, 2023

@sonichi @BeibinLi we would like to make progress on this PR. Could you please sign off on the changes required to approve it?

Action Items

Next Steps

Upon completion of action items (stated above), this PR will be merged.

The plan looks good to me. Other reviewers' comments still need to be addressed before merging.

@krrishdholakia

krrishdholakia commented Oct 11, 2023

@sonichi what are the remaining unaddressed issues?

@sonichi
Contributor

sonichi commented Oct 12, 2023

@sonichi what are the remaining unaddressed issues?

There are unresolved conversations in the reviews.

# print("using cached response")
cls._book_keeping(config, response)
return response
except:

Catch a more specific exception.
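A minimal sketch of the suggestion, written against diskcache (which the repo's caching uses) rather than the PR's exact code; the helper name here is hypothetical:

import diskcache

def get_cached_response(cache: diskcache.Cache, key: str):
    """Sketch: catch only the errors a cache read can raise, not everything."""
    try:
        return cache[key]                  # raises KeyError if the key is not cached
    except (KeyError, diskcache.Timeout):  # instead of a bare `except:`
        return None                        # caller falls back to a live API call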

@sonichi sonichi requested a review from a team October 15, 2023 15:04
@gagb
Collaborator

gagb commented Oct 17, 2023

Related: LMStudio works great for local models without changing any AutoGen code. I tried Mistral 7B on an M1 Mac.

https://www.youtube.com/watch?v=10FCv-gCKug
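For anyone reproducing this, a hypothetical config sketch of the LM Studio setup described above, assuming LM Studio's OpenAI-compatible local server on its default port and the pre-v0.2 config keys used elsewhere in this thread:

config_list = [
    {
        "model": "mistral-7b",                   # whatever model LM Studio has loaded
        "api_base": "http://localhost:1234/v1",  # LM Studio local server endpoint (default port)
        "api_type": "open_ai",
        "api_key": "NULL",                       # placeholder; not validated locally
    }
]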

@gagb
Collaborator

gagb commented Oct 24, 2023

This PR is now 3 weeks old. @ishaan-jaff and @krrishdholakia please let us know when all issues in the conversation above have been addressed.

@gagb
Collaborator

gagb commented Oct 25, 2023

It seems that LiteLLM supports Bedrock models?

#339

@ChrisDryden

Is there documentation available on this setup for connecting to AWS Bedrock? I was looking for a "Hello World"-style snippet to set up Titan or Claude.

@ishaan-jaff
Author

ishaan-jaff commented Nov 3, 2023

@ChrisDryden here's how you can use litellm proxy for this
docs: https://docs.litellm.ai/docs/proxy_server

litellm --model bedrock/anthropic.claude-v2

ensure the following vars are in your env for bedrock: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION_NAME

Use proxy with autogen

from autogen import AssistantAgent, UserProxyAgent, oai
config_list=[
    {
        "model": "my-fake-model",
        "api_base": "http://localhost:8000",  #litellm compatible endpoint
        "api_type": "open_ai",
        "api_key": "NULL", # just a placeholder
    }
]

response = oai.Completion.create(config_list=config_list, prompt="Hi")
print(response) # works fine

llm_config={
    "config_list": config_list,
}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy")
user_proxy.initiate_chat(assistant, message="Plot a chart of META and TESLA stock price change YTD.", config_list=config_list)

@xsa-dev

xsa-dev commented Nov 12, 2023

  1. How can we boost this PR?
  2. Why hasn't it been completed yet?

This is incredible!

@sonichi
Contributor

sonichi commented Nov 12, 2023

As a result of the openai v1 breaking changes, I expect a new PR based on the latest main will be needed.

@xsa-dev

xsa-dev commented Nov 14, 2023

As a result of the openai v1 breaking changes, I expect a new PR based on the latest main will be needed.

@ishaan-jaff
Could you please provide an updated PR with the latest OpenAI endpoints version?

@austinmw

Hi, any updates on this?

[autogen.oai.completion: 11-23 15:22:06] {785} WARNING - Completion.create is deprecated in pyautogen v0.2 and openai>=1. The new openai requires initiating a client for inference. Please refer to https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference#api-unification
Traceback (most recent call last):
File "/Users/austinwelch/Desktop/autogen/test_bedrock.py", line 11, in
response = oai.Completion.create(config_list=config_list, prompt="Hi")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/austinwelch/Desktop/autogen/autogen/oai/completion.py", line 791, in create
raise ERROR
AssertionError: (Deprecated) The autogen.Completion class requires openai<1 and diskcache.
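The warning above points at autogen's v0.2 client-based API; a minimal sketch of that path against a LiteLLM-compatible endpoint like the one quoted earlier (assuming pyautogen v0.2, where the config key is base_url rather than api_base and api_type is no longer needed):

from autogen import OpenAIWrapper

config_list = [
    {
        "model": "my-fake-model",
        "base_url": "http://localhost:8000",  # litellm-compatible endpoint
        "api_key": "NULL",                    # placeholder
    }
]

client = OpenAIWrapper(config_list=config_list)
response = client.create(messages=[{"role": "user", "content": "Hi"}])
print(client.extract_text_or_completion_object(response))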

@ekzhu
Collaborator

ekzhu commented Jan 14, 2024

@ishaan-jaff I love this PR. Currently AutoGen already supports the LiteLLM proxy, but would you be interested in writing a blog post on the integration between LiteLLM and AutoGen? There are some YouTube videos on this topic: https://www.youtube.com/watch?v=y7wMTwJN7rA

@Neo9061

Neo9061 commented May 5, 2024

@ChrisDryden here's how you can use litellm proxy for this docs: https://docs.litellm.ai/docs/proxy_server

litellm --model bedrock/anthropic.claude-v2

ensure the following vars are in your env for bedrock: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION_NAME

Use proxy with autogen

from autogen import AssistantAgent, UserProxyAgent, oai
config_list=[
    {
        "model": "my-fake-model",
        "api_base": "http://localhost:8000",  #litellm compatible endpoint
        "api_type": "open_ai",
        "api_key": "NULL", # just a placeholder
    }
]

response = oai.Completion.create(config_list=config_list, prompt="Hi")
print(response) # works fine

llm_config={
    "config_list": config_list,
}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy")
user_proxy.initiate_chat(assistant, message="Plot a chart of META and TESLA stock price change YTD.", config_list=config_list)

Came across this comment. I am able to use the litellm proxy to get autogen working with AWS Bedrock. However, it won't work with the RAG-based setup (tutorial). It says the input parameter functions is not supported by Bedrock.

Any thoughts on a workaround?

@ekzhu
Collaborator

ekzhu commented May 6, 2024

I am able to use the litellm proxy to get autogen working with AWS Bedrock. However, it won't work with the RAG-based setup (tutorial). It says the input parameter functions is not supported by Bedrock.

@Neo9061 thanks for pointing this out. I looked at the notebook and I think you just need to change this:

for caller in [pm, coder, reviewer]:
    d_retrieve_content = caller.register_for_llm(
        description="retrieve content for code generation and question answering.", api_style="function"
    )(retrieve_content)

to remove api_style="function". It will then use the tool call API instead. cc @thinkall for awareness
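A minimal sketch of the suggested edit, assuming the notebook's pm, coder, and reviewer agents and its retrieve_content function:

for caller in [pm, coder, reviewer]:
    d_retrieve_content = caller.register_for_llm(
        description="retrieve content for code generation and question answering."
    )(retrieve_content)  # no api_style="function" -> uses the tool call API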

If this works could you create a PR to fix this notebook?

@ekzhu
Collaborator

ekzhu commented May 6, 2024

I am closing this PR as it is stalled. We have LiteLLM support documented here: https://microsoft.github.io/autogen/docs/topics/non-openai-models/local-litellm-ollama

@ishaan-jaff let me know if you have any more questions.

@ekzhu ekzhu closed this May 6, 2024