Local Models integration #27
Comments
+1 to this question. It makes no sense to shovel money into some closed-source service when we have a powerful GPU that can run a 13B Llama with no problem, as some of the other open-source projects do. |
I'd also be very eager to use local models with ChatDev. Llama-based models show great promise. |
Local model use and, perhaps as a more advanced feature, assigning different models to different agents in the company, so one could use a local Python-optimized model for an engineer, a llama2 model for the CEO, etc. |
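A minimal sketch of what such a per-role assignment could look like. Everything here is hypothetical: the role names, ports, and model names are placeholders, and nothing in this thread confirms that ChatDev exposes such a setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEndpoint:
    """Where one agent role sends its requests (all values are placeholders)."""
    base_url: str
    model: str


# Hypothetical mapping: one local server per specialised model.
DEFAULT_ENDPOINT = ModelEndpoint("http://127.0.0.1:8000/v1", "llama-2-13b-chat")
ROLE_ENDPOINTS = {
    "Programmer": ModelEndpoint("http://127.0.0.1:8001/v1", "python-code-model"),
    "Chief Executive Officer": DEFAULT_ENDPOINT,
}


def endpoint_for(role: str) -> ModelEndpoint:
    """Return the endpoint for a role, falling back to the default model."""
    return ROLE_ENDPOINTS.get(role, DEFAULT_ENDPOINT)


print(endpoint_for("Programmer").model)     # python-code-model
print(endpoint_for("Code Reviewer").model)  # llama-2-13b-chat (fallback)
```

A lookup table with a fallback keeps the idea simple: roles without a dedicated model silently share the default one.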
@j-loquat I love that idea. That is something I was considering more and more: AI becoming more and more like Greek gods, each with its own character and function, complementing each other. It was kind of the original vision of Altman too, but they lost their way |
No need to have one "god AGI" (which cannot be run locally as it demands crazy hardware) if we can have 20 agents with 20 different local narrow AI models that can be loaded one after another. |
Oh god, sorry Devs but this conversation is too interesting. You may need to turn notifications off XD I was trained as an artist, and the first thing to know is that limitations are the generator of creativity. A big AI with all the knowledge of the world may just become the most boring thing to touch the planet. And this may be controversial, but I think that bad qualities are needed too... everything has its meaning and use in order to create balance. Just my opinion |
This has been referenced in #33 |
OPENAI_API_BASE=http://127.0.0.1:8000/v1 OPENAI_API_KEY="dummy" python run.py --task "Snake game in pure html" --name "WebSnake" |
The command above did not work in Anaconda Prompt, but this version did:
I am having a problem using it with a local API: it looks like all that the API returns is 1 token. Text-generation-webui side:
ChatDev side
|
Yeah, the command above was for macOS; no trouble with the conda environment here. @andraz, why don't you increase the context to 4K or 8K tokens? Based on your model name, it supports a context of up to 8K tokens.
As for the one-token response, I guess it's the streaming feature, so you don't need to wait for the full response. |
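To illustrate the streaming guess: an OpenAI-compatible server in streaming mode sends many small `data:` lines, each carrying a fragment of the reply, and ends with `data: [DONE]`. A log that prints each line would therefore show about one token at a time. The sketch below joins such fragments into the full text. The sample lines are invented for illustration; the actual logs from this report were not preserved, so it is not confirmed that streaming was the cause.

```python
import json


def join_stream(lines):
    """Concatenate the text deltas from OpenAI-style streaming lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)


# Invented sample: the first chunk only announces the role, the rest carry text.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(join_stream(sample))  # Hello
```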
Hello there, I created a new yaml file with name I can successfully run it and receive answers to my questions as part of the returning object via When then I try to run chatdev on a simple task ie
After 3 retries it crashes with the following
I have already exported What can I do to successfully use the local LLM? Thanks for any help and sorry if this is the wrong place to ask it! |
What model are you using? |
@jacktang, it depends on the model but for example looks like - gpt-3.5-turbo-16k.txt (rename to @GitSimply, those are working with many of the GPT tools on my setup: WizardLM, WizardCoder, WizardCoderPy, Wizard-Vicuna, Vicuna, CodeLLaMa. |
I'm just going to add a bit about how I got ChatDev running locally with the LM Studio server, for anyone searching. It would have been really easy with clear instructions, but I had to read through all of the issues and tried to find things in the code, to no avail. Anyway, the basics:
On step 4 do this instead:
And that's it (you'll need to start the LMS server and load a model), now you can just run ChatDev like you normally would but locally. |
Hey @starkdmi, while using LocalAI git clone https://github.com/go-skynet/LocalAI After doing this in LocalAI, I am directly executing this in ChatDev and I am getting the following error: Traceback (most recent call last): The above exception was the direct cause of the following exception: Traceback (most recent call last): how can I fix this? |
@sankalp-25, the problem is the local OpenAI-compatible server, which responds incorrectly. Do you have a config file for your model in the It should look like that one so it simulates the gpt-3.5 model instead of hosting your-model. LocalAI on startup will list the models hosted and you should see the correct name (gpt-3.5/4). |
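Besides reading the startup log, the hosted model names can be checked over HTTP. This is a minimal sketch assuming the server implements the OpenAI-style `GET /v1/models` endpoint and listens on `http://127.0.0.1:8000/v1` as in the earlier command; the function names are illustrative.

```python
import json
import urllib.request


def model_ids(payload: dict) -> list:
    """Extract model names from an OpenAI-style /models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(base_url: str, timeout: float = 5.0) -> list:
    """Ask the server which models it hosts."""
    url = base_url.rstrip("/") + "/models"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return model_ids(json.load(response))


if __name__ == "__main__":
    try:
        ids = fetch_model_ids("http://127.0.0.1:8000/v1")
    except (OSError, ValueError) as exc:
        print(f"Could not list models: {exc}")
    else:
        print(ids)
        # ChatDev asks for a gpt-3.5/4 name, so one of those should be listed.
        print(any(name.startswith("gpt-") for name in ids))
```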
Hey @starkdmi, I have renamed the .yaml file to gpt-3.5-turbo-16k.yaml and the model file to gpt-3.5-turbo-16k-0613, after which I am doing as follows, and if I am not wrong config file is .yaml which I have renamed from docker-compose.yaml to gpt-3.5-turbo-16k.yaml. If I am wrong, please let me know what is the mistake. Please check the below log
after this I am trying to run the following in chatdev
The error I am getting was given in my previous comment. Thank you |
@sankalp-25, we could test that the model is working using this Python code:
import openai # https://github.com/openai/openai-python#installation
openai.api_key = "sk-dummy"
openai.api_base = "http://127.0.0.1:8000/v1"
chat_completion = openai.ChatCompletion.create(
model="gpt-3.5-turbo-16k-0613",
messages=[{"role": "user", "content": "Calculate 20 minus 5."}]
)
completion = chat_completion.choices[0].message.content
print(completion) # The result of 20 minus 5 is 15. |
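The snippet above uses the module-level `openai.ChatCompletion` interface, which was removed in openai-python 1.0. A version-independent alternative is to call the endpoint directly with the standard library. This is a sketch under the same assumptions as the snippet above (server at `http://127.0.0.1:8000/v1`, model exposed as `gpt-3.5-turbo-16k-0613`); the helper names are illustrative.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt, api_key="sk-dummy"):
    """Build a POST request for the OpenAI-style /chat/completions endpoint."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # local servers usually ignore it
        },
        method="POST",
    )


def ask(base_url, model, prompt, timeout=120.0):
    """Send one user message and return the text of the first choice."""
    request = build_chat_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]


if __name__ == "__main__":
    try:
        print(ask("http://127.0.0.1:8000/v1", "gpt-3.5-turbo-16k-0613",
                  "Calculate 20 minus 5."))
    except (OSError, ValueError, KeyError) as exc:
        print(f"Request failed: {exc}")
```

If this returns a sensible answer but ChatDev still fails, the server itself is probably fine and the problem lies in how ChatDev is pointed at it.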
@starkdmi, what is it when you say config file? Traceback (most recent call last): |
@sankalp-25, wow, the The correct content of the file named
name: gpt-3.5-turbo-16k # or gpt-3.5-turbo-16k-0613
parameters:
  model: vicuna-13b-v1.5-16k.Q5_K_M.gguf
  temperature: 0.2
  top_k: 80
  top_p: 0.7
  max_tokens: 2048
  f16: true
context_size: 16384
template:
  chat: vicuna
f16: true
gpu_layers: 32
mmap: true |
Correct. You also need one more step. |
I tried to use LM Studio as a local OpenAI substitute. It works well with the environment-variable setup suggested here:
OPENAI_API_BASE=http://127.0.0.1:1234/v1 OPENAI_API_KEY="xyz" python run.py --task "A drawing app" --name "Draw App"
However, it doesn't run to completion and terminates with an error saying that the max tokens are exceeded:
For inference I'm using the zephyr-7B-beta. Does anyone know how to fix this or what to do? |
My first thought is that this is a max token problem. |
Obviously it is. Is it because of the model? How do I raise the max tokens? |
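For context on this kind of error: the prompt and the requested reply must fit together into the model's context window, which is configured on the server side, not by the client. The sketch below estimates how many tokens remain for the reply. The 4-characters-per-token ratio and the safety margin are rough assumptions, not values taken from ChatDev or LM Studio, and real tokenizers differ per model.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumed ratio; real tokenizers differ)."""
    return max(1, round(len(text) / chars_per_token))


def completion_budget(messages, context_size: int, margin: int = 64) -> int:
    """Tokens left for the reply, or 0 if the prompt already overflows."""
    used = sum(estimate_tokens(m["content"]) for m in messages)
    return max(0, context_size - used - margin)


messages = [{"role": "user", "content": "x" * 4000}]   # about 1000 tokens
print(completion_budget(messages, context_size=4096))  # 3032
print(completion_budget(messages, context_size=1024))  # 0
```

A budget of 0 means the conversation no longer fits: either the server must load the model with a larger context window, if the model supports one, or the prompt has to shrink.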
Looks like there's an open PR to add this - #53 |
Yes, using the litellm openai-proxy, like this:
An OpenAI-compatible proxy server will run and redirect requests to ollama,
|
Have you fixed the error? I hit the same error when using chatglm3-6b as the LLM server. The server printed some red log lines at "POST /send_message HTTP/1.1" 404 Not Found, so I think the code fails because the LLM server does not respond to /send_message correctly, and the code keeps retrying until max_time. |
This should be added to the wiki or documented somewhere |
Can someone guide me on how to run this on a full Docker stack? |
To save the base and/or key in the conda environment, use this before activating it (or deactivate it and then reactivate it):
|
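I tried your suggestion. The port is, as expected, 20000 rather than 2000 (probably a typo). I'm also using chatglm3-6b-32k, with BAAI/bge-large-zh as the knowledge base. It runs, but strangely the responses are very slow: it does respond, but only after a long wait. The GPU's 80 GB of memory is enough. In the end it took 93 minutes to produce a "Snake game in pure html" that doesn't run. |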
For me it wasn't OPENAI_API_BASE, but BASE_URL. |
FYI - it's not OPEN_API_BASE. If you're using Anaconda on Windows, you do SET BASE_URL="http://localhost:1234/v1" and then SET OPEN_API_KEY="not needed". This is if you're using LM Studio. All working on my end using Mistral Instruct 7B. |
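Because the syntax for setting variables differs between shells (an inline `VAR=value command` prefix on macOS and Linux, `SET` in Anaconda Prompt on Windows), one portable option is to launch ChatDev from a small Python script. This is a sketch: `run.py`, `--task`, and `--name` come from the commands in this thread, while the helper names are illustrative. Which variable name ChatDev actually reads seems to depend on the installed versions, so the sketch sets every name reported as working here; keep only the one your install needs.

```python
import os
import subprocess
import sys


def build_env(base_url: str, api_key: str, base=None) -> dict:
    """Copy the environment and add the endpoint settings."""
    env = dict(os.environ if base is None else base)
    # Names reported as working in this thread on different setups.
    for name in ("OPENAI_API_BASE", "OPENAI_BASE_URL", "BASE_URL"):
        env[name] = base_url
    env["OPENAI_API_KEY"] = api_key
    return env


def build_command(task: str, name: str) -> list:
    """Command line for ChatDev's run.py, using the current interpreter."""
    return [sys.executable, "run.py", "--task", task, "--name", name]


if __name__ == "__main__":
    env = build_env("http://localhost:1234/v1", "not-needed")
    if os.path.exists("run.py"):
        subprocess.run(build_command("Snake game in pure html", "WebSnake"),
                       env=env, check=False)
    else:
        print("run.py not found; start this script from the ChatDev directory")
```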
If anyone has ollama integrated with this then please let me know. thanks a lot. happy coding. |
How do I go about using other services, such as Together.ai, that offer an OpenAI-compatible API? How do I set the host? |
If the API is OpenAI compatible you can point at the API endpoint using --api_base as with local models. |
Trying to get this running on a Win10 machine. .conda\envs\ChatDev_conda_env\lib\site-packages\openai_base_client.py", line 877, in _request |
I tried with BASE_URL as well as OPEN_AI_BASE but getting an APIConnectionError tenacity.RetryError: RetryError[<Future at 0x1cdcf0bce80 state=finished raised APIConnectionError>] Can you help? |
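An `APIConnectionError` generally means the client could not reach the server at all, so it can help to first check that something is listening at the configured address. A minimal sketch, assuming the LM Studio default of `http://localhost:1234/v1` used elsewhere in this thread; the helper names are illustrative. It also strips literal quotes, because cmd.exe's `SET NAME="value"` stores the quotes as part of the value.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(base_url: str):
    """Split a base URL into (host, port), tolerating stray quotes."""
    parts = urlsplit(base_url.strip().strip('"'))
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def is_listening(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the server can be opened."""
    host, port = host_and_port(base_url)
    if host is None:
        return False
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(host_and_port('"http://localhost:1234/v1"'))  # ('localhost', 1234)
print(is_listening("http://localhost:1234/v1"))
```

If `is_listening` returns False, the local server is not running or uses another port, and no choice of variable name will fix the error.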
It's not BASE_URL, it's OPENAI_BASE_URL! But I also saw at the beginning that OPENAI_API_BASE is used as well (maybe I'm wrong here, but it works for me), so the correct command is |
hi all the contributors , |
@acbp has anyone written documentation specifically for getting things like Ollama (or LocalAI, Xorbit, or OpenLLM) to work with ChatDev? LiteLLM maybe? |
Has anyone achieved this, or can anyone suggest an approach? We want to continuously index the updated code base in the HippoRAG index, query the updated index, make code changes, and repeat. We want to do this offline with Ollama-type models only and don't want to use OpenAI or Claude. Can anyone suggest how I can do this? |
anyone had success with llama.cpp? I got and
|
On a Mac (M3) I used this with a local install (Anaconda): BASE_URL=http://localhost:1234/v1 OPENAI_API_KEY=dummy python3 run.py --task "< task >" --name "<title>" |
Hey Devs,
let me start by saying that this programme is great. Well done on your work, and thanks for sharing it.
My question is: is there any plan to allow for the integration of local models?
Even just a section in the documentation would be great.
Have a good day
theWW