The idea is to generate code with the assistance of the guidance library, using open-source LLM models that run locally. The tool is exposed as a VSCode plugin that adds code-generation commands to the editor selection (invoked through right-click or the command palette).
NOTE: main is currently unstable while the use of guidance prompts is being developed (see the guidance library: https://github.com/microsoft/guidance)
WARNING: Only the 'add docstring to functions' command is somewhat stable at the moment.
Update 03.06.2023: the guidance server code has been moved to a separate repository: https://github.com/ChuloAI/andromeda-chain
If you're looking for the code version from the Medium article, try checking out v0.1.5 (more in the next section).
Starting this service now happens through Docker and should be a lot easier.
Requirements:
- docker-engine
- docker-compose v2
If using a GPU, also:
- nvidia-docker: https://github.com/NVIDIA/nvidia-docker
```bash
git clone https://github.com/paolorechia/oasis
cd oasis
mkdir models
cd models
git clone https://huggingface.co/Salesforce/codegen-350M-mono
cd ..
docker-compose -f docker-compose.cpu.yaml up
```
If you change the model, make sure to also update the injected MODEL_PATH environment variable passed to the guidance server container (line 42 at commit 4cd7da6).
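For reference, the relevant part of the compose file should look roughly like the snippet below. The service name and model path here are illustrative assumptions; check the actual docker-compose file in the repository:

```yaml
services:
  guidance-server:
    environment:
      # Point this at the model directory you cloned under ./models
      - MODEL_PATH=/models/codegen-350M-mono
```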
To run on a GPU instead, change the command to use the other docker-compose file:

```bash
docker-compose -f docker-compose.gpu.yaml up
```
If you want to use text-generation-webui with simpler prompts, use v0.1.3. This is a deprecated feature; newer versions will no longer support text-generation-webui, at least for the time being.
- Install text-generation-webui and start it with the API enabled: https://github.com/oobabooga/text-generation-webui
```bash
git clone https://github.com/paolorechia/oasis
cd oasis
git checkout v0.1.3
```
- Start the FastAPI server in `prompt_server`:
```bash
cd prompt_server
pip install -r requirements.txt
./start_uvicorn.sh
```
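Once the server is running, you can sanity-check it by opening the auto-generated FastAPI docs page. This assumes uvicorn's default port of 8000; check `start_uvicorn.sh` for the actual port:

```bash
curl http://localhost:8000/docs
```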
To run the servers manually from source, clone the repository:

```bash
git clone https://github.com/paolorechia/oasis
cd oasis
git checkout v0.1.3
```
By default, it will install PyTorch to run on the CPU.
- Start the FastAPI server in `guidance_server`:
```bash
cd guidance_server
pip install -r requirements.txt
./start_uvicorn.sh
```
This server is quite heavy on dependencies.
- Start the FastAPI server in `prompt_server`:
```bash
cd prompt_server
pip install -r requirements.txt
./start_uvicorn.sh
```
- Install the VSCode plugin called 'oasis-llamas'
- Use it!
There's no automated installation for GPU support. When setting up the `guidance_server` above, I recommend the following steps for an NVIDIA card:
- Remove `torch` from `requirements.txt`
- If needed, install the NVIDIA CUDA Toolkit: https://developer.nvidia.com/cuda-11-8-0-download-archive
- Install PyTorch following the official documentation instead (see the example command after this list): https://pytorch.org/get-started/locally/
- Install the remaining dependencies from `requirements.txt`
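At the time of writing, the official PyTorch selector produces a command along these lines for the CUDA 11.8 build; check the PyTorch site for the current one:

```bash
pip install torch --index-url https://download.pytorch.org/whl/cu118
```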
You also need to modify the source code in `guidance_server/main.py` (currently line 41), changing

```python
use_gpu_4bit = False
```

to

```python
use_gpu_4bit = True
```
How does it work?
The Oasis backend receives a pair of command/selected code from the VSCode extension frontend, and uses this input to:

- Parse the input code using the `ast` module (https://docs.python.org/3/library/ast.html)
- Find the relevant code parts in the parsed tree
- Choose a guidance prompt to apply
- Apply the guidance prompt, delegating the LLM call to the second backend service, the guidance server
- Parse the result and form an adequate response to send back to the frontend
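To make the first steps concrete, here is a minimal sketch of the ast-based decomposition, assuming the selection is a single, undecorated function with a one-line signature. The names are illustrative, not the actual Oasis source:

```python
import ast

def split_function(source: str):
    # Minimal sketch: split one function definition into header and body text,
    # so a generated docstring can be injected in between.
    tree = ast.parse(source)
    func = tree.body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("selection must be a single function definition")
    lines = source.splitlines()
    # The first body statement's (1-based) line number marks where the
    # header ends and the body begins.
    first_body_line = func.body[0].lineno
    header = "\n".join(lines[: first_body_line - 1])
    body = "\n".join(lines[first_body_line - 1 :])
    return header, body

header, body = split_function("def add(a, b):\n    return a + b\n")
print(header)  # def add(a, b):
print(body)    #     return a + b
```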
You can read more about it on Medium: https://medium.com/@paolorechia/building-oasis-a-local-code-generator-tool-using-open-source-models-and-microsofts-guidance-aef54c3e2840
There is currently no exposed config. If you want to change the loaded model, edit the source code in `guidance_server/main.py`; around lines 35-39 you will find something like:
# model = "TheBloke/wizardLM-7B-HF"
model = "Salesforce/codegen-350m-mono"
# model = "Salesforce/codegen-2b-mono"
# model = "Salesforce/codegen-6b-mono"
# model = "Salesforce/codegen-16B-mono"
Uncomment the line for the model you'd like to use.
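For context, this is roughly how such a model string ends up being used with guidance. This is a rough sketch based on the 0.0.x guidance API with handlebars-style templates, not the actual server code (which now lives in the andromeda-chain repository):

```python
import guidance

# Load the model through guidance's Transformers wrapper (0.0.x API).
llm = guidance.llms.Transformers("Salesforce/codegen-350m-mono")

# A toy docstring prompt; the real Oasis prompts are more elaborate.
program = guidance(
    '''{{header}}
    """{{gen 'docstring' stop='"""'}}"""
{{body}}''',
    llm=llm,
)

result = program(header="def add(a, b):", body="    return a + b")
print(result["docstring"])
```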
This plugin works even with the 350m-mono model version! That's currently only possible with something like the guidance library, although you should expect better results with bigger models.
Note: for best results, select exactly ONE function block to add a docstring to, with NO syntax errors.
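For instance, a good selection would be a single, syntactically valid function such as:

```python
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
```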
- The plugin currently removes extra blank lines in the function definition. This is a side effect of using the `ast.parse` function, which is used to decompose the function header from the body and inject the docstring generation: round-tripping code through the `ast` module drops formatting details such as blank lines (see the demonstration below).
- The plugin sometimes messes up the indentation of the generated docstring/input code.
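A quick way to see this `ast` behavior, using only the standard library (`ast.unparse` requires Python 3.9+):

```python
import ast

src = '''def add(a, b):

    # this comment and the blank line above will be lost
    return a + b
'''

# Parsing and re-emitting the code drops blank lines and comments,
# which is why Oasis's output loses them too.
print(ast.unparse(ast.parse(src)))
# Output:
# def add(a, b):
#     return a + b
```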