An annotated AST that gives an LLM context on a code repository. Use this novel approach to pre-process your GitHub repository: the project generates an AST.json file that represents the repo in context, so you can share it as a single file with an LLM directly or load it into a vector store.

cameronking4/Annotated-AST-For-LLM

Greptile Clone: An Annotated AST That Gives an LLM Full Context of a Code Repository in a Single File

Achieve the functionality of Greptile by supplying ChatGPT with a generated AST file that represents your GitHub repo. Use it in a single chat, within a GPT or assistant, in a vector store, and so on. Note that the branches serve different purposes.

Try POC

Use the main branch to generate an AI-annotated AST for a local repository. Switch to the from-github branch to process an array of GitHub repositories.

yarn install

node index.js

If you're looking for an API or something to use in production, the latest implementation can be deployed to DigitalOcean or any other cloud service provider (CSP) that supports long-running operations: GreptileClone

Overview

This document describes the structure of a custom Abstract Syntax Tree (AST) for a JavaScript / TypeScript repository, and the script that generates it, for the purpose of sharing repo context with an LLM. The custom AST represents each file together with its dependencies and metadata, which supports analysis and manipulation of the codebase and suits the vector memory of a LangChain agent or OpenAI assistant. It is a scalable way to share repo knowledge with an LLM as a single JSON file, so "chat with GitHub" scenarios can be set up quickly.

Novel AST Structure

This method represents your repo as nested JSON, where each node is a file or module with the following properties:

  • file: The path to the file or module.
  • type: The type of the file (e.g., JSON, JavaScript).
  • ast: A recursive breakdown of the file's contents, including metadata such as versions, dependencies, and other relevant details.
  • summary: An AI-generated summary of the file. This is the annotation!
  • sourceCode: (Optional) The source code of the file.
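
As a rough illustration of the node shape described above, the sketch below builds one hypothetical node and checks it for the listed properties. The contents of `ast`, the sample values, and the `isValidNode` helper are assumptions made for illustration; the AST.json that the script actually produces may be organized differently.

```javascript
// Hypothetical example node; the values are illustrative, not real output.
const exampleNode = {
  file: "package.json",
  type: "JSON",
  // Assumed shape: for a JSON file, `ast` mirrors the parsed contents.
  ast: {
    name: "my-app",
    version: "1.0.0",
    dependencies: { express: "^4.18.0" },
  },
  summary: "Declares the package name, version, and runtime dependencies.",
  // sourceCode is optional and omitted here.
};

// Checks that a value carries the properties listed above.
// `sourceCode` may be absent, but must be a string when present.
function isValidNode(node) {
  if (node === null || typeof node !== "object") return false;
  const hasRequired =
    typeof node.file === "string" &&
    typeof node.type === "string" &&
    typeof node.summary === "string" &&
    "ast" in node;
  const sourceOk =
    node.sourceCode === undefined || typeof node.sourceCode === "string";
  return hasRequired && sourceOk;
}

console.log(isValidNode(exampleNode)); // true
```

A check like this can be useful before uploading the file to a vector store, since nodes without a summary carry no annotation for the LLM to use.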

Usage

Analysis

This can be used as a pre-processing step that exposes your source code and gives an LLM the full context of how your repo works. Learn more
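
One way to picture this pre-processing step is to flatten the generated tree into a compact "path: summary" listing that fits in a single prompt. The sketch below is only an assumption about how that might be done: `collectFileNodes` and `buildContext` are hypothetical helpers, and the code treats any object with a string `file` property as a file node because the exact nesting of AST.json is not specified here.

```javascript
// Collects every object that has a string `file` property, at any depth.
// Assumption: such objects are the file/module nodes of the tree.
function collectFileNodes(value, found = []) {
  if (Array.isArray(value)) {
    for (const item of value) collectFileNodes(item, found);
  } else if (value !== null && typeof value === "object") {
    if (typeof value.file === "string") found.push(value);
    for (const child of Object.values(value)) collectFileNodes(child, found);
  }
  return found;
}

// Renders one "path: summary" line per file node.
function buildContext(tree) {
  return collectFileNodes(tree)
    .map((node) => `${node.file}: ${node.summary ?? "(no summary)"}`)
    .join("\n");
}

// Hypothetical sample data standing in for a parsed AST.json.
const demoTree = [
  { file: "src/index.js", type: "JavaScript", ast: {}, summary: "Entry point." },
  { file: "package.json", type: "JSON", ast: {} },
];

console.log(buildContext(demoTree));
```

The resulting text can be pasted at the top of a chat, while the full JSON file remains available for retrieval when more detail is needed.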

Tool Integration

The AST can be integrated with tools for:

  • Static analysis: Analyze code quality, security vulnerabilities, and coding standards compliance.
  • Dependency management: Automate dependency upgrades so that dependencies stay up to date and secure.
  • Custom scripts: Write scripts that traverse the AST to automate specific tasks such as refactoring or identifying unused code.
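
A custom script of the kind mentioned in the list above could look roughly like the following, which walks the tree and counts files by `type`. The `walkNodes` and `countByType` helpers and the sample data are hypothetical, and the walker again assumes that any object with a string `file` property is a file node.

```javascript
// Calls visit(node) for every object with a string `file` property.
function walkNodes(value, visit) {
  if (Array.isArray(value)) {
    value.forEach((item) => walkNodes(item, visit));
  } else if (value !== null && typeof value === "object") {
    if (typeof value.file === "string") visit(value);
    Object.values(value).forEach((child) => walkNodes(child, visit));
  }
}

// Counts file nodes per `type` value (e.g., JSON, JavaScript).
function countByType(tree) {
  const counts = {};
  walkNodes(tree, (node) => {
    const type = node.type ?? "unknown";
    counts[type] = (counts[type] ?? 0) + 1;
  });
  return counts;
}

// Hypothetical sample data standing in for a parsed AST.json.
const sampleTree = [
  { file: "package.json", type: "JSON", ast: {}, summary: "Manifest." },
  { file: "src/index.js", type: "JavaScript", ast: {}, summary: "Entry point." },
  { file: "src/util.js", type: "JavaScript", ast: {}, summary: "Helpers." },
];

console.log(countByType(sampleTree));
```

To run it against real output, `sampleTree` could be replaced with `JSON.parse(fs.readFileSync("AST.json", "utf8"))`, assuming the generated file sits in the working directory. The same visitor pattern extends to other tasks, such as flagging files that have no summary.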
