Mini-project-YouTube-comment-generator-improved-with-RAG

To improve upon the specific domain knowledge which was not provided by QLoRA, we use RAG. This specifically extracts knowledge from a knowledge base and gives better, more technical results

Introduction

The YouTube Comment Generator project builds on the base capabilities provided by QLoRA. To overcome limitations in specific domain knowledge, we utilize RAG to extract relevant information from a knowledge base, resulting in more accurate and technical responses.

Features

Base Model: Utilizes the Mistral-7b-instruct model. Fine-Tuning: Enhanced with QLoRA for optimized performance. RAG Implementation: Integrates RAG to retrieve and generate contextually rich responses. Open Source Libraries: Employs libraries and datasets from open-source platforms such as Hugging Face.

Data Preparation

The project uses several PDF articles to build a knowledge base. These articles are moved to a designated directory and processed for use in the RAG model.

Model Training

The Mistral-7b-instruct model is fine-tuned using QLoRA and further enhanced with RAG to improve domain-specific knowledge. Key steps include:

Data Collection: Loading and processing PDF documents to create a knowledge base. Vector Storage: Storing processed documents into a vector database for efficient retrieval. Model Configuration: Setting up the model with appropriate configurations for quantization and fine-tuning. Training: Fine-tuning the model with the provided dataset and configurations. Usage

The model generates replies to YouTube comments by extracting relevant information from the knowledge base and integrating it into the response. This ensures that responses are contextually accurate and technically sound.

Example Query

An example query demonstrates how the model responds to a comment about "fat-tailedness," providing a detailed and accurate explanation using the context retrieved from the knowledge base.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
4 Ways to Quantify Fat Tails with Python _ by Shaw Talebi _ Towards Data Science.pdf		4 Ways to Quantify Fat Tails with Python _ by Shaw Talebi _ Towards Data Science.pdf
Detecting Power Laws in Real-world Data with Python _ by Shawhin Talebi _ Towards Data Science.pdf		Detecting Power Laws in Real-world Data with Python _ by Shawhin Talebi _ Towards Data Science.pdf
Improved_YouTube_commenter_by_RAG.ipynb		Improved_YouTube_commenter_by_RAG.ipynb
PDFs		PDFs
Pareto, Power Laws, and Fat Tails _ by Shaw Talebi _ Towards Data Science _ Towards Data Science.pdf		Pareto, Power Laws, and Fat Tails _ by Shaw Talebi _ Towards Data Science _ Towards Data Science.pdf
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Mini-project-YouTube-comment-generator-improved-with-RAG

About

Releases

Packages

Languages

MAvRK7/Mini-project-YouTube-comment-generator-improved-with-RAG

Folders and files

Latest commit

History

Repository files navigation

Mini-project-YouTube-comment-generator-improved-with-RAG

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages