Skip to content

Franky03/ZoomAI

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ZoomAI

Image Generation and Manipulation Pipeline

Imagem do WhatsApp de 2024-06-17 à(s) 22 26 10_03e5fdfe

Overview

This project allows users to generate and manipulate images based on textual and visual inputs. The pipeline leverages models such as ChatGPT for prompt generation and Stable Diffusion for image inpainting and generation. The core steps involve taking user inputs, generating prompts, and producing final images.

Components

  1. User Input: The pipeline accepts two types of inputs from users:
  • Text Input: User-provided text describing the desired image.
  • Image Input: User-provided image to be used as a reference.
  1. Image to Text Model: Uses Salesforce's blip-image-captioning-large model to convert image inputs into text descriptions.

  2. ChatGPT API: Utilizes OpenAI's gpt-3.5-turbo-0125 to process user prompts and generate detailed prompts for the image generation model.

  3. Database: Stores project information and image data, including user prompts, generated prompts, and images.

  4. Stable Diffusion Model: Uses StabilityAI's stable-diffusion-2-inpainting for generating and inpainting images based on the processed prompts.

  5. Producer and Consumer:

  • Producer: Handles the image generation and manipulation process.
  • Consumer: Delivers the final images to the user.

Setup

git clone https://github.com/TailUFPB/ZoomAI
cd ZoomAI
pip install -r requirements.txt

Languages

  • CSS 50.6%
  • JavaScript 27.3%
  • Python 17.5%
  • Shell 3.0%
  • PowerShell 1.1%
  • HTML 0.5%