AI Vision

This repository is shared between Rissalat Ahmed and Saniya Nafees. Our goal is to build a full-stack web application with payment integration.

Our Idea

more stuff to come

The user uploads an image.
We generate a caption for the image, which becomes the input to GPT-2.
GPT-2 generates text based on that caption.
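
The flow above can be sketched as two functions chained together. This is only an illustration: `describe_image`, `fake_caption`, and `fake_generate` are hypothetical names that do not exist in this repository, and the two stand-in functions return fixed strings so the example runs without model weights. In a real setup the first would presumably wrap the captioning model and the second a GPT-2 call.

```python
from typing import Callable, Dict


def describe_image(
    image_path: str,
    caption_fn: Callable[[str], str],
    generate_fn: Callable[[str], str],
) -> Dict[str, str]:
    """Run the two-stage flow: image -> caption -> generated text."""
    caption = caption_fn(image_path)  # stage 1: caption the uploaded image
    text = generate_fn(caption)       # stage 2: the caption is the GPT-2 prompt
    return {"image": image_path, "caption": caption, "text": text}


# Hypothetical stand-ins so the sketch runs without any models installed.
def fake_caption(image_path: str) -> str:
    return "a dog running on the beach"


def fake_generate(prompt: str) -> str:
    return prompt + " while the sun sets behind it"


result = describe_image("dog.jpg", fake_caption, fake_generate)
print(result["caption"])
print(result["text"])
```

Passing the two stages in as functions keeps the web layer independent of which captioning or text model ends up being used.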

Dependencies

Python 3.6
Torch
Torchvision
SciPy 1.1.0
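
A quick way to confirm the environment before running anything is a small check script. The sketch below is not part of the repository; it assumes the packages listed above are imported as `torch`, `torchvision`, and `scipy`, and the helper names `missing_modules` and `python_matches` are made up for illustration. It only checks that the modules can be found, not their versions.

```python
import importlib.util
import sys

# Import names assumed for the packages listed above.
REQUIRED_MODULES = ["torch", "torchvision", "scipy"]


def missing_modules(names):
    """Return the names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_matches(expected=(3, 6), actual=None):
    """Compare the (major, minor) interpreter version with the expected one."""
    if actual is None:
        actual = sys.version_info
    return tuple(actual[:2]) == tuple(expected)


if not python_matches():
    print("Warning: expected Python 3.6, found {}.{}".format(*sys.version_info[:2]))

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All required packages were found.")
```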

Pretrained Model and Word Map

The pretrained model and word map come from Sagar Vinodbabu's repository.

Getting Started

Ideally, set this up in a fresh Anaconda virtual environment.

Install the dependencies and download the pretrained model and word map.
Clone the repo:

git clone https://github.com/saniyanafees6/ai-vision.git

Open a terminal in the repository folder and run the following command:

python -W ignore caption.py --img='/path/to/image.jpg' --model='path/to/BEST_checkpoint_coco_5_cap_per_img_5_min_word_freq.pth.tar' --word_map='path/to/WORDMAP_coco_5_cap_per_img_5_min_word_freq.json' --beam_size=5
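
If the captioning step is to be triggered from other Python code (for example, from a web backend), the same command can be assembled programmatically. The sketch below is an assumption about how that might look, not code from this repository: `build_caption_command` is a hypothetical helper, and it uses only the flags shown in the command above (`--img`, `--model`, `--word_map`, `--beam_size`). The paths are placeholders.

```python
import shlex
import sys


def build_caption_command(img, model, word_map, beam_size=5):
    """Build the argument list for the caption.py invocation shown above."""
    return [
        sys.executable, "-W", "ignore", "caption.py",
        "--img={}".format(img),
        "--model={}".format(model),
        "--word_map={}".format(word_map),
        "--beam_size={}".format(beam_size),
    ]


cmd = build_caption_command(
    img="/path/to/image.jpg",
    model="path/to/BEST_checkpoint_coco_5_cap_per_img_5_min_word_freq.pth.tar",
    word_map="path/to/WORDMAP_coco_5_cap_per_img_5_min_word_freq.json",
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)` from inside the cloned folder. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.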

more steps to come

