I2VGEN-XL is a cascaded diffusion model for generating videos from image and text inputs. Its cascaded architecture uses diffusion processes to produce high-quality, coherent video outputs. This repository provides an easy-to-use Jupyter Notebook for getting started with I2VGEN-XL.
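As a rough illustration of the cascaded idea, the toy sketch below passes an image and a prompt through two stages: a base stage that produces coarse frames and a refinement stage that raises their resolution. Everything here is a stand-in. The function names `base_stage` and `refine_stage`, the two-stage split, and the pixel arithmetic are assumptions made for illustration; no diffusion is performed and this is not the model's actual code.

```python
from typing import List

Frame = List[List[float]]  # a tiny grayscale image: rows of pixel values in [0, 1]


def base_stage(image: Frame, prompt: str, num_frames: int) -> List[Frame]:
    """Stand-in for a first stage: turn the inputs into coarse frames.

    Each frame is the input image brightened a little more than the previous
    one, only so that the data flow is visible. The prompt is ignored here.
    """
    return [
        [[min(1.0, px + 0.1 * t) for px in row] for row in image]
        for t in range(num_frames)
    ]


def refine_stage(frames: List[Frame], scale: int) -> List[Frame]:
    """Stand-in for a second stage: upscale every coarse frame.

    Uses nearest-neighbour repetition in place of a learned refinement model.
    """
    refined = []
    for frame in frames:
        upscaled = []
        for row in frame:
            wide_row = [px for px in row for _ in range(scale)]
            upscaled.extend(list(wide_row) for _ in range(scale))
        refined.append(upscaled)
    return refined


image = [[0.0, 0.5], [0.5, 1.0]]
coarse = base_stage(image, "Clouds moving across the sky", num_frames=3)
video = refine_stage(coarse, scale=2)
print(len(video), len(video[0]), len(video[0][0]))  # 3 frames, each 4x4
```

The point of the sketch is only the shape of the pipeline: the second stage consumes the output of the first, so quality is refined step by step rather than produced in a single pass.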
- Multimodal Inputs: Combines images and text to create dynamic video outputs.
- Cascaded Architecture: Utilizes a hierarchical approach to ensure progressively refined video quality.
- Lightweight Setup: Run directly in Jupyter Notebook with minimal setup requirements.
- A Kaggle account for running the notebook on Kaggle’s GPU-enabled environment.
- Basic familiarity with Python and Jupyter Notebooks.
- Download the Notebook:
  - Clone the repository or download the `IVGEN-XL.ipynb` file directly:
        git clone https://github.com/MOSTAFA1172m/Image-text-video-IVGENXL.git
        cd Image-text-video-IVGENXL
- Upload the Notebook to Kaggle:
  - Go to Kaggle.
  - Create a new notebook and upload the `IVGEN-XL.ipynb` file.
- Enable GPU:
  - In your Kaggle notebook, navigate to Settings > Accelerator and select GPU.
- Install Dependencies:
  - Run the following command in a code cell:
        !pip install -r requirements.txt
- Run the Notebook:
  - Follow the step-by-step instructions provided in the notebook to input images and text, and generate videos.
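Before running the generation cells, it can help to confirm that the accelerator chosen in the notebook settings is actually visible to Python. The following is a minimal sketch, assuming PyTorch is among the notebook's dependencies (this README does not confirm that); the helper name `describe_accelerator` is hypothetical.

```python
import shutil


def describe_accelerator() -> str:
    """Return a short, human-readable description of the available accelerator."""
    try:
        import torch  # assumption: PyTorch is installed in the notebook environment
    except ImportError:
        # Without PyTorch, fall back to looking for the NVIDIA driver tool on PATH.
        if shutil.which("nvidia-smi"):
            return "nvidia-smi found, but PyTorch is not installed"
        return "No GPU detected (PyTorch not installed, nvidia-smi not found)"

    if torch.cuda.is_available():
        return f"GPU available: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed, but no CUDA GPU is visible"


print(describe_accelerator())
```

If the message reports that no GPU is visible, revisit the accelerator setting and restart the session before continuing.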
Here are some example outputs generated by the model. The results showcase how the model processes various inputs:
Example 1 | Example 2 |
---|---|
Description: Newton smiling and waving. | Description: Mona Lisa laughing. |
Example 3 | Example 4 |
---|---|
Description: Car driving on the road. | Description: Sunset over the sea. |
Example 5 |
---|
Description: Clouds moving across the sky over the mountains. |
On Kaggle, you can enable a GPU accelerator for faster results.
Feel free to experiment with different inputs and see how the model generates videos in response.
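For experimenting outside the notebook, one possible route is the `I2VGenXLPipeline` class from the Hugging Face `diffusers` library. The sketch below is an assumption about how such a script could look, not a description of what the notebook does: the checkpoint name `ali-vilab/i2vgen-xl`, the helper name `generate_clip`, and the step count and guidance scale are illustrative choices, and the code needs `torch`, `diffusers`, and `accelerate` installed plus a GPU to run in reasonable time.

```python
from pathlib import Path

# Assumed checkpoint on the Hugging Face Hub; the notebook may load the model differently.
MODEL_ID = "ali-vilab/i2vgen-xl"


def generate_clip(image_path: str, prompt: str,
                  output_path: str = "clip.gif", seed: int = 0) -> str:
    """Generate a short clip from one image and one text prompt; return the GIF path."""
    # Fail early on a missing input, before any heavy library is loaded.
    if not Path(image_path).is_file():
        raise FileNotFoundError(f"Input image not found: {image_path}")

    # Heavy imports stay inside the function so this file can be imported without them.
    import torch
    from diffusers import I2VGenXLPipeline
    from diffusers.utils import export_to_gif, load_image

    pipeline = I2VGenXLPipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, variant="fp16"
    )
    pipeline.enable_model_cpu_offload()  # lowers peak GPU memory use

    image = load_image(image_path).convert("RGB")
    generator = torch.manual_seed(seed)  # fixed seed for repeatable output

    frames = pipeline(
        prompt=prompt,
        image=image,
        num_inference_steps=50,   # illustrative value
        guidance_scale=9.0,       # illustrative value
        generator=generator,
    ).frames[0]
    return export_to_gif(frames, output_path)
```

For example, `generate_clip("newton.png", "Newton smiling and waving")` would write `clip.gif`, where `newton.png` is a placeholder for your own image. The helper reloads the pipeline on every call, so when trying many prompts it is worth loading the pipeline once and reusing it.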
This project is licensed under the MIT License. See the LICENSE file for details.
For any questions or feedback, feel free to reach out to me on LinkedIn.
Note: Ensure the GPU runtime is enabled before running the notebook to avoid performance issues.