
Official implementation for the paper "LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis"


LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis

video_LeviTor.mp4

Hanlin Wang1,2, Hao Ouyang2, Qiuyu Wang2, Wen Wang3,2, Ka Leong Cheng4,2, Qifeng Chen4, Yujun Shen2, Limin Wang†,1
1State Key Laboratory for Novel Software Technology, Nanjing University
2Ant Group 3Zhejiang University 4The Hong Kong University of Science and Technology
†Corresponding author

TODO List

  • Release gradio demo on huggingface.

Update Log

  • [2024.12.20] 🎉 Exciting News: Interactive demo with gradio for LeviTor has been released!

Setup

Follow the guide below to set up the environment.

  1. Clone the repository

    git clone https://github.com/qiuyu96/LeviTor.git
    cd LeviTor
    
  2. Download and unzip checkpoints
    Create the checkpoints directory:

     mkdir checkpoints
     cd checkpoints
    

    Download 'depth_anything_v2_vitl.pth' from Depth Anything V2
    Download 'sam_vit_h_4b8939.pth' from Segment Anything
    Download 'stable-video-diffusion-img2vid-xt' from stabilityai
    Create LeviTor checkpoint directory:

    mkdir LeviTor
    cd LeviTor
    

    Then download the LeviTor checkpoint from LeviTor.

    Ensure all the checkpoints are arranged in the checkpoints directory as follows (a small verification sketch is given after this setup list):

    checkpoints/
     |-- sam_vit_h_4b8939.pth
     |-- depth_anything_v2_vitl.pth
     |-- stable-video-diffusion-img2vid-xt/
     |-- LeviTor/
         |-- random_states_0.pkl
         |-- scaler.pt
         |-- scheduler.bin
         |-- controlnet/
         |-- unet/
    
  3. Create environment

    conda create -n LeviTor python=3.9 -y
    conda activate LeviTor
    
  4. Install packages

    pip install -r requirements.txt
    
  5. Install pytorch3d

    pip install "git+https://github.com/facebookresearch/pytorch3d.git"
    
  6. Install gradio

    pip install gradio==4.36.1
    
  7. Run LeviTor

    python gradio_demo/gradio_run.py \
        --frame_interval 1 \
        --num_frames 16 \
        --pretrained_model_name_or_path checkpoints/stable-video-diffusion-img2vid-xt \
        --resume_from_checkpoint checkpoints/LeviTor \
        --width 288 \
        --height 512 \
        --seed 217113 \
        --mixed_precision fp16 \
        --enable_xformers_memory_efficient_attention \
        --output_dir ./outputs \
        --gaussian_r 10 \
        --sam_path checkpoints/sam_vit_h_4b8939.pth \
        --depthanything_path checkpoints/depth_anything_v2_vitl.pth
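
Before launching the demo, it may help to sanity-check the checkpoint layout from step 2. The snippet below is a small verification sketch, not part of the repo; save it as, say, check_checkpoints.py and run it from the repository root.

    import os

    # Expected layout from step 2; adjust the paths if you placed the checkpoints elsewhere.
    required = [
        "checkpoints/sam_vit_h_4b8939.pth",
        "checkpoints/depth_anything_v2_vitl.pth",
        "checkpoints/stable-video-diffusion-img2vid-xt",
        "checkpoints/LeviTor/random_states_0.pkl",
        "checkpoints/LeviTor/scaler.pt",
        "checkpoints/LeviTor/scheduler.bin",
        "checkpoints/LeviTor/controlnet",
        "checkpoints/LeviTor/unet",
    ]

    missing = [p for p in required if not os.path.exists(p)]
    if missing:
        print("Missing checkpoints:")
        for p in missing:
            print("  -", p)
    else:
        print("All checkpoints found.")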
    

Tutorial

Please read before you try!

I. Upload the start image


Use the Upload Start Image button to upload your image.

II. Select the area you want to operate

Click the Select Area with SAM button and then click on the image to select the area you want to operate on with SAM.
Note that if the point you click is inside your area of interest, input 1 in the Add SAM Point? box; otherwise, input 0 to mark a point outside the area of interest. Use the Add SAM Point? box in this way to accurately select the area you want.
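
For intuition, the 1 and 0 inputs correspond to SAM's positive and negative point prompts. The sketch below illustrates the idea by calling the segment-anything package directly; it is not the demo's own code, and the image path and click coordinates are placeholders.

    import cv2
    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry

    # Load the same ViT-H checkpoint downloaded during setup.
    sam = sam_model_registry["vit_h"](checkpoint="checkpoints/sam_vit_h_4b8939.pth")
    predictor = SamPredictor(sam)

    # Placeholder image path; SAM expects an RGB uint8 image.
    image = cv2.cvtColor(cv2.imread("start_image.png"), cv2.COLOR_BGR2RGB)
    predictor.set_image(image)

    # Clicked points: label 1 = inside the area of interest, label 0 = outside it.
    point_coords = np.array([[320, 180], [100, 400]])
    point_labels = np.array([1, 0])

    masks, scores, _ = predictor.predict(
        point_coords=point_coords,
        point_labels=point_labels,
        multimask_output=False,
    )
    mask = masks[0]  # boolean H x W mask of the selected area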

III. Draw a 3D trajectory and run!


Click the Add New Drag Trajectory button to draw a 2D trajectory by clicking a series of points. You can then see the depth values of the clicked points in the Depths for reference box; use these as a reference when choosing the depth values you input later.
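
Conceptually, each reference depth is just the value of a monocular depth map at the clicked pixel, normalized to the range 0 to 1. The sketch below is a minimal illustration of that idea, assuming a depth map has already been predicted (for example with the Depth Anything V2 checkpoint from the setup) and following the demo's smaller-is-nearer convention:

    import numpy as np

    def reference_depths(depth_map, points):
        """Normalize a raw depth map to [0, 1] and read it at the clicked (x, y) points.

        Assumes smaller normalized values mean nearer, matching the demo's convention;
        flip the normalization if your depth model outputs inverse depth.
        """
        d = depth_map.astype(np.float32)
        d = (d - d.min()) / (d.max() - d.min() + 1e-8)
        return [float(d[y, x]) for (x, y) in points]

    # Example: query three clicked trajectory points (placeholder coordinates).
    # depths = reference_depths(depth_map, [(50, 140), (120, 150), (200, 160)])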

Then, in the Input depth values here box, enter your depth control values. All depth values should be normalized to the range 0 to 1; the smaller the value, the nearer the area moves. The number of depth values you enter must match the number of values in the Depths for reference box. Note that the depth values you input are relative: their absolute values do not matter, but the changes between them determine how much nearer or farther the object moves.
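
As a concrete example of the counting rule: if a trajectory has five clicked points and you want the object to move steadily closer, you could enter five values interpolated from a farther depth to a nearer one. The numbers below are purely illustrative:

    import numpy as np

    num_points = 5                       # must equal the number of values in Depths for reference
    start_depth, end_depth = 0.8, 0.2    # moving from farther (0.8) to nearer (0.2)

    depths = np.linspace(start_depth, end_depth, num_points).round(2).tolist()
    print(depths)  # [0.8, 0.65, 0.5, 0.35, 0.2]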

After that, input a ratio value in the Input number of points for inference here box to select how many control points will be used for inference. A small ratio is beneficial for non-rigid motion, while a large ratio is suitable for rigid motion.
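
To make the trade-off concrete, the ratio essentially decides how many of the selected mask's points act as control points. The sketch below shows the idea using simple random subsampling; the demo's actual sampling strategy may differ:

    import numpy as np

    def sample_control_points(mask, ratio, seed=None):
        """Randomly keep a `ratio` fraction of the mask's pixels as (x, y) control points.

        A small ratio gives sparse control (more freedom for non-rigid motion);
        a large ratio constrains the whole region (better for rigid motion).
        """
        rng = np.random.default_rng(seed)
        ys, xs = np.nonzero(mask)
        keep = max(1, int(len(xs) * ratio))
        idx = rng.choice(len(xs), size=keep, replace=False)
        return np.stack([xs[idx], ys[idx]], axis=1)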

Finally, click the Run button to generate your video!

IV. Others


You can also draw multiple trajectories to generate your video.

To generate orbiting effects, one suggestion is to select a reference object and give it a mostly stationary path. This fixes a reference depth value and makes it easier to control the depth changes of objects orbiting around it. For example, in the image above we set the mountain's depth value to 0.1. Then, based on the positions of the points, we vary the planet's depth: a value greater than 0.1 means the planet is behind the mountain, while a value less than 0.1 means it has moved in front of the mountain, producing the orbiting effect.
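
To put numbers on this example: with the mountain pinned at depth 0.1, the planet's depth list can oscillate around that reference value, crossing 0.1 each time it passes in front of or behind the mountain. The values below are illustrative only:

    import numpy as np

    reference_depth = 0.1   # the mostly stationary mountain
    amplitude = 0.05        # how far in front of / behind the reference the planet swings
    num_points = 8          # one depth value per clicked trajectory point

    angles = np.linspace(0, 2 * np.pi, num_points)
    planet_depths = (reference_depth + amplitude * np.cos(angles)).round(3).tolist()
    print(planet_depths)  # > 0.1 means behind the mountain, < 0.1 means in front of it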

Citation

Don't forget to cite this source if it proves useful in your research!

@article{wang2024levitor, 
	title={LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis}, 
	author={Hanlin Wang and Hao Ouyang and Qiuyu Wang and Wen Wang and Ka Leong Cheng and Qifeng Chen and Yujun Shen and Limin Wang}, 
	year={2024}, 
	eprint={2412.15214}, 
	archivePrefix={arXiv}, 
	primaryClass={cs.CV}}

Acknowledgement

Our implementation is based on

Thanks for their remarkable contributions and released code!

Note

Note: This repo is governed by the Apache 2.0 license. We strongly advise users not to knowingly generate, or allow others to knowingly generate, harmful content, including hate speech, violence, pornography, deception, etc.