Multi-subject Open-set Personalization in Video Generation
Tsai-Shien Chen,
Aliaksandr Siarohin,
Willi Menapace,
Yuwei Fang,
Kwot Sin Lee,
Ivan Skorokhodov,
Kfir Aberman,
Jun-Yan Zhu,
Ming-Hsuan Yang,
Sergey Tulyakov
In this paper, we introduce MSRVTT-Personalization, a new benchmark for the task of video personalization. It is designed for accurate subject-fidelity assessment and supports various conditioning modes, including conditioning on face crops, on single or multiple arbitrary subjects, and on the combination of foreground objects and background.
We include the testing dataset and evaluation protocol in this repository. We show a test sample of MSRVTT-Personalization below:
| Ground Truth Video | Personalization Annotations |
| --- | --- |
MSRVTT-Personalization evaluates a model across five metrics:
- Text similarity (Text-S)
- Video similarity (Vid-S)
- Subject similarity (Subj-S)
- Face similarity (Face-S)
- Dynamic degree (Dync-D)
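For illustration, the subject- and face-similarity metrics can be thought of as an average per-frame embedding similarity against a reference image. Below is a minimal sketch in plain Python, assuming the embeddings have already been extracted by some image encoder; the function names and the exact aggregation are illustrative, not the benchmark's actual implementation (see the evaluation protocol in this repository for that):

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def subject_similarity(ref_embedding, frame_embeddings):
    # Average per-frame cosine similarity against the reference
    # subject embedding. In the real protocol the embeddings would
    # come from an image encoder (e.g., a CLIP- or DINO-style model
    # for subjects, a face-recognition model for face crops).
    scores = [cosine(ref_embedding, f) for f in frame_embeddings]
    return sum(scores) / len(scores)
```

Text similarity (Text-S) and video similarity (Vid-S) follow the same averaging pattern, but compare frame embeddings against a text embedding and a ground-truth video embedding, respectively.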
Quantitative evaluation:
Subject mode of MSRVTT-Personalization (condition on an entire subject image)
| Method | Text-S | Vid-S | Subj-S | Dync-D |
| --- | --- | --- | --- | --- |
| ELITE | 0.245 | 0.620 | 0.359 | - |
| VideoBooth | 0.222 | 0.612 | 0.395 | 0.448 |
| DreamVideo | 0.261 | 0.611 | 0.310 | 0.311 |
| Video Alchemist | 0.269 | 0.732 | 0.617 | 0.466 |
Face mode of MSRVTT-Personalization (condition on a face crop image)
| Method | Text-S | Vid-S | Face-S | Dync-D |
| --- | --- | --- | --- | --- |
| IP-Adapter | 0.251 | 0.648 | 0.269 | - |
| PhotoMaker | 0.278 | 0.569 | 0.189 | - |
| Magic-Me | 0.251 | 0.602 | 0.135 | 0.418 |
| Video Alchemist | 0.273 | 0.687 | 0.382 | 0.424 |
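To make the comparison above concrete, the reported subject-mode numbers can be transcribed into a dictionary and queried programmatically. This is purely an illustrative helper, not part of the evaluation code; Dync-D is omitted here because ELITE has no reported value for it:

```python
# Subject-mode results, transcribed from the table above.
subject_mode = {
    "ELITE":           {"Text-S": 0.245, "Vid-S": 0.620, "Subj-S": 0.359},
    "VideoBooth":      {"Text-S": 0.222, "Vid-S": 0.612, "Subj-S": 0.395},
    "DreamVideo":      {"Text-S": 0.261, "Vid-S": 0.611, "Subj-S": 0.310},
    "Video Alchemist": {"Text-S": 0.269, "Vid-S": 0.732, "Subj-S": 0.617},
}

def best_per_metric(results):
    # For each metric (higher is better), return the top-scoring method.
    metrics = next(iter(results.values())).keys()
    return {m: max(results, key=lambda name: results[name][m]) for m in metrics}
```

Running `best_per_metric(subject_mode)` shows Video Alchemist leading on all three metrics.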
Qualitative evaluation:
Subject mode of MSRVTT-Personalization
(Side-by-side comparison of ELITE, VideoBooth, DreamVideo, Video Alchemist, and the ground truth)
Face mode of MSRVTT-Personalization
(Side-by-side comparison of IP-Adapter, PhotoMaker, Magic-Me, Video Alchemist, and the ground truth)
To add
If you find this project useful for your research, please cite our paper. 😊
```bibtex
@article{chen2025videoalchemist,
  title   = {Multi-subject Open-set Personalization in Video Generation},
  author  = {Chen, Tsai-Shien and Siarohin, Aliaksandr and Menapace, Willi and Fang, Yuwei and Lee, Kwot Sin and Skorokhodov, Ivan and Aberman, Kfir and Zhu, Jun-Yan and Yang, Ming-Hsuan and Tulyakov, Sergey},
  journal = {arXiv preprint arXiv:2501.06187},
  year    = {2025}
}
```
Tsai-Shien Chen: [email protected]