T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback

Model description 🚀

This repository contains `unet_lora.pt`, which turns ModelScopeT2V into our T2V-Turbo (MS). T2V-Turbo (MS) achieves both fast and high-quality text-to-video generation. On VBench, the 4-step generation from T2V-Turbo (MS) even outperforms proprietary systems, including Gen-2 and Pika. Please refer to our GitHub repo for detailed instructions.
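As a rough orientation, the sketch below shows one way to fetch `unet_lora.pt` and a ModelScopeT2V base model with `huggingface_hub` and `diffusers`. The base repo id `damo-vilab/text-to-video-ms-1.7b` and the direct `diffusers` usage are assumptions, not the official setup; the actual LoRA injection and 4-step consistency sampling should follow the GitHub repo.

```python
# Minimal sketch (not the official pipeline). Only the repo id
# "jiachenli-ucsb/T2V-Turbo-MS" and the filename "unet_lora.pt" come from this
# model card; everything else is an assumption.
import torch
from huggingface_hub import hf_hub_download
from diffusers import DiffusionPipeline

# Download the LoRA checkpoint hosted in this repository.
lora_path = hf_hub_download(
    repo_id="jiachenli-ucsb/T2V-Turbo-MS",
    filename="unet_lora.pt",
)
lora_state_dict = torch.load(lora_path, map_location="cpu")

# Load a ModelScopeT2V base model (assumed: the diffusers port
# "damo-vilab/text-to-video-ms-1.7b").
pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16
)
pipe.to("cuda")

# Injecting `lora_state_dict` into pipe.unet and swapping in the consistency
# sampler must follow the official GitHub repo; with that in place, 4-step
# generation would look like:
# frames = pipe("a dog wearing vr goggles on a boat", num_inference_steps=4).frames
```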

4-step Text-to-video Generation

Example prompts:

  • a dog wearing vr goggles on a boat
  • Mickey Mouse is dancing on white background
  • close-up shot, high detailed, A boy with a baseball cap, freckles, and a playful grin.
  • an old man with a long grey beard and green eyes, camera rotate anticlockwise
  • The flowing water sparkled under the golden sunrise in a peaceful mountain river.
  • close-up shot, high detailed, a girl with long curly blonde hair and sunglasses

Misuse, Malicious Use and Excessive Use 📖

Our model is meant for research purposes.

  • Generating content that is demeaning or harmful to people or to their environment, culture, religion, etc. is prohibited.
  • Generating pornographic, violent, or gory content is prohibited.
  • Generating erroneous or false information is prohibited.