---
language:
- "en"
tags:
- video
license: apache-2.0
pipeline_tag: text-to-video
library_name: diffusers
---

<p align="center">
<img src="assets/logo.jpg" height=30>
</p>

# FastMochi Model Card

## Model Details

<div align="center">
<table style="margin-left: auto; margin-right: auto; border: none;">
<tr>
<td>
<img src="assets/mochi-demo.gif" width="640" alt="Mochi Demo">
</td>
</tr>
<tr>
<td style="text-align:center;">
Get an 8x diffusion speedup for Mochi with FastVideo
</td>
</tr>
</table>
</div>

FastMochi is an accelerated [Mochi](https://huggingface.co/genmo/mochi-1-preview) model. It can sample high-quality videos with 8 diffusion steps, which is roughly an 8x speedup over the original Mochi with 64 steps.

- **Developed by**: [Hao AI Lab](https://hao-ai-lab.github.io/)
- **License**: Apache-2.0
- **Distilled from**: [Mochi](https://huggingface.co/genmo/mochi-1-preview)
- **GitHub Repository**: https://github.com/hao-ai-lab/FastVideo

## Usage

- Clone the [FastVideo](https://github.com/hao-ai-lab/FastVideo) repository and follow the inference instructions in its README.
- You can also run FastMochi using the official [Mochi repository](https://github.com/genmoai/mochi) with the script below and these [compatible weights](https://huggingface.co/FastVideo/FastMochi).

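The script in the collapsed Code section loads `weights/dit.safetensors` and `weights/decoder.safetensors`. One possible way to fetch the weights and check for those files is sketched here. It assumes the `huggingface_hub` package is installed; `download_weights` and `missing_weight_files` are hypothetical helper names, and the file layout inside the Hugging Face repository is not confirmed by this card, so the check may report files that live under a different path.

```python
from pathlib import Path


def download_weights(repo_id: str = "FastVideo/FastMochi", local_dir: str = "weights") -> Path:
    """Download the model repository into local_dir (needs network access)."""
    # Imported lazily so this helper can be defined without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))


def missing_weight_files(
    local_dir: str = "weights",
    expected: tuple = ("dit.safetensors", "decoder.safetensors"),
) -> list:
    """Return the expected file names that are not present in local_dir.

    The expected names are taken from the paths used in the inference script;
    the actual repository layout is an assumption.
    """
    root = Path(local_dir)
    return [name for name in expected if not (root / name).is_file()]
```

If `missing_weight_files()` returns a non-empty list after the download, adjust the `model_path` arguments in the script to match the downloaded layout.
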
<details>
<summary>Code</summary>

```python
import os

from genmo.lib.utils import save_video
from genmo.mochi_preview.pipelines import (
    DecoderModelFactory,
    DitModelFactory,
    MochiMultiGPUPipeline,
    T5ModelFactory,
    linear_quadratic_schedule,
)

# Read prompts line by line from prompt.txt, skipping empty lines.
with open("prompt.txt", "r") as f:
    prompts = [line.strip() for line in f if line.strip()]

# Build the multi-GPU pipeline; world_size is the number of GPUs to use.
pipeline = MochiMultiGPUPipeline(
    text_encoder_factory=T5ModelFactory(),
    world_size=4,
    dit_factory=DitModelFactory(
        model_path="weights/dit.safetensors", model_dtype="bf16"
    ),
    decoder_factory=DecoderModelFactory(
        model_path="weights/decoder.safetensors",
    ),
)

output_dir = "outputs"
os.makedirs(output_dir, exist_ok=True)

# Generate one video per prompt with 8 sampling steps.
for i, prompt in enumerate(prompts):
    video = pipeline(
        height=480,
        width=848,
        num_frames=163,
        num_inference_steps=8,
        sigma_schedule=linear_quadratic_schedule(8, 0.1, 6),
        cfg_schedule=[1.5] * 8,
        batch_cfg=False,
        prompt=prompt,
        negative_prompt="",
        seed=12345,
    )[0]
    save_video(video, f"{output_dir}/output_{i}.mp4")
```

</details>

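In the script, the step count of 8 appears three times: as `num_inference_steps`, as the first argument to `linear_quadratic_schedule`, and as the length of `cfg_schedule`. The sketch below shows one way to derive all three from a single value. `build_sampling_kwargs` is a hypothetical helper, not part of the Mochi or FastVideo code, and the schedule arguments `0.1` and `6` are simply copied from the script; whether they should change for other step counts is not stated in this card.

```python
def build_sampling_kwargs(num_steps: int = 8, cfg_scale: float = 1.5, schedule_fn=None) -> dict:
    """Build the step-dependent pipeline arguments from one step count."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    kwargs = {
        "num_inference_steps": num_steps,
        # One guidance scale per sampling step, as in the script.
        "cfg_schedule": [cfg_scale] * num_steps,
        "batch_cfg": False,
    }
    if schedule_fn is not None:
        # schedule_fn is expected to be linear_quadratic_schedule from the
        # Mochi repository; 0.1 and 6 are the values used in the script.
        kwargs["sigma_schedule"] = schedule_fn(num_steps, 0.1, 6)
    return kwargs
```

With the imports from the script in place, the call would then look like `pipeline(height=480, width=848, num_frames=163, prompt=prompt, negative_prompt="", seed=12345, **build_sampling_kwargs(schedule_fn=linear_quadratic_schedule))`.
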
## Training details

FastMochi is consistency distilled on the [MixKit](https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0/tree/main) dataset with the following hyperparameters:

- Batch size: 32
- Resolution: 480x848
- Number of frames: 169
- Train steps: 128
- GPUs: 16
- LR: 1e-6
- Loss: Huber

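To make two of the listed settings concrete, the sketch below writes out the standard Huber loss and estimates how many training samples the run sees. It is illustrative only: the threshold `delta=1.0` is an assumption (the card does not give one), the exact Huber variant used in FastVideo may differ, and the sample count assumes that the batch size of 32 is the global batch across all GPUs.

```python
def huber_loss(pred, target, delta: float = 1.0) -> float:
    """Mean Huber loss over two equal-length sequences of floats.

    Quadratic for errors up to delta, linear beyond it. delta=1.0 is an
    assumed default, not a value taken from the training setup.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal in length")
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err * err
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(pred)


# Values from the hyperparameter list above.
batch_size = 32
train_steps = 128

# Samples processed during distillation, assuming 32 is the global batch size.
samples_seen = batch_size * train_steps
```
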
## Evaluation

We provide qualitative comparisons between FastMochi and the original Mochi, both run with 8 inference steps:

| FastMochi 8 steps | Mochi 8 steps |
| --- | --- |
| [image: FastMochi 8 step] | [image: Mochi 8 step] |
| [image: FastMochi 8 step] | [image: Mochi 8 step] |
| [image: FastMochi 8 step] | [image: Mochi 8 step] |
| [image: FastMochi 8 step] | [image: Mochi 8 step] |