FastMochi Model Card

Model Details

Get 8X diffusion boost for Mochi with FastVideo

FastMochi is an accelerated Mochi model. It can sample high-quality videos in 8 diffusion steps, which gives roughly an 8x speedup over the original Mochi with 64 steps.
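
The 8x figure follows from the step counts alone. As a back-of-the-envelope sketch, assuming sampling time scales linearly with the number of diffusion steps and ignoring fixed costs such as text encoding and video decoding (both are assumptions, not measurements from this card), the estimate is:

```python
def estimated_speedup(baseline_steps: int, distilled_steps: int) -> float:
    """Ratio of step counts, used as a rough proxy for wall-clock speedup.

    Assumes every diffusion step costs the same and that fixed overheads
    are negligible; real speedups will be somewhat lower.
    """
    if baseline_steps <= 0 or distilled_steps <= 0:
        raise ValueError("step counts must be positive")
    return baseline_steps / distilled_steps


# 64 steps for the original Mochi vs. 8 steps for FastMochi.
print(estimated_speedup(64, 8))  # 8.0
```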

Usage

Code
from genmo.mochi_preview.pipelines import (
    DecoderModelFactory,
    DitModelFactory,
    MochiMultiGPUPipeline,
    T5ModelFactory,
    linear_quadratic_schedule,
)
from genmo.lib.utils import save_video
import os

# Read prompts from prompt.txt, one per line, skipping blank lines.
with open("prompt.txt", "r") as f:
    prompts = [line.rstrip() for line in f if line.strip()]

# Build the multi-GPU pipeline from the downloaded weights.
pipeline = MochiMultiGPUPipeline(
    text_encoder_factory=T5ModelFactory(),
    world_size=4,
    dit_factory=DitModelFactory(
        model_path="weights/dit.safetensors", model_dtype="bf16"
    ),
    decoder_factory=DecoderModelFactory(
        model_path="weights/decoder.safetensors",
    ),
)

output_dir = "outputs"
os.makedirs(output_dir, exist_ok=True)
for i, prompt in enumerate(prompts):
    video = pipeline(
        height=480,
        width=848,
        num_frames=163,
        num_inference_steps=8,
        sigma_schedule=linear_quadratic_schedule(8, 0.1, 6),
        cfg_schedule=[1.5] * 8,
        batch_cfg=False,
        prompt=prompt,
        negative_prompt="",
        seed=12345,
    )[0]
    save_video(video, f"{output_dir}/output_{i}.mp4")
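
To make the `sigma_schedule` argument above less opaque, here is a standalone sketch of a linear-then-quadratic noise schedule. It is an illustration written for this card, not the `genmo` implementation: `sketch_linear_quadratic_schedule` is a hypothetical helper, the reading of the three arguments as (number of steps, noise threshold, number of linear steps) is an assumption, and the exact values may differ from what `linear_quadratic_schedule(8, 0.1, 6)` returns.

```python
def sketch_linear_quadratic_schedule(num_steps, threshold_noise, linear_steps):
    """Return num_steps + 1 sigmas that decrease from 1.0 to 0.0.

    The amount of noise removed grows linearly up to `threshold_noise`
    over the first `linear_steps` steps, then quadratically up to 1.0
    over the remaining steps. Illustrative only.
    """
    if not 0 < linear_steps < num_steps:
        raise ValueError("need 0 < linear_steps < num_steps")
    quadratic_steps = num_steps - linear_steps
    # Linear segment: small, evenly spaced steps.
    removed = [i * threshold_noise / linear_steps for i in range(linear_steps)]
    # Quadratic segment: steps grow as sampling approaches the clean video.
    removed += [
        threshold_noise + (1.0 - threshold_noise) * (j / quadratic_steps) ** 2
        for j in range(quadratic_steps)
    ]
    removed.append(1.0)
    # Convert "noise removed" into "noise remaining".
    return [1.0 - r for r in removed]


sigmas = sketch_linear_quadratic_schedule(8, 0.1, 6)
cfg_schedule = [1.5] * 8

# Assumed convention: one guidance scale per step, one more sigma than steps.
assert len(cfg_schedule) == 8
assert len(sigmas) == 8 + 1
print([round(s, 3) for s in sigmas])
```

With 6 of the 8 steps in the linear segment, most steps are spent at high noise levels and the last two steps cover the remaining range.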

Training details

FastMochi is consistency distilled on the MixKit dataset with the following hyperparameters:

  • Batch size: 32
  • Resolution: 480x848
  • Number of frames: 169
  • Train steps: 128
  • GPUs: 16
  • LR: 1e-6
  • Loss: Huber
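
For reference, the Huber loss listed above is quadratic for small residuals and linear for large ones, which limits the influence of outliers. The sketch below shows the standard scalar definition; the threshold `delta=1.0` is an assumption, since the card does not state the value used in training, and `huber_loss` is an illustrative helper rather than the training code.

```python
def huber_loss(prediction: float, target: float, delta: float = 1.0) -> float:
    """Standard Huber loss for a single prediction/target pair."""
    residual = abs(prediction - target)
    if residual <= delta:
        # Quadratic region: behaves like half the squared error.
        return 0.5 * residual ** 2
    # Linear region: grows like absolute error, offset to stay continuous.
    return delta * (residual - 0.5 * delta)


print(huber_loss(0.5, 0.0))  # 0.125
print(huber_loss(3.0, 0.0))  # 2.5
```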

Evaluation

We provide qualitative comparisons between FastMochi and the original Mochi, each run with 8 inference steps:

[Video pairs, not reproduced here: one pair captioned "FastMochi 6 steps" and "Mochi 6 steps", followed by four pairs captioned "FastMochi 8 step" and "Mochi 8 step".]