Wan-AI/Wan2.1-T2V-1.3B-Diffusers

video-generation

Commit b066818 (verified) · 1 parent: 5af5c01
a-r-r-o-w (HF staff) committed 8 days ago

Update README.md

Files changed (1)

README.md +28 -2

@@ -54,13 +54,13 @@ This repository hosts our T2V-1.3B model, a versatile solution for video generat
     - [x] Multi-GPU Inference code of the 14B and 1.3B models
     - [x] Checkpoints of the 14B and 1.3B models
     - [x] Gradio demo
-    - [ ] Diffusers integration
+    - [x] Diffusers integration
     - [ ] ComfyUI integration
 - Wan2.1 Image-to-Video
     - [x] Multi-GPU Inference code of the 14B model
     - [x] Checkpoints of the 14B model
     - [x] Gradio demo
-    - [ ] Diffusers integration
+    - [x] Diffusers integration
     - [ ] ComfyUI integration
@@ -163,6 +163,32 @@ pip install "xfuser>=0.4.1"
 torchrun --nproc_per_node=8 generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --dit_fsdp --t5_fsdp --ulysses_size 8 --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
 ```
+Wan can also be run directly using 🤗 Diffusers!
+```python
+import torch
+from diffusers import AutoencoderKLWan, WanPipeline
+from diffusers.utils import export_to_video
+# Available models: Wan-AI/Wan2.1-T2V-14B-Diffusers, Wan-AI/Wan2.1-T2V-1.3B-Diffusers
+model_id = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"
+vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
+pipe = WanPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
+pipe.to("cuda")
+prompt = "A cat walks on the grass, realistic"
+negative_prompt = "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards"
+output = pipe(
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    height=480,
+    width=832,
+    num_frames=81,
+    guidance_scale=5.0
+).frames[0]
+export_to_video(output, "output.mp4", fps=15)
+```
 ##### (2) Using Prompt Extention
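
Note on the Diffusers snippet added in this commit: it generates `num_frames=81` and exports at `fps=15`, so the resulting clip should run for about 81 / 15 = 5.4 seconds. The sketch below only illustrates that arithmetic. The helper name `clip_duration_seconds` is hypothetical and is not part of Diffusers or the Wan codebase.

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Return the playback length of a clip, in seconds.

    Hypothetical helper for illustration only; it is not a Diffusers API.
    """
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must both be positive")
    return num_frames / fps


# Values taken from the snippet in the diff: num_frames=81, fps=15.
duration = clip_duration_seconds(num_frames=81, fps=15)
print(f"Clip length: {duration:.1f} s")
```

Changing either `num_frames` in the pipeline call or `fps` in `export_to_video` changes the clip length accordingly, so the two values are worth choosing together.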