TheDenk committed
Commit ead0fb4 · verified · 1 Parent(s): 286f4f5

Update README.md

  1. README.md +60 -58
README.md CHANGED
@@ -1,58 +1,60 @@
- ---
- license: apache-2.0
- language:
- - en
- tags:
- - video
- - genmo
- - diffusers
- pipeline_tag: text-to-video
- library_name: diffusers
- ---
- # 🎥 Distilled Mochi Transformer
- Current repository contains distilled transformer for genmoai mochi-1.
- This transformer consists of 42 blocks vs 48 blocks in original transformer.
-
- ### Training details
- We have analized MSE of latent after each block and iteratively dropped blocks which have minimum value of MSE.
-
- After each block drop we have trained neighboring blocks (one before and one after deleted block) for 1K steps.
-
- ### 🚀 Try it here: [Interactive Demo](https://nim.video/create/2855fa68-21b1-4114-b366-53e5e4705ebf?workflow=image2video)
-
- ---
-
-
- ## Usage
- #### Minimal code example
- ```python
- import torch
- from diffusers import MochiPipeline, MochiTransformer3DModel
- from diffusers.utils import export_to_video
-
- transformer = MochiTransformer3DModel.from_pretrained(
-     "NimVideo/mochi-1-transformer-42",
-     torch_dtype=torch.bfloat16,
- )
- pipe = MochiPipeline.from_pretrained(
-     "genmo/mochi-1-preview",
-     transformer=transformer,
-     variant="bf16",
-     torch_dtype=torch.bfloat16
- )
-
- pipe.enable_model_cpu_offload()
- pipe.enable_vae_tiling()
-
- prompt = "Close-up of a chameleon's eye, with its scaly skin changing color. Ultra high resolution 4k."
- frames = pipe(prompt, num_frames=85).frames[0]
-
- export_to_video(frames, "mochi.mp4", fps=30)
- ```
-
-
- ## Acknowledgements
- Original code and models [mochi](https://github.com/genmoai/mochi).
-
- ## Contacts
- <p>Issues should be raised directly in the repository.</p>
 
 
 
+ ---
+ license: apache-2.0
+ language:
+ - en
+ tags:
+ - video
+ - genmo
+ - diffusers
+ pipeline_tag: text-to-video
+ library_name: diffusers
+ ---
+ # 🎥 Distilled Mochi Transformer
+ Current repository contains distilled transformer for genmoai mochi-1.
+ This transformer consists of 42 blocks vs 48 blocks in original transformer.
+
+ ### Training details
+ We have analized MSE of latent after each block and iteratively dropped blocks which have minimum value of MSE.
+
+ <video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/63fde49f6315a264aba6a7ed/ILt2OOC_La0hcQIedkNhx.mp4"></video>
+
+ After each block drop we have trained neighboring blocks (one before and one after deleted block) for 1K steps.
+
+ ### 🚀 Try it here: [Interactive Demo](https://nim.video/create/2855fa68-21b1-4114-b366-53e5e4705ebf?workflow=image2video)
+
+ ---
+
+
+ ## Usage
+ #### Minimal code example
+ ```python
+ import torch
+ from diffusers import MochiPipeline, MochiTransformer3DModel
+ from diffusers.utils import export_to_video
+
+ transformer = MochiTransformer3DModel.from_pretrained(
+     "NimVideo/mochi-1-transformer-42",
+     torch_dtype=torch.bfloat16,
+ )
+ pipe = MochiPipeline.from_pretrained(
+     "genmo/mochi-1-preview",
+     transformer=transformer,
+     variant="bf16",
+     torch_dtype=torch.bfloat16
+ )
+
+ pipe.enable_model_cpu_offload()
+ pipe.enable_vae_tiling()
+
+ prompt = "Close-up of a chameleon's eye, with its scaly skin changing color. Ultra high resolution 4k."
+ frames = pipe(prompt, num_frames=85).frames[0]
+
+ export_to_video(frames, "mochi.mp4", fps=30)
+ ```
+
+
+ ## Acknowledgements
+ Original code and models [mochi](https://github.com/genmoai/mochi).
+
+ ## Contacts
+ <p>Issues should be raised directly in the repository.</p>
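
Note on the "Training details" section of the README: the block-dropping procedure it describes can be illustrated with a small, dependency-free sketch. This is not the authors' code. It assumes that "MSE of latent after each block" means the mean squared error between a block's input and its output, so a low score marks a block that barely changes the latent. The names `mse`, `block_scores`, `prune_blocks`, and `toy_blocks` are hypothetical, the blocks are toy functions rather than Mochi transformer blocks, and the 1K-step fine-tuning of neighboring blocks is only marked by a comment.

```python
from typing import Callable, List, Sequence

Latent = List[float]
Block = Callable[[Latent], Latent]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally sized latents."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def block_scores(blocks: Sequence[Block], latent: Latent) -> List[float]:
    """Run the latent through the blocks in order and record, for each block,
    the MSE between its input and its output (assumed scoring rule)."""
    scores = []
    for block in blocks:
        out = block(latent)
        scores.append(mse(latent, out))
        latent = out
    return scores


def prune_blocks(blocks: Sequence[Block], latent: Latent, n_drop: int) -> List[int]:
    """Iteratively drop the block with the smallest score.
    Returns the original indices of the blocks that are kept."""
    kept = list(range(len(blocks)))
    for _ in range(n_drop):
        scores = block_scores([blocks[i] for i in kept], latent)
        drop_pos = min(range(len(scores)), key=scores.__getitem__)
        # The README fine-tunes the blocks before and after the dropped one
        # for 1K steps at this point; that step is omitted in this sketch.
        del kept[drop_pos]
    return kept


# Toy residual blocks: each adds a constant, so its score is that constant squared.
toy_blocks: List[Block] = [
    (lambda x, c=c: [v + c for v in x]) for c in (0.5, 0.01, 0.3, 0.02)
]

if __name__ == "__main__":
    print(prune_blocks(toy_blocks, [1.0, 2.0], n_drop=2))
```

In the README's setting the same loop would run six times (48 blocks down to 42), with scores recomputed after every drop because removing a block changes the latents seen by the blocks that follow it.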