---
base_model: THUDM/CogVideoX-5b
datasets: finetrainers/cakeify-smol
library_name: diffusers
license: other
license_link: https://huggingface.co/THUDM/CogVideoX-5b/blob/main/LICENSE
instance_prompt: PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a knife appears and slices through the cup, revealing a cake inside. The cake turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
widget:
- text: PIKA_CAKEIFY A blue soap is placed on a modern table. Suddenly, a knife appears and slices through the soap, revealing a cake inside. The soap turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
  output:
    url: "./assets/output_0.mp4"
- text: PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
  output:
    url: "./assets/output_1.mp4"
- text: PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a knife appears and slices through the cup, revealing a cake inside. The cake turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
  output:
    url: "./assets/output_2.mp4"
tags:
- text-to-video
- diffusers-training
- diffusers
- cogvideox
- cogvideox-diffusers
- template:sd-lora
---

<Gallery />

This is a fine-tune of the [THUDM/CogVideoX-5b](https://huggingface.co/THUDM/CogVideoX-5b) model on the
[finetrainers/cakeify-smol](https://huggingface.co/datasets/finetrainers/cakeify-smol) dataset. We also provide
a LoRA variant of the parameters; see the [LoRA section](#lora).

Code: https://github.com/a-r-r-o-w/finetrainers

> [!IMPORTANT]
> This is an experimental checkpoint, and it is known to generalize poorly.

Inference code:

```py
from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the fine-tuned transformer and plug it into the base CogVideoX-5b pipeline.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/cakeify-v0", torch_dtype=torch.bfloat16
)
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# Every example prompt on this page starts with the PIKA_CAKEIFY prefix.
prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Generate 81 frames at 768x512 and write them to disk at 25 fps.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=25)
```

Training logs are available on WandB [here](https://wandb.ai/diffusion-guidance/finetrainers-cogvideox/runs/q7z660f3/).

## LoRA

We extracted a rank-64 LoRA from the fine-tuned checkpoint (script [here](./create_lora.py)). [This LoRA](./extracted_cakeify_lora_64.safetensors) can be used to emulate the same kind of effect:

<details>
<summary>Code</summary>

```py
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the base pipeline, then apply the extracted LoRA weights on top of it.
pipeline = DiffusionPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
pipeline.load_lora_weights("finetrainers/cakeify-v0", weight_name="extracted_cakeify_lora_64.safetensors")

prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Same generation settings as the full fine-tune example.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output_lora.mp4", fps=25)
```

</details>
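
The rank-64 extraction mentioned above is performed by the linked `create_lora.py` script, which remains the authoritative reference. As a rough illustration only, one common way to extract a LoRA from a fine-tuned checkpoint is to take a truncated SVD of the per-layer weight difference. The sketch below assumes that approach and runs on random toy matrices with NumPy; the function name `extract_lora`, the matrix shapes, and the even split of singular values between the two factors are assumptions for illustration, not details taken from the script.

```python
import numpy as np


def extract_lora(w_base, w_finetuned, rank):
    """Approximate (w_finetuned - w_base) by a rank-`rank` product lora_b @ lora_a."""
    delta = w_finetuned - w_base
    u, s, vh = np.linalg.svd(delta, full_matrices=False)
    # Keep the top-`rank` singular directions and split the singular
    # values evenly between the two factors (an illustrative choice).
    sqrt_s = np.sqrt(s[:rank])
    lora_b = u[:, :rank] * sqrt_s          # shape: (out_features, rank)
    lora_a = sqrt_s[:, None] * vh[:rank]   # shape: (rank, in_features)
    return lora_a, lora_b


rng = np.random.default_rng(0)
w_base = rng.standard_normal((256, 128))
# Toy "fine-tuned" weight: the base weight plus an update of rank exactly 64.
true_update = rng.standard_normal((256, 64)) @ rng.standard_normal((64, 128))
w_finetuned = w_base + true_update

lora_a, lora_b = extract_lora(w_base, w_finetuned, rank=64)
relative_error = np.linalg.norm(lora_b @ lora_a - true_update) / np.linalg.norm(true_update)
print(lora_a.shape, lora_b.shape, relative_error)
```

In this toy case the update has rank exactly 64, so the rank-64 factors reproduce it almost exactly. The weight difference of a real fine-tune is generally not low-rank, so an extracted LoRA only approximates the full fine-tune.
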

Below is a comparison between the LoRA and non-LoRA outputs (under the same settings and seed):

<table>
<thead>
<tr>
<th>Full finetune</th>
<th>LoRA</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/original_output_0.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/output_0.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
</tr>
<tr>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/original_output_1.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/output_1.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
</tr>
<tr>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/original_output_2.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/output_2.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
</tr>
</tbody>
</table>
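
The snippets on this page do not set a seed, so repeated runs will generally produce different videos. For a like-for-like comparison such as the one above, a seeded `torch.Generator` can be passed through the pipeline's `generator` argument. The sketch below is a minimal illustration; the helper name `seeded_generator` and the seed value 42 are placeholders, not the values used for the comparison videos.

```python
import torch


def seeded_generator(seed: int) -> torch.Generator:
    """Return a CPU generator with a fixed seed so runs start from the same noise."""
    return torch.Generator(device="cpu").manual_seed(seed)


# Two generators built from the same seed yield identical noise, which is
# what keeps the starting point the same across the two pipelines.
noise_a = torch.randn(4, generator=seeded_generator(42))
noise_b = torch.randn(4, generator=seeded_generator(42))
same_noise = torch.equal(noise_a, noise_b)

# With a loaded `pipeline`, `prompt`, and `negative_prompt` as in the snippets
# on this page, the generator would be passed to the call like this:
#
# video = pipeline(
#     prompt=prompt,
#     negative_prompt=negative_prompt,
#     num_frames=81,
#     height=512,
#     width=768,
#     num_inference_steps=50,
#     generator=seeded_generator(42),  # placeholder seed
# ).frames[0]
```

Creating a fresh generator for each run matters: a generator advances its state as it is used, so reusing one object across the full fine-tune and LoRA runs would give the second run different noise.
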