	---
	base_model: THUDM/CogVideoX-5b
	datasets: finetrainers/cakeify-smol
	library_name: diffusers
	license: other
	license_link: https://huggingface.co/THUDM/CogVideoX-5b/blob/main/LICENSE
	instance_prompt: PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a knife appears and slices through the cup, revealing a cake inside. The cake turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
	widget:
	- text: PIKA_CAKEIFY A blue soap is placed on a modern table. Suddenly, a knife appears and slices through the soap, revealing a cake inside. The soap turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
	  output:
	    url: "./assets/output_0.mp4"
	- text: PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
	  output:
	    url: "./assets/output_1.mp4"
	- text: PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a knife appears and slices through the cup, revealing a cake inside. The cake turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
	  output:
	    url: "./assets/output_2.mp4"
	tags:
	- text-to-video
	- diffusers-training
	- diffusers
	- cogvideox
	- cogvideox-diffusers
	- template:sd-lora
	---

	<Gallery />

	This is a fine-tune of the [THUDM/CogVideoX-5b](https://huggingface.co/THUDM/CogVideoX-5b) model on the
	[finetrainers/cakeify-smol](https://huggingface.co/datasets/finetrainers/cakeify-smol) dataset. We also provide
	a LoRA variant of the weights; see the [LoRA section](#lora).

	Code: https://github.com/a-r-r-o-w/finetrainers

	> [!IMPORTANT]
	> This is an experimental checkpoint, and it is known to generalize poorly.

	Inference code:

	```py
	from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
	from diffusers.utils import export_to_video
	import torch

	# Load the fine-tuned transformer, then plug it into the base CogVideoX-5b pipeline.
	transformer = CogVideoXTransformer3DModel.from_pretrained(
	    "finetrainers/cakeify-v0", torch_dtype=torch.bfloat16
	)
	pipeline = DiffusionPipeline.from_pretrained(
	    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
	).to("cuda")

	prompt = """
	PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
	"""
	negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

	# Generate 81 frames at 512x768, then write them to disk at 25 fps.
	video = pipeline(
	    prompt=prompt,
	    negative_prompt=negative_prompt,
	    num_frames=81,
	    height=512,
	    width=768,
	    num_inference_steps=50,
	).frames[0]
	export_to_video(video, "output.mp4", fps=25)
	```
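
Every example prompt in this card starts with `PIKA_CAKEIFY`, which presumably acts as the trigger token for the effect, and the simpler prompts follow a fixed sentence pattern. The helper below is a minimal sketch of that pattern. `build_cakeify_prompt` and its arguments are hypothetical names, not part of finetrainers or diffusers, and free-form prompts such as the one above can be used instead.

```python
TRIGGER = "PIKA_CAKEIFY"  # prefix shared by every example prompt in this card


def build_cakeify_prompt(subject: str, setting: str) -> str:
    """Fill in the sentence pattern used by the card's simpler example prompts.

    `subject` is the object to be sliced (e.g. "blue soap") and `setting` is
    where it sits (e.g. "a modern table"). Both names are illustrative.
    """
    noun = subject.split()[-1]  # "blue soap" -> "soap", reused in later sentences
    article = "An" if subject[0].lower() in "aeiou" else "A"
    return (
        f"{TRIGGER} {article} {subject} is placed on {setting}. "
        f"Suddenly, a knife appears and slices through the {noun}, "
        "revealing a cake inside. "
        f"The {noun} turns into a hyper-realistic prop cake, showcasing the "
        "creative transformation of everyday objects into something "
        "unexpected and delightful."
    )


prompt = build_cakeify_prompt("blue soap", "a modern table")
print(prompt)
```

The resulting string can be passed as `prompt` to the pipeline call above. Given the checkpoint's limited generalization, prompts that stay close to this pattern are probably the safest starting point.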

	Training logs are available on WandB [here](https://wandb.ai/diffusion-guidance/finetrainers-cogvideox/runs/q7z660f3/).

	## LoRA

	We extracted a rank-64 LoRA from the finetuned checkpoint (script [here](./create_lora.py)). [This LoRA](./extracted_cakeify_lora_64.safetensors) can be used to reproduce the same kind of effect:

	<details>
	<summary>Code</summary>

	```py
	from diffusers import DiffusionPipeline
	from diffusers.utils import export_to_video
	import torch

	# Load the base CogVideoX-5b pipeline, then attach the extracted LoRA weights.
	pipeline = DiffusionPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
	pipeline.load_lora_weights("finetrainers/cakeify-v0", weight_name="extracted_cakeify_lora_64.safetensors")

	prompt = """
	PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
	"""
	negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

	# Same generation settings as the full fine-tune example, for a fair comparison.
	video = pipeline(
	    prompt=prompt,
	    negative_prompt=negative_prompt,
	    num_frames=81,
	    height=512,
	    width=768,
	    num_inference_steps=50,
	).frames[0]
	export_to_video(video, "output_lora.mp4", fps=25)
	```

	</details>
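
Both snippets request 81 frames at 512x768 and export at 25 fps. The sketch below works out what that implies for clip length and latent size. It assumes the CogVideoX VAE compresses time by 4x and each spatial dimension by 8x; `clip_summary` is a hypothetical helper, not a diffusers function.

```python
def clip_summary(num_frames, height, width, fps, temporal_factor=4, spatial_factor=8):
    """Summarize clip length and latent grid size for one generation call.

    `temporal_factor` and `spatial_factor` are assumed VAE compression ratios.
    """
    return {
        "seconds": num_frames / fps,
        # The first frame is encoded on its own; the rest are grouped in time.
        "latent_frames": (num_frames - 1) // temporal_factor + 1,
        "latent_height": height // spatial_factor,
        "latent_width": width // spatial_factor,
    }


# Settings taken from the inference snippets in this card.
summary = clip_summary(num_frames=81, height=512, width=768, fps=25)
print(summary)
```

Under these assumptions, the settings give a clip of a little over three seconds. Changing `num_frames` or `fps` changes the clip length accordingly.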

	Below is a comparison between the LoRA and non-LoRA outputs (under the same settings and seed):

	<table>
	<thead>
	<tr>
	<th>Full finetune</th>
	<th>LoRA</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td>
	<video width="320" height="240" controls>
	<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/original_output_0.mp4" type="video/mp4">
	Your browser does not support the video tag.
	</video>
	</td>
	<td>
	<video width="320" height="240" controls>
	<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/output_0.mp4" type="video/mp4">
	Your browser does not support the video tag.
	</video>
	</td>
	</tr>
	<tr>
	<td>
	<video width="320" height="240" controls>
	<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/original_output_1.mp4" type="video/mp4">
	Your browser does not support the video tag.
	</video>
	</td>
	<td>
	<video width="320" height="240" controls>
	<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/output_1.mp4" type="video/mp4">
	Your browser does not support the video tag.
	</video>
	</td>
	</tr>
	<tr>
	<td>
	<video width="320" height="240" controls>
	<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/original_output_2.mp4" type="video/mp4">
	Your browser does not support the video tag.
	</video>
	</td>
	<td>
	<video width="320" height="240" controls>
	<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/output_2.mp4" type="video/mp4">
	Your browser does not support the video tag.
	</video>
	</td>
	</tr>
	</tbody>
	</table>
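
The LoRA compared above was extracted from the fully finetuned checkpoint at rank 64. The linked `create_lora.py` is the authoritative script and is not reproduced here. One common way to do such an extraction is a truncated SVD of the difference between the finetuned and base weights, shown below as a sketch that assumes NumPy and a single toy weight matrix. All function and variable names are illustrative.

```python
import numpy as np


def extract_lora(w_base: np.ndarray, w_tuned: np.ndarray, rank: int):
    """Approximate (w_tuned - w_base) with a rank-`rank` product lora_b @ lora_a."""
    delta = w_tuned - w_base  # what fine-tuning changed in this layer
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    rank = min(rank, s.shape[0])
    sqrt_s = np.sqrt(s[:rank])  # split singular values evenly between the factors
    lora_b = u[:, :rank] * sqrt_s         # shape: (out_features, rank)
    lora_a = sqrt_s[:, None] * vt[:rank]  # shape: (rank, in_features)
    return lora_a, lora_b


# Toy layer: 32x48 weights whose fine-tuning update has rank 4 by construction.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((32, 48))
update = rng.standard_normal((32, 4)) @ rng.standard_normal((4, 48))
w_tuned = w_base + update

lora_a, lora_b = extract_lora(w_base, w_tuned, rank=4)
reconstructed = w_base + lora_b @ lora_a
max_error = float(np.abs(reconstructed - w_tuned).max())
print(lora_a.shape, lora_b.shape, max_error)
```

The toy update is recovered almost exactly because it was built with rank 4. A real fine-tuning update is generally not low-rank, so a rank-64 extraction is an approximation and its outputs may differ from those of the full finetune.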