LoRA text2image fine-tuning - Bhaskar009/SD_1.5_LoRA
These are LoRA adaptation weights for stable-diffusion-v1-5/stable-diffusion-v1-5. The weights were fine-tuned on the lambdalabs/naruto-blip-captions dataset. You can find some example images below.
Intended uses & limitations
How to use
import matplotlib.pyplot as plt
from diffusers import DiffusionPipeline

# Load the base model and move it to the GPU (CUDA)
pipe = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5").to("cuda")

# Load the fine-tuned LoRA weights
pipe.load_lora_weights("Bhaskar009/SD_1.5_LoRA")

# Define a Naruto-themed prompt
prompt = "A detailed anime-style portrait of Naruto Uzumaki, wearing his Hokage cloak, standing under a bright sunset, ultra-detailed, cinematic lighting, 8K"

# Generate the image
image = pipe(prompt).images[0]

# Display the image using matplotlib
plt.figure(figsize=(6, 6))
plt.imshow(image)
plt.axis("off")  # Hide axes for a clean view
plt.show()
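The influence of the LoRA weights can also be scaled at inference time through diffusers' cross_attention_kwargs. Continuing from the snippet above, a minimal sketch (the scale value 0.7 is only an example, not a recommended setting):

# Optional: scale the LoRA effect (1.0 = full effect; 0.7 is an arbitrary example)
image = pipe(prompt, cross_attention_kwargs={"scale": 0.7}).images[0]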
Limitations and bias
[TODO: provide examples of latent issues and potential remediations]
Training details - Stable Diffusion LoRA
Dataset
- The model was trained using the lambdalabs/naruto-blip-captions dataset.
- The dataset consists of Naruto character images with BLIP-generated captions.
- It provides a diverse set of characters, poses, and backgrounds, making it suitable for fine-tuning Stable Diffusion on anime-style images.
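For reference, the dataset can be inspected with the datasets library; a minimal sketch, assuming the dataset's published "image" and "text" columns:

from datasets import load_dataset

# Quick look at one training example (requires the `datasets` library)
dataset = load_dataset("lambdalabs/naruto-blip-captions", split="train")
example = dataset[0]
print(example["text"])   # BLIP-generated caption
example["image"].show()  # PIL image of a Naruto character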
Model
- Base Model: Stable Diffusion v1.5 (stable-diffusion-v1-5/stable-diffusion-v1-5)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Purpose: Specializing Stable Diffusion to generate Naruto-style anime characters.
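LoRA injects small low-rank update matrices into the attention layers instead of updating the full UNet. An illustrative configuration in the style of the diffusers text-to-image LoRA example; the rank, alpha, and target modules below are assumptions, since the card does not state them:

from peft import LoraConfig

# Illustrative LoRA setup; r and lora_alpha are assumed values
unet_lora_config = LoraConfig(
    r=4,                        # low-rank dimension (assumed)
    lora_alpha=4,               # scaling factor (assumed)
    init_lora_weights="gaussian",
    target_modules=["to_k", "to_q", "to_v", "to_out.0"],  # UNet attention projections
)

With diffusers' PEFT integration, such a config would be attached to the UNet before training via pipe.unet.add_adapter(unet_lora_config).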
Preprocessing
- Images were resized to 512x512 resolution.
- Center cropping was applied to obtain square crops without distorting the aspect ratio.
- Random flipping was used as a data augmentation technique.
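In torchvision terms, these steps correspond to a transform pipeline like the following; the normalization to [-1, 1] is an assumption, in line with the diffusers text-to-image examples:

from torchvision import transforms

# Sketch of the preprocessing above (normalization constants assumed)
train_transforms = transforms.Compose([
    transforms.Resize(512),              # resize the shorter side to 512
    transforms.CenterCrop(512),          # crop to 512x512 without distortion
    transforms.RandomHorizontalFlip(),   # augmentation: random flipping
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5]),  # map pixel values to [-1, 1]
])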
Training Configuration
- Batch Size: 1
- Gradient Accumulation Steps: 4 (simulates a larger effective batch size)
- Gradient Checkpointing: Enabled (reduces memory consumption)
- Max Training Steps: 800
- Learning Rate: 1e-5 (constant schedule, no warmup)
- Max Gradient Norm: 1 (prevents gradient explosion)
- Memory Optimization: xFormers enabled for efficient attention computation
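The card does not reproduce the training loop, but the accumulation and clipping settings above imply logic along these lines. A conceptual sketch with a stand-in model; all names here are illustrative, not the actual training code:

import torch

# Conceptual sketch only: batch size 1 with 4 accumulation steps gives an
# effective batch of 4; gradients are clipped to a max norm of 1.
model = torch.nn.Linear(8, 8)  # stand-in for the LoRA-augmented UNet
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
accum_steps = 4

batches = [torch.randn(1, 8) for _ in range(8)]  # dummy batches of size 1
for step, x in enumerate(batches):
    loss = model(x).pow(2).mean() / accum_steps  # scale loss for accumulation
    loss.backward()
    if (step + 1) % accum_steps == 0:
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        optimizer.zero_grad()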
Validation
- A validation prompt "A Naruto character" was used.
- 4 validation images were generated during training.
- Model checkpoints were saved every 500 steps.
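The validation code itself is not included in the card; the following is a hypothetical reconstruction of how such images could be produced with the prompt above (the seed value is an arbitrary assumption):

import torch
from diffusers import DiffusionPipeline

# Hypothetical reconstruction of the validation step: 4 images, fixed seed
pipe = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5").to("cuda")
pipe.load_lora_weights("Bhaskar009/SD_1.5_LoRA")
generator = torch.Generator(device="cuda").manual_seed(0)  # seed is an assumption
images = pipe("A Naruto character", num_images_per_prompt=4, generator=generator).images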
Model Output
- The fine-tuned LoRA model was saved to "sd-naruto-model".
- The model was pushed to the Hugging Face Hub repository Bhaskar009/SD_1.5_LoRA.
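Either artifact can be loaded into a pipeline. A sketch assuming the local output folder sd-naruto-model is present on disk; otherwise use the Hub repository as in the usage example above:

from diffusers import DiffusionPipeline

# Load the local training output (equivalent to the Hub copy)
pipe = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5").to("cuda")
pipe.load_lora_weights("sd-naruto-model")  # or: pipe.load_lora_weights("Bhaskar009/SD_1.5_LoRA")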