LoRA text2image fine-tuning - Bhaskar009/SD_1.5_LoRA

These are LoRA adaptation weights for stable-diffusion-v1-5/stable-diffusion-v1-5, fine-tuned on the lambdalabs/naruto-blip-captions dataset. Some example images are shown below.

[Example images generated with the fine-tuned weights: img_0, img_1, img_2, img_3]

Intended uses & limitations

How to use

import torch
import matplotlib.pyplot as plt
from diffusers import DiffusionPipeline

# Load the base model in half precision
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)

# Load the fine-tuned LoRA weights
pipe.load_lora_weights("Bhaskar009/SD_1.5_LoRA")

# Move the pipeline to the GPU once
pipe.to("cuda")

# Define a Naruto-themed prompt
prompt = "A detailed anime-style portrait of Naruto Uzumaki, wearing his Hokage cloak, standing under a bright sunset, ultra-detailed, cinematic lighting, 8K"

# Generate the image
image = pipe(prompt).images[0]

# Display the image using matplotlib
plt.figure(figsize=(6, 6))
plt.imshow(image)
plt.axis("off")  # Hide axes for a clean view
plt.show()
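For reproducible outputs, you can continue the example above by passing a seeded torch.Generator to the pipeline; the seed and output filename below are arbitrary illustrations:

# Optional: seed the generator so the same prompt yields the same image
generator = torch.Generator(device="cuda").manual_seed(42)  # example seed
image = pipe(prompt, generator=generator).images[0]
image.save("naruto_hokage.png")  # hypothetical output path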

Limitations and bias

[TODO: provide examples of latent issues and potential remediations]

Training details - Stable Diffusion LoRA

Dataset

  • The model was trained on the lambdalabs/naruto-blip-captions dataset.
  • The dataset consists of Naruto character images paired with BLIP-generated captions.
  • It covers a diverse set of characters, poses, and backgrounds, making it well suited for fine-tuning Stable Diffusion on anime-style images (see the loading sketch below).
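To inspect the training data yourself, the dataset can be loaded with the datasets library. A minimal sketch, assuming the dataset's standard "image" and "text" columns:

from datasets import load_dataset

# Load the Naruto BLIP captions dataset from the Hugging Face Hub
dataset = load_dataset("lambdalabs/naruto-blip-captions", split="train")

sample = dataset[0]
print(sample["text"])   # BLIP-generated caption
sample["image"].show()  # corresponding character image (PIL)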

Model

  • Base model: Stable Diffusion v1.5 (stable-diffusion-v1-5/stable-diffusion-v1-5)
  • Fine-tuning method: LoRA (Low-Rank Adaptation); a sketch of scaling the adapter at inference time follows below
  • Purpose: specializing Stable Diffusion to generate Naruto-style anime characters
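Because LoRA learns a low-rank update on top of the frozen base weights, its influence can be scaled at inference time. A sketch assuming a recent diffusers release, where the LoRA scale is passed via cross_attention_kwargs (check your version's LoRA documentation):

# scale=0.0 reproduces the base model; scale=1.0 applies the full LoRA effect
image = pipe(
    "A Naruto character",
    cross_attention_kwargs={"scale": 0.7},  # example value, chosen arbitrarily
).images[0]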

Preprocessing

  • Images were resized to 512x512 resolution.
  • Center cropping was applied to maintain aspect ratio.
  • Random flipping was used as a data augmentation technique (these steps are sketched below).
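A minimal sketch of this preprocessing pipeline, assuming the torchvision transforms used by diffusers' example training scripts:

import torchvision.transforms as transforms

preprocess = transforms.Compose([
    transforms.Resize(512, interpolation=transforms.InterpolationMode.BILINEAR),  # shorter side to 512
    transforms.CenterCrop(512),           # keep the central 512x512 region
    transforms.RandomHorizontalFlip(),    # augmentation: flip with p=0.5
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5]),   # map pixel values to [-1, 1]
])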

Training Configuration

  • Batch size: 1
  • Gradient accumulation steps: 4 (simulates a larger effective batch size)
  • Gradient checkpointing: enabled (reduces memory consumption)
  • Max training steps: 800
  • Learning rate: 1e-5 (constant schedule, no warmup)
  • Max gradient norm: 1 (guards against exploding gradients)
  • Memory optimization: xFormers enabled for efficient attention computation (a matching launch command is sketched below)
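Assuming the run used diffusers' example script train_text_to_image_lora.py (the card does not say so explicitly), a launch command matching this configuration would look roughly like:

accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path="stable-diffusion-v1-5/stable-diffusion-v1-5" \
  --dataset_name="lambdalabs/naruto-blip-captions" \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --max_train_steps=800 \
  --learning_rate=1e-05 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --max_grad_norm=1 \
  --enable_xformers_memory_efficient_attention \
  --checkpointing_steps=500 \
  --validation_prompt="A Naruto character" \
  --output_dir="sd-naruto-model" \
  --push_to_hub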

Validation

  • A validation prompt "A Naruto character" was used.
  • 4 validation images were generated during training (reproduced in the sketch below).
  • Model checkpoints were saved every 500 steps.
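The validation setup can be reproduced after training with the inference pipeline from the How-to-use section; num_images_per_prompt is a standard pipeline argument:

# Generate four images for the validation prompt
images = pipe("A Naruto character", num_images_per_prompt=4).images
for i, img in enumerate(images):
    img.save(f"validation_{i}.png")  # hypothetical filenames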

Model Output

  • The fine-tuned LoRA weights were saved to "sd-naruto-model".
  • The model was pushed to the Hugging Face Hub under the repository Bhaskar009/SD_1.5_LoRA (an upload sketch follows below).
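If you want to push a locally saved adapter yourself, huggingface_hub's upload_folder covers this step. A sketch assuming you are already authenticated (e.g. via huggingface-cli login):

from huggingface_hub import upload_folder

upload_folder(
    repo_id="Bhaskar009/SD_1.5_LoRA",  # target Hub repository
    folder_path="sd-naruto-model",     # local directory with the LoRA weights
)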