Guard-Against-Unsafe-Content-Siglip2 is a vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for single-label image classification. Built on the SiglipForImageClassification architecture, it is designed to detect NSFW content such as vulgarity and nudity.

The model categorizes images into two classes:

- Unsafe content: the image contains vulgarity, nudity, or other explicit material.
- Safe content: the image is free of NSFW elements.
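Before building a full demo, the quickest way to try the model is the Transformers image-classification pipeline. A minimal sketch, assuming a recent Transformers release that routes SiglipForImageClassification checkpoints through this pipeline (the image path is a placeholder):

from transformers import pipeline

# Wrap the fine-tuned checkpoint in the generic image-classification pipeline.
classifier = pipeline(
    "image-classification",
    model="prithivMLmods/Guard-Against-Unsafe-Content-Siglip2",
)

# "photo.jpg" is a placeholder path; the call returns a score for each class label.
print(classifier("photo.jpg"))

For an interactive demo, the script below installs the dependencies and serves the same model through a small Gradio app.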
!pip install -q transformers torch pillow gradio

import gradio as gr
import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification
from transformers.image_utils import load_image

# Load model and processor
model_name = "prithivMLmods/Guard-Against-Unsafe-Content-Siglip2"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

def nsfw_detection(image):
    """Predicts NSFW probability scores for an image."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    # Map each class index to its label name and rounded probability
    labels = model.config.id2label
    predictions = {labels[i]: round(probs[i], 3) for i in range(len(probs))}

    return predictions

# Create Gradio interface
iface = gr.Interface(
    fn=nsfw_detection,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(label="NSFW Content Detection"),
    title="NSFW Image Detection",
    description="Upload an image to check if it contains unsafe content such as vulgarity or nudity."
)

# Launch the app
if __name__ == "__main__":
    iface.launch()
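Outside the Gradio UI, the same prediction function can be called directly. A minimal sketch, assuming the script above has already been run so that nsfw_detection is defined (the file name is a placeholder):

import numpy as np
from PIL import Image

# Placeholder path; the function expects a numpy array, matching gr.Image(type="numpy").
image = np.array(Image.open("example.jpg").convert("RGB"))

scores = nsfw_detection(image)
print(scores)  # dict mapping each class label to its rounded probability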
Training results reported for the fine-tuning run (TrainOutput):

- global_step: 376
- training_loss: 0.1176
- train_runtime: 597.70 s
- train_samples_per_second: 20.08
- train_steps_per_second: 0.63
- total_flos: 1.005e+18
- epoch: 2.0
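The dataset and hyperparameters behind this run are not documented here. As a rough illustration only, a Trainer setup of the following shape produces a TrainOutput like the one above; the imagefolder dataset path, batch size, and learning rate are assumptions:

import torch
from datasets import load_dataset
from transformers import (
    AutoImageProcessor,
    SiglipForImageClassification,
    Trainer,
    TrainingArguments,
)

base_model = "google/siglip2-base-patch16-224"
processor = AutoImageProcessor.from_pretrained(base_model)
model = SiglipForImageClassification.from_pretrained(base_model, num_labels=2)  # new 2-class head

# Hypothetical local dataset with one folder per class.
dataset = load_dataset("imagefolder", data_dir="nsfw_dataset")

def preprocess(batch):
    # Convert PIL images into the pixel_values tensor expected by the model.
    inputs = processor(images=batch["image"], return_tensors="pt")
    inputs["labels"] = batch["label"]
    return inputs

dataset = dataset.with_transform(preprocess)

def collate_fn(examples):
    return {
        "pixel_values": torch.stack([ex["pixel_values"] for ex in examples]),
        "labels": torch.tensor([ex["labels"] for ex in examples]),
    }

args = TrainingArguments(
    output_dir="guard-against-unsafe-content-siglip2",
    num_train_epochs=2,                # matches epoch=2.0 in the reported output
    per_device_train_batch_size=32,    # assumed
    learning_rate=5e-5,                # assumed
    remove_unused_columns=False,       # keep the raw "image" column for the transform
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    data_collator=collate_fn,
)

trainer.train()  # returns a TrainOutput with metrics like those above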
The Guard-Against-Unsafe-Content-Siglip2 model is designed to detect inappropriate and explicit content in images. It helps distinguish between safe and unsafe images based on the presence of vulgarity, nudity, or other NSFW elements.
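In a moderation flow, the classifier can act as a simple gate in front of downstream processing. A minimal sketch, assuming the unsafe class can be located in model.config.id2label by name and using an illustrative 0.5 threshold:

import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

model_name = "prithivMLmods/Guard-Against-Unsafe-Content-Siglip2"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

def is_unsafe(path, threshold=0.5):
    """Return True if the predicted probability of the unsafe class exceeds the threshold."""
    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1).squeeze()
    # Look up the unsafe class index from the config instead of hard-coding it.
    unsafe_ids = [i for i, name in model.config.id2label.items() if "unsafe" in name.lower()]
    return any(probs[i].item() >= threshold for i in unsafe_ids)

# Example with a placeholder path:
# print(is_unsafe("upload.jpg"))

The threshold can be tuned on validation data to trade false negatives against false positives.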
Base model: google/siglip2-base-patch16-224