---
library_name: transformers
license: mit
language:
- en
pipeline_tag: object-detection
base_model:
- hustvl/yolos-tiny
tags:
- object-detection
- fashion
- search
---
This model is a fine-tuned version of [hustvl/yolos-tiny](https://huggingface.co./hustvl/yolos-tiny).

You can find details of the model in this GitHub repo -> [fashion-visual-search](https://github.com/yainage90/fashion-visual-search)

And you can find the companion fashion image feature extractor model -> [yainage90/fashion-image-feature-extractor](https://huggingface.co./yainage90/fashion-image-feature-extractor)

This model was trained on a combination of two datasets: [ModaNet](https://github.com/eBay/modanet) and [Fashionpedia](https://fashionpedia.github.io/home/).

The labels are `['bag', 'bottom', 'dress', 'hat', 'shoes', 'outer', 'top']`.
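If you want to confirm the label-to-index mapping the checkpoint actually uses (the exact indices are defined by the checkpoint config, not by this card), you can print it directly:

```python
from transformers import YolosForObjectDetection

model = YolosForObjectDetection.from_pretrained('yainage90/fashion-object-detection-yolos-tiny')
print(model.config.id2label)  # e.g. {0: 'bag', 1: 'bottom', ...}; order is defined by the checkpoint
```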
The best score was reached at epoch 96 of 100, with mAP 0.6974.

Example usage:
```python
from PIL import Image
import torch
from transformers import YolosImageProcessor, YolosForObjectDetection

# Pick the best available device: CUDA GPU, Apple Silicon (MPS), or CPU.
device = 'cpu'
if torch.cuda.is_available():
    device = torch.device('cuda')
elif torch.backends.mps.is_available():
    device = torch.device('mps')

ckpt = 'yainage90/fashion-object-detection-yolos-tiny'
image_processor = YolosImageProcessor.from_pretrained(ckpt)
model = YolosForObjectDetection.from_pretrained(ckpt).to(device)

image = Image.open('<path/to/image>').convert('RGB')

with torch.no_grad():
    inputs = image_processor(images=[image], return_tensors="pt")
    outputs = model(**inputs.to(device))
    # post_process_object_detection expects (height, width) per image.
    target_sizes = torch.tensor([[image.size[1], image.size[0]]])
    results = image_processor.post_process_object_detection(outputs, threshold=0.85, target_sizes=target_sizes)[0]

items = []
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    score = score.item()
    label = label.item()
    box = [i.item() for i in box]  # [xmin, ymin, xmax, ymax] in pixels
    print(f"{model.config.id2label[label]}: {round(score, 3)} at {box}")
    items.append((score, label, box))
```
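With `items` in hand, a natural next step in a visual-search pipeline is to crop each detected region and embed it with the feature extractor linked above. Here is a minimal sketch using PIL's `crop`; the filename scheme and the idea of feeding crops to the embedder are assumptions for illustration, not part of this model:

```python
# Boxes from post_process_object_detection are [xmin, ymin, xmax, ymax].
for i, (score, label, box) in enumerate(items):
    left, top, right, bottom = (round(v) for v in box)
    crop = image.crop((left, top, right, bottom))
    crop.save(f'item_{i}_{model.config.id2label[label]}.png')  # hypothetical filename scheme
```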
![sample_image](sample_image.png)