Add image-text-to-text pipeline tag
This PR updates the model card metadata to use the `image-text-to-text` pipeline tag. This tag better reflects the model's multimodal capabilities, including image captioning and visual question answering, as demonstrated in the provided examples and described in the paper. This change improves the model's discoverability on the Hub for users seeking vision-language models.
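As a rough illustration of the metadata this PR touches, the sketch below reads the top-level keys from a model card's YAML front matter and prints the pipeline tag. The helper name `read_front_matter` and the inline `card` string are hypothetical and exist only for this example. The parser is deliberately naive: it uses the standard library alone and ignores nested entries, so a real check would more likely rely on a proper YAML parser.

```python
def read_front_matter(card_text):
    """Return top-level `key: value` pairs from a model card's YAML front matter.

    Hypothetical helper for illustration; it does not handle nested YAML.
    """
    lines = card_text.splitlines()
    # Front matter must start on the very first line with a `---` delimiter.
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        # The second `---` closes the front matter block.
        if line.strip() == "---":
            break
        # Skip blank lines, indented or list entries, comments, and lines without a key.
        if not line or line[0] in " -#" or ":" not in line:
            continue
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta


# Assumed sample card, shortened from the new side of the diff in this PR.
card = """---
library_name: transformers
license: apache-2.0
pipeline_tag: image-text-to-text
tags:
- vision
---

# SigLIP 2 Giant
"""

meta = read_front_matter(card)
print(meta["pipeline_tag"])
```

Running the same helper on the old front matter would report `zero-shot-image-classification`, which makes the effect of the change easy to confirm before merging.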
README.md
CHANGED
@@ -1,19 +1,18 @@
 ---
+library_name: transformers
 license: apache-2.0
+pipeline_tag: image-text-to-text
 tags:
 - vision
 widget:
-
-
-
-  example_title: Bee
-library_name: transformers
-pipeline_tag: zero-shot-image-classification
+- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg
+  candidate_labels: bee in the sky, bee on the flower
+  example_title: Bee
 ---
 
 # SigLIP 2 Giant
 
-[SigLIP 2](https://
+[SigLIP 2](https://hf.co/papers/2502.14786) extends the pretraining objective of
 [SigLIP](https://huggingface.co/papers/2303.15343) with prior, independently developed techniques
 into a unified recipe, for improved semantic understanding, localization, and dense features.
 
@@ -99,4 +98,4 @@ Evaluation of SigLIP 2 is shown below (taken from the paper).
 primaryClass={cs.CV},
 url={https://arxiv.org/abs/2502.14786},
 }
-```
+```