Pavel Iakubovskii

qubvel-hf

AI & ML interests

Computer Vision models

Recent Activity

commented on their article 1 day ago

SigLIP 2: A better multilingual vision language encoder

liked a Space 1 day ago

ariG23498/phi4-multimodal

qubvel-hf's activity

commented on SigLIP 2: A better multilingual vision language encoder 1 day ago

Btw, I also observed that "." and a capitalized template influence the confidence quite a bit.

commented on SigLIP 2: A better multilingual vision language encoder 1 day ago

Not sure what's up, as I'm not familiar with this codebase (and have no time to dig in), but for SigLIP what you're supposed to do is compute sigmoid(zimg @ ztxt * temperature + bias).

From what you describe, I would bet the bias and/or temperature are missing?
The ground-truth reference code is https://colab.research.google.com/github/google-research/big_vision/blob/main/big_vision/configs/proj/image_text/SigLIP2_demo.ipynb

Hey @giffmana, temperature and bias are applied under the hood; see:

Siglip
https://github.com/huggingface/transformers/blob/17792556b21b4da0dbb9e4b59b39fb34aae4047c/src/transformers/models/siglip/modeling_siglip.py#L1411-L1417

Siglip2
https://github.com/huggingface/transformers/blob/17792556b21b4da0dbb9e4b59b39fb34aae4047c/src/transformers/models/siglip2/modeling_siglip2.py#L1459-L1465
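
For readers following the thread, the scoring rule quoted above, sigmoid(zimg @ ztxt * temperature + bias), can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the transformers or big_vision implementation: the function name `siglip_probs`, the explicit L2 normalization, the transpose on `ztxt`, and the example values for `temperature` and `bias` are all assumptions made for the demo (in real checkpoints both are learned parameters).

```python
import numpy as np


def siglip_probs(zimg, ztxt, temperature, bias):
    """Pairwise image-text match probabilities (hypothetical helper).

    zimg: (n_images, dim) image embeddings
    ztxt: (n_texts, dim) text embeddings
    Returns an (n_images, n_texts) matrix of independent probabilities.
    """
    # Assumption: embeddings are L2-normalized so the dot product
    # is a cosine similarity in [-1, 1].
    zimg = zimg / np.linalg.norm(zimg, axis=-1, keepdims=True)
    ztxt = ztxt / np.linalg.norm(ztxt, axis=-1, keepdims=True)

    # Scale the similarities by the temperature, then shift by the bias.
    logits = zimg @ ztxt.T * temperature + bias

    # Element-wise sigmoid: each pair is scored independently,
    # so rows do not need to sum to 1 (unlike a softmax).
    return 1.0 / (1.0 + np.exp(-logits))


# Illustrative values only; real models learn temperature and bias.
rng = np.random.default_rng(0)
zimg = rng.normal(size=(2, 8))
ztxt = rng.normal(size=(3, 8))

probs = siglip_probs(zimg, ztxt, temperature=100.0, bias=-10.0)
no_bias = siglip_probs(zimg, ztxt, temperature=100.0, bias=0.0)
print(probs.shape)  # one probability per (image, text) pair
```

Comparing `probs` with `no_bias` shows why a missing bias or temperature would change the reported confidences: with a negative bias every pair starts from a low prior, while dropping it shifts all probabilities upward.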

liked a Space 1 day ago

Phi4 Multimodal

Space demoing Phi4 MultiModal

New activity in google/siglip2-base-patch16-224 3 days ago

Error while loading processor: TypeError: expected str, bytes or os.PathLike object, not NoneType

#2 opened 7 days ago by

question about 'model_type' in config.json

#5 opened 3 days ago by

upvoted an article 3 days ago

Article

FastRTC: The Real-Time Communication Library for Python

4 days ago

• 97

liked 2 models 3 days ago

google/siglip2-so400m-patch16-naflex

Zero-Shot Image Classification • Updated 8 days ago • 3.29k • 12

google/siglip2-giant-opt-patch16-384

Zero-Shot Image Classification • Updated 8 days ago • 2.34k • 10

liked a Space 3 days ago

AI Deadlines

Generate project deadlines

liked a Space 5 days ago

Compare Siglip1 Siglip2

Compare SigLIP1 and SigLIP2 on zero shot classification

New activity in google/siglip2-base-patch16-224 5 days ago

Missing Vocab file

#4 opened 5 days ago by

upvoted an article 7 days ago

Article

SigLIP 2: A better multilingual vision language encoder

8 days ago

• 113

upvoted a paper 7 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 8 days ago • 118

upvoted a collection 7 days ago

SigLIP2

36 items • Updated 7 days ago • 51

liked a model 7 days ago

google/siglip2-base-patch16-224

Zero-Shot Image Classification • Updated 8 days ago • 8.84k • 25

updated a dataset 7 days ago

huggingface/documentation-images

Viewer • Updated about 23 hours ago • 50 • 4.7M • 52

published an article 8 days ago

Article

SigLIP 2: A better multilingual vision language encoder

8 days ago

• 113

updated 2 models 8 days ago

google/siglip2-base-patch16-naflex

Zero-Shot Image Classification • Updated 8 days ago • 1.74k • 2

google/siglip2-so400m-patch16-naflex

Zero-Shot Image Classification • Updated 8 days ago • 3.29k • 12

liked a model 10 days ago

apple/DepthPro-hf

Depth Estimation • Updated 21 days ago • 12.2k • 28