DAMO-NLP-SG/VL3-SigLIP-NaViT
Language Technology Lab at Alibaba DAMO Academy
Image Feature Extraction
Transformers
Safetensors
English
videollama3_vision_encoder
feature-extraction
visual-encoder
multi-modal-large-language-model
custom_code
arxiv: 2501.13106
arxiv: 2406.07476
arxiv: 2306.02858
License: apache-2.0
Commit History
Add model card metadata
c526594 (verified), nielsr (HF staff), committed 11 days ago
Upload model
0e04069 (verified), ClownRat, committed 13 days ago
Upload processor
592e852 (verified), ClownRat, committed 13 days ago
Initial commit
3eb707c (verified), ClownRat, committed 13 days ago