ViTamin Family - a jienengchen Collection

updated Apr 11, 2024

Designing Scalable Vision Models in the Vision-Language Era. The best-performing model is 'jienengchen/ViTamin-XL-384px'.

jienengchen/ViTamin-XL-384px

Feature Extraction • Updated Apr 8, 2024 • 202 • 19
Note: ViTamin-XL, with only 436M parameters and trained on the public DataComp-1B dataset, achieves 82.9% 🔥 zero-shot ImageNet accuracy.
jienengchen/ViTamin-L-336px

Feature Extraction • Updated Apr 8, 2024 • 107 • 4
Note: ViTamin-L, with 333M parameters, sets a new SOTA 🔥 across seven benchmarks for open-vocabulary segmentation and also significantly pushes forward the capabilities of large multi-modal models 🌋.
ViTamin: Designing Scalable Vision Models in the Vision-Language Era

Paper • 2404.02132 • Published Apr 2, 2024 • 2
jienengchen/ViTamin-XL-336px

Feature Extraction • Updated Apr 19, 2024 • 3 • 1
jienengchen/ViTamin-XL-256px

Feature Extraction • Updated May 3, 2024 • 5
jienengchen/ViTamin-L2-384px

Feature Extraction • Updated Apr 8, 2024 • 103
jienengchen/ViTamin-L2-336px

Feature Extraction • Updated Apr 8, 2024 • 102
jienengchen/ViTamin-L2-256px

Feature Extraction • Updated Apr 8, 2024 • 114
jienengchen/ViTamin-L-384px

Feature Extraction • Updated Apr 19, 2024 • 66 • 1
jienengchen/ViTamin-L-256px

Feature Extraction • Updated Apr 19, 2024 • 3
jienengchen/ViTamin-L-224px

Feature Extraction • Updated Apr 19, 2024 • 77
jienengchen/ViTamin-B-LTT

Feature Extraction • Updated Apr 8, 2024 • 103

Note: ViTamin-B-LTT achieves 70.8% zero-shot ImageNet accuracy with 88M parameters.
jienengchen/ViTamin-S-LTT

Feature Extraction • Updated Apr 8, 2024 • 102

Note: ViTamin-S-LTT achieves 63.4% zero-shot ImageNet accuracy with 22M parameters.
jienengchen/ViTamin-B

Feature Extraction • Updated Apr 8, 2024 • 102

Note: ViTamin-B achieves 68.9% zero-shot ImageNet accuracy with 88M parameters.
jienengchen/ViTamin-S

Feature Extraction • Updated Apr 8, 2024 • 101

Note: ViTamin-S achieves 62.2% zero-shot ImageNet accuracy with 22M parameters.
jienengchen/ViTamin-L2-224px

Feature Extraction • Updated Apr 19, 2024 • 4
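
The notes in this collection report parameter counts and zero-shot ImageNet accuracy for several variants. As a rough aid for comparing them, the sketch below tabulates those reported figures and computes the accuracy difference between each base model and its -LTT counterpart. The numbers are copied from the notes, not measured here, and the names `REPORTED` and `ltt_gain` are hypothetical helpers introduced only for illustration.

```python
# Figures copied from the notes in this collection (parameters in millions,
# zero-shot ImageNet accuracy in percent). Nothing here is measured.
REPORTED = {
    "ViTamin-S": (22, 62.2),
    "ViTamin-S-LTT": (22, 63.4),
    "ViTamin-B": (88, 68.9),
    "ViTamin-B-LTT": (88, 70.8),
    "ViTamin-XL-384px": (436, 82.9),
}


def ltt_gain(base_name):
    """Accuracy-point difference between the -LTT variant and its base model."""
    base_acc = REPORTED[base_name][1]
    ltt_acc = REPORTED[base_name + "-LTT"][1]
    return round(ltt_acc - base_acc, 1)


# Print the variants ordered by parameter count (smallest first).
for name, (params_m, acc) in sorted(REPORTED.items(), key=lambda kv: kv[1][0]):
    print(f"{name:<18} {params_m:>4}M  {acc:.1f}%")

print("LTT gain over ViTamin-S:", ltt_gain("ViTamin-S"), "points")
print("LTT gain over ViTamin-B:", ltt_gain("ViTamin-B"), "points")
```

By these reported figures, the -LTT variants gain 1.2 points (S) and 1.9 points (B) at the same parameter count.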