mtgv committed
Commit 08930fd
1 Parent(s): a42591d

Update README.md

Files changed (1)
  1. README.md +27 -0
README.md CHANGED
@@ -1,3 +1,30 @@
  ---
  license: apache-2.0
+ datasets:
+ - imagenet-1k
+ metrics:
+ - accuracy
+ pipeline_tag: image-classification
  ---
+ # VisionLLaMA-Base-MAE
+
+ Following the Masked Autoencoder (MAE) paradigm, the VisionLLaMA-Large-MAE model is pretrained on ImageNet-1K without labels. It shows improvements on ImageNet-1K classification tasks, both supervised fine-tuning (SFT) and linear probing.
+
+ | Model | ImageNet Acc (SFT) | ImageNet Acc (Linear Probe) |
+ | -- | -- | -- |
+ | VisionLLaMA-Large-MAE (ep800) | 85.5 | 77.3 |
+
+ # How to Use
+
+ Please refer to the [GitHub](https://github.com/Meituan-AutoML/VisionLLaMA) page for usage.
+
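+ As a convenience, below is a minimal, hypothetical sketch of pulling the pretrained checkpoint from the Hub and loading its weights with PyTorch. The repository id, checkpoint filename, and model constructor are placeholders, not the actual names; consult the "Files and versions" tab of this repo and the GitHub instructions for the real ones.
+
+ ```python
+ # Hypothetical loading sketch -- repo id, filename, and model builder are assumptions.
+ import torch
+ from huggingface_hub import hf_hub_download
+
+ REPO_ID = "mtgv/VisionLLaMA-Base-MAE"        # placeholder repo id
+ FILENAME = "visionllama_base_mae.pth"        # placeholder checkpoint filename
+
+ # Download the checkpoint file from the Hugging Face Hub and read it on CPU.
+ ckpt_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
+ ckpt = torch.load(ckpt_path, map_location="cpu")
+
+ # MAE-style checkpoints often nest the weights under a "model" key; fall back otherwise.
+ state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
+ print(sorted(state_dict.keys())[:5])         # inspect a few parameter names
+
+ # Build the network from the GitHub repo and load the weights, e.g.:
+ # model = build_visionllama()                # hypothetical constructor from the GitHub code
+ # model.load_state_dict(state_dict, strict=False)
+ ```
+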
+ # Citation
+
+ ```bibtex
+ @article{chu2024visionllama,
+   title={VisionLLaMA: A Unified LLaMA Interface for Vision Tasks},
+   author={Chu, Xiangxiang and Su, Jianlin and Zhang, Bo and Shen, Chunhua},
+   journal={arXiv preprint arXiv:2403.00522},
+   year={2024}
+ }
+ ```