mtgv/VisionLLaMA-Base-MAE


	---
	license: apache-2.0
	datasets:
	- imagenet-1k
	- ade20k
	metrics:
	- accuracy
	- mIoU
	pipeline_tag: image-classification
	---
	# VisionLLaMA-Base-MAE

	VisionLLaMA-Base-MAE is pretrained on ImageNet-1K without labels, following the Masked Autoencoder (MAE) paradigm. It shows substantial improvements on ImageNet-1K classification (SFT and linear probing) and on ADE20K segmentation.

	| Model | ImageNet Acc, SFT (%) | ImageNet Acc, Linear Probe (%) | ADE20K Segmentation (mIoU) |
	| -- | -- | -- | -- |
	| VisionLLaMA-Base-MAE (ep800) | 84.0 | 69.7 | 49.0 |
	| VisionLLaMA-Base-MAE (ep1600) | 84.3 | 71.7 | 50.2 |
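
As a rough illustration of the masked-autoencoder idea mentioned above, the sketch below shows only the random patch-masking step: a subset of image patches stays visible to the encoder, and the rest are hidden and must be reconstructed. The helper name `random_patch_mask`, the 75% mask ratio (the default reported in the original MAE paper), and the 196-patch example (a 224x224 image with 16x16 patches) are assumptions for illustration. They are not taken from the VisionLLaMA code, so please check the GitHub repository for the actual training setup.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) lists, MAE-style.

    mask_ratio=0.75 follows the original MAE paper's default; it is an
    assumption here, not a value stated in this model card.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)  # random permutation of patch positions
    num_visible = int(num_patches * (1.0 - mask_ratio))
    visible = sorted(indices[:num_visible])  # patches the encoder sees
    masked = sorted(indices[num_visible:])   # patches to be reconstructed
    return visible, masked


# Assumed example: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # 49 147
```

In an actual MAE pipeline the encoder would process only the visible patches and a lightweight decoder would predict the pixels of the masked ones; this sketch stops at index selection.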




	# How to Use

	Please refer to the [GitHub](https://github.com/Meituan-AutoML/VisionLLaMA) page for usage.