Update README.md
README.md CHANGED
@@ -18,8 +18,6 @@ Version: 1.0

[VC-1 Demo](https://github.com/facebookresearch/eai-vc/blob/main/tutorial/tutorial_vc.ipynb)

The VC-1 model is a vision transformer (ViT) pre-trained on over 4,000 hours of egocentric videos from 7 different sources, together with ImageNet. The model is trained using Masked Auto-Encoding (MAE) and is available in two sizes: ViT-B and ViT-L. The model is intended for use in EmbodiedAI tasks, such as object manipulation and indoor navigation.

-* VC-1 (ViT-L): Our best model, uses a ViT-L backbone, also known simply as `VC-1` | [Download](https://dl.fbaipublicfiles.com/eai-vc/vc1_vitl.pth)
-* VC-1-base (ViT-B): pre-trained on the same data as VC-1 but with a smaller backbone (ViT-B) | [Download](https://dl.fbaipublicfiles.com/eai-vc/vc1_vitb.pth)

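For reference, the checkpoints above are loaded through the repo's `vc_models` package. Below is a minimal sketch following the pattern in the linked tutorial notebook (it assumes `vc_models` is installed from this repo; the dummy input tensor is purely illustrative):

```python
import torch
from vc_models.models.vit import model_utils

# Load VC-1 (ViT-L); use model_utils.VC1_BASE_NAME for the ViT-B variant.
# load_model returns the model, its embedding size, the image transforms,
# and model metadata.
model, embd_size, model_transforms, model_info = model_utils.load_model(
    model_utils.VC1_LARGE_NAME
)

# Dummy batch of RGB frames (B x 3 x H x W); real inputs would come from
# an agent's camera.
img = torch.rand(1, 3, 250, 250)

# The transforms resize/normalize the input to the resolution the ViT expects.
transformed_img = model_transforms(img)

# Forward pass yields a B x embd_size embedding
# (1024 for ViT-L, 768 for ViT-B).
with torch.no_grad():
    embedding = model(transformed_img)
```
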
## Model Details