VideoMamba
State Space Model for Efficient Video Understanding
VideoMamba is a video understanding model built purely on state space models (SSMs).
VideoMamba is intended primarily for research on image and video tasks with an SSM-based backbone, e.g., image classification, action recognition, long-term video understanding, and video-text retrieval. The primary intended users are researchers and hobbyists in computer vision, machine learning, and artificial intelligence.
@misc{li2024videomamba,
title={VideoMamba: State Space Model for Efficient Video Understanding},
author={Kunchang Li and Xinhao Li and Yi Wang and Yinan He and Yali Wang and Limin Wang and Yu Qiao},
year={2024},
eprint={2403.06977},
archivePrefix={arXiv},
primaryClass={cs.CV}
}