---
license: apache-2.0
extra_gated_prompt: "You agree to not use the model to conduct experiments that cause harm to human subjects."
extra_gated_fields:
  Name: text
  Company/Organization: text
  Country: text
  E-Mail: text
---

# Model Card for InternVideo2

This modelcard aims to give the model info of 'InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding'.

## Model Details

### Model Sources

- **Repository:** [InternVideo2](https://github.com/OpenGVLab/InternVideo/tree/main/InternVideo2)
- **Paper:** [2403.15377](https://arxiv.org/abs/2403.15377)
- **Point of Contact:** mailto:[InternVideo Group](gvx-sh@pjlab.org.cn)

## Citation

If you find this work useful for your research, please consider citing InternVid. Your acknowledgement would greatly help us in continuing to contribute resources to the research community.

```

@article{wang2024internvideo2,
  title={InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding},
  author={Wang, Yi and Li, Kunchang and Li, Xinhao and Yu, Jiashuo and He, Yinan and Chen, Guo and Pei, Baoqi and Zheng, Rongkun and Xu, Jilan and Wang, Zun and others},
  journal={arXiv preprint arXiv:2403.15377},
  year={2024}
}

@article{wang2022internvideo,
  title={InternVideo: General Video Foundation Models via Generative and Discriminative Learning},
  author={Wang, Yi and Li, Kunchang and Li, Yizhuo and He, Yinan and Huang, Bingkun and Zhao, Zhiyu and Zhang, Hongjie and Xu, Jilan and Liu, Yi and Wang, Zun and Xing, Sen and Chen, Guo and Pan, Junting and Yu, Jiashuo and Wang, Yali and Wang, Limin and Qiao, Yu},
  journal={arXiv preprint arXiv:2212.03191},
  year={2022}
}
```