metadata
license: mit
library_name: transformers
pipeline_tag: image-text-to-text
Paper • Demo • LongLLaVA
Update
- [2024.09.05] LongLLaVA repo is published!
Architecture
Results
Evaluation and demo
Coming soon.
To do
- [ ] Release inference code
Citation
@misc{wang2024longllavascalingmultimodalllms,
  title={LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture},
  author={Xidong Wang and Dingjie Song and Shunian Chen and Chen Zhang and Benyou Wang},
  year={2024},
  eprint={2409.02889},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2409.02889},
}