Update README.md
README.md
CHANGED
@@ -1,3 +1,28 @@
-
-
-
+# FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions
+
+A framework designed to generate semantically rich image captions.
+
+## Resources
+
+- 💻 **Project Page**: For more details, visit the official [project page](https://rotsteinnoam.github.io/FuseCap/).
+
+- 📝 **Read the Paper**: You can find the paper [here](https://arxiv.org/abs/2305.17718).
+
+- 🚀 **Demo**: Try out our BLIP-based model [demo](https://huggingface.co/spaces/noamrot/FuseCap) trained using FuseCap, hosted on Huggingface Spaces.
+
+## Upcoming Updates
+
+The official codebase and trained models for this project will be released soon.
+
+## BibTeX
+
+``` Citation
+@misc{rotstein2023fusecap,
+title={FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions},
+author={Noam Rotstein and David Bensaid and Shaked Brody and Roy Ganz and Ron Kimmel},
+year={2023},
+eprint={2305.17718},
+archivePrefix={arXiv},
+primaryClass={cs.CV}
+}
+```