fede97 committed · Commit 4acda93 · verified · 1 Parent(s): 864bd53

Update README.md

Files changed (1): README.md (+23, −3)
README.md CHANGED
@@ -1,10 +1,26 @@
 ---
 library_name: transformers
 pipeline_tag: image-text-to-text
+license: apache-2.0
 ---
-# Model Card: Reflective LLaVA (ReflectiVA)
+# Model Card: Reflective LLaVA (ReflectiVA)

-```ReflectiVA```
+Multimodal LLMs (MLLMs) are the natural extension of large language models to handle multimodal inputs, combining text and image data.
+They have recently garnered attention due to their capability to address complex tasks involving both modalities.
+However, their effectiveness is limited to the knowledge acquired during training, which restricts their practical utility.
+In this work, we introduce a novel method to enhance the adaptability of MLLMs by integrating external knowledge sources.
+Our proposed model, Reflective LLaVA (```ReflectiVA```), utilizes reflective tokens to dynamically determine the need for external knowledge
+and to predict the relevance of information retrieved from an external database.
+Tokens are trained following a two-stage, two-model training recipe. This ultimately enables the MLLM to manage external knowledge
+while preserving fluency and performance on tasks where external knowledge is not needed.
+
+We demonstrate the efficacy of ```ReflectiVA``` for knowledge-based visual question answering, highlighting its
+superior performance compared to existing methods.
+
+
+In this model repository, you will find the Overall Model (stage two) weights of ```ReflectiVA```.
+
+For more information, visit our [ReflectiVA repository](https://github.com/aimagelab/ReflectiVA).

 ## Citation
 If you make use of our work, please cite our repo:
@@ -16,4 +32,8 @@ If you make use of our work, please cite our repo:
   journal={arXiv},
   year={2024}
 }
-```
+```
+
+## Paper page
+
+The paper can be found at https://huggingface.co/papers/2411.16863.
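The updated README describes the reflective-token loop (first decide whether a question needs external knowledge, then judge the relevance of each retrieved passage) but ships no usage snippet. Below is a minimal sketch of what that loop could look like through the standard transformers LLaVA classes. The Hub model ID `aimagelab/ReflectiVA`, the prompt template, and the token names `<RET>`, `<NORET>`, and `<REL>` are illustrative assumptions, not the repository's documented interface; the actual inference code lives in the linked ReflectiVA GitHub repository.

```python
# Minimal inference sketch, assuming ReflectiVA loads through the standard
# transformers LLaVA classes. The model ID and the reflective-token strings
# (<RET>, <NORET>, <REL>) are illustrative assumptions, not documented names.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "aimagelab/ReflectiVA"  # assumed Hub ID for the stage-two weights
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def ask(image: Image.Image, text: str) -> str:
    """Run one generation pass and return the decoded output,
    keeping special tokens so reflective tokens stay visible."""
    prompt = f"USER: <image>\n{text} ASSISTANT:"
    inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    return processor.batch_decode(out, skip_special_tokens=False)[0]

image = Image.open("landmark.jpg")
question = "Which architect designed this building?"

# Pass 1: the model emits a reflective token saying whether the question
# needs external knowledge (<RET>) or can be answered directly (<NORET>).
first = ask(image, question)

if "<RET>" in first:
    # Pass 2, once per retrieved passage: append a candidate passage from an
    # external database and keep it only if the model marks it <REL>.
    # retrieve_passages() is a hypothetical helper standing in for the
    # repository's actual retrieval pipeline.
    for passage in retrieve_passages(image, question):  # hypothetical helper
        judged = ask(image, f"{question}\nContext: {passage}")
        if "<REL>" in judged:
            print(judged)
            break
else:
    print(first)
```

The two generation passes mirror the two roles the README assigns to the reflective tokens: the first gates retrieval, the second filters retrieved passages before the final answer is produced.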