txiong23 committed
Commit
7e3c011
1 Parent(s): 14f5f83

Update README.md

Files changed (1): README.md (+11 -0)
README.md CHANGED
@@ -117,6 +117,7 @@ print(text_outputs)
 - **Mid Stage:** A mixture of 4.7M high-quality synthetic data, 1 epoch, full model
 - **Final-Image Stage:** A mixture of 3.6M single-image data, 1 epoch, full model
 - **OneVision Stage:** A mixture of 1.6M single-image/multi-image/video data, 1 epoch, full model
+- **Critic / Preference Learning Stage:** 9.4k question-image inputs from [LLaVA-RLHF](https://llava-rlhf.github.io/) with self-generated responses, reward signal from [llava-critic-72b](https://huggingface.co/lmms-lab/llava-critic-72b), iterative DPO for 3 rounds, full model
 - **Precision:** bfloat16
 
 ## Hardware & Software
@@ -131,4 +132,14 @@ print(text_outputs)
 @article{li2024llavaonevision,
   title={LLaVA-OneVision},
 }
+
+@article{xiong2024llavacritic,
+  title={LLaVA-Critic: Learning to Evaluate Multimodal Models},
+  author={Xiong, Tianyi and Wang, Xiyao and Guo, Dong and Ye, Qinghao and Fan, Haoqi and Gu, Quanquan and Huang, Heng and Li, Chunyuan},
+  year={2024},
+  eprint={2410.02712},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV},
+  url={https://arxiv.org/abs/2410.02712},
+}
 ```
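
The new **Critic / Preference Learning Stage** bullet compresses a whole pipeline into one line: sample responses from the current policy, score them with llava-critic-72b, build chosen/rejected pairs from the scores, run DPO, and repeat for 3 rounds. The sketch below shows that data flow only; `generate_responses`, `critic_score`, and `dpo_update` are hypothetical stubs standing in for the real model and trainer calls, not an API from this repo.

```python
import random

# Hypothetical stubs -- the real pipeline calls the full models;
# these only illustrate the shape of one critic-in-the-loop round.
def generate_responses(policy, question, image, n=4):
    """Sample n candidate responses from the current policy (stub)."""
    return [f"response-{i} from {policy}" for i in range(n)]

def critic_score(response, question, image):
    """Pointwise reward, as from llava-critic-72b (stubbed with a random score)."""
    return random.random()

def dpo_update(policy, preference_pairs):
    """One DPO training pass over the collected pairs (stub)."""
    return policy + "+dpo"

# Toy stand-in for the 9.4k question-image inputs from LLaVA-RLHF.
dataset = [("What is in the image?", "img_0001.jpg")]
policy = "llava-onevision"

# Iterative DPO: 3 rounds, per the Critic / Preference Learning Stage.
for round_idx in range(3):
    pairs = []
    for question, image in dataset:
        candidates = generate_responses(policy, question, image)
        ranked = sorted(candidates, key=lambda r: critic_score(r, question, image))
        # Highest-scored candidate becomes "chosen", lowest "rejected".
        pairs.append({"prompt": question, "image": image,
                      "chosen": ranked[-1], "rejected": ranked[0]})
    policy = dpo_update(policy, pairs)
```

The iterative part is the key design choice this bullet implies: because the preference labels come from critic scores rather than fixed human annotations, the pairs can be regenerated from the updated policy at the start of each round.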