Multimodal Preference Data Synthetic Alignment with Reward Model • Paper 2412.17417 • Published Dec 23, 2024