- Contrastive Preference Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 24
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 47
- Diffusion Model Alignment Using Direct Preference Optimization
  Paper • 2311.12908 • Published • 47
- RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
  Paper • 2312.00849 • Published • 8
Collections including paper arxiv:2311.12908
- InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
  Paper • 2309.03895 • Published • 13
- ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
  Paper • 2309.16650 • Published • 10
- CCEdit: Creative and Controllable Video Editing via Diffusion Models
  Paper • 2309.16496 • Published • 9
- FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
  Paper • 2310.15169 • Published • 9