


jarvis

jarvis8x7b

AI & ML interests

None yet

Organizations

None yet

Collections 1

A General Theoretical Paradigm to Understand Learning from Human Preferences

Paper • 2310.12036 • Published Oct 18, 2023 • 14
ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12, 2024 • 62
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 48

Models

None public yet

Datasets

None public yet
