trl-lib's Collections
Preference datasets
Updated 3 days ago
trl-lib/hh-rlhf-helpful-base • Updated 3 days ago • 46.2k • 63
trl-lib/lm-human-preferences-descriptiveness • Updated 3 days ago • 6.26k • 32 • 1
trl-lib/lm-human-preferences-sentiment • Updated 3 days ago • 6.26k • 29
trl-lib/rlaif-v • Updated 3 days ago • 83.1k • 143 • 3
trl-lib/tldr-preference • Updated 3 days ago • 179k • 258
trl-lib/ultrafeedback_binarized • Updated Sep 12, 2024 • 63.1k • 4.87k • 6