XueyingJia/pythia-1b-deduped-tldr-online-dpo
Tags: Transformers, TensorBoard, Safetensors, XueyingJia/online_dpo_repo, Generated from Trainer, trl, online-dpo, Inference Endpoints, arxiv:2402.04792
Commit History (main branch)
6435e70  End of training (verified; XueyingJia committed on Nov 24, 2024)
42f558f  Model save (verified; XueyingJia committed on Nov 24, 2024)
8b5b109  Training in progress, step 1875 (verified; XueyingJia committed on Nov 24, 2024)
78a6c00  Training in progress, step 1692 (verified; XueyingJia committed on Nov 24, 2024)
6a2d7c8  Training in progress, step 1504 (verified; XueyingJia committed on Nov 24, 2024)
3d9d5fa  Training in progress, step 1316 (verified; XueyingJia committed on Nov 24, 2024)
27769e0  Training in progress, step 1128 (verified; XueyingJia committed on Nov 24, 2024)
b9bcfd0  Training in progress, step 940 (verified; XueyingJia committed on Nov 24, 2024)
8bb5d8a  Training in progress, step 752 (verified; XueyingJia committed on Nov 24, 2024)
b8c64f5  Training in progress, step 564 (verified; XueyingJia committed on Nov 24, 2024)
1770fbf  Training in progress, step 376 (verified; XueyingJia committed on Nov 24, 2024)
8b3cfaf  Training in progress, step 188 (verified; XueyingJia committed on Nov 24, 2024)
e35b4b3  Training in progress, step 4378 (verified; XueyingJia committed on Nov 23, 2024)
c656470  Training in progress, step 2189 (verified; XueyingJia committed on Nov 23, 2024)
605e86c  initial commit (verified; XueyingJia committed on Nov 23, 2024)