FDeRubeis/araft_trained_dpo
PEFT
Safetensors
Generated from Trainer
arxiv:2210.03629
main
1 contributor
History: 6 commits
Latest commit: FDeRubeis, "Update Readme.md: add links to ReAct and HotpotQA papers" (f2941b0, verified), 6 months ago
Name            Size     Scan  Last commit message                                       Updated
checkpoint-108  -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-12   -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-120  -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-132  -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-144  -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-24   -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-36   -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-48   -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-60   -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-72   -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-84   -        -     Upload folder using huggingface_hub                       7 months ago
checkpoint-96   -        -     Upload folder using huggingface_hub                       7 months ago
dpo_trained     -        -     Upload folder using huggingface_hub                       7 months ago
reference       -        -     Upload folder using huggingface_hub                       7 months ago
.gitattributes  1.52 kB  Safe  initial commit                                            7 months ago
README.md       2.12 kB  Safe  Update Readme.md: add links to ReAct and HotpotQA papers  6 months ago