jiazhengli
/
Mixtral-8x7B-Instruct-v0.1-QLoRA-Assessment-Rationale-dpo
PEFT
Safetensors
jiazhengli/Rationale_MCTS
jiazhengli/Synthetic_Rationale
English
llama-factory
lora
Generated from Trainer
arxiv: 2406.19949
License: other
main
Commit History
Update README.md
8f97e33
verified
jiazhengli
committed on Oct 14, 2024
init push
0ffb2d3
Jiazheng Li
committed on Jul 6, 2024
initial commit
145c9cc
verified
jiazhengli
committed on Jul 6, 2024