yfliao/Qwen-2.5-7B-Simple-RL
Text Generation
Transformers
Safetensors
qwen2
Generated from Trainer
trl
grpo
conversational
text-generation-inference
Inference Endpoints
arxiv:2402.03300
main
Qwen-2.5-7B-Simple-RL
/
model-00004-of-00004.safetensors
Commit History
Model save
2be9aae
verified
yfliao
committed
about 8 hours ago
Model save
6672d45
verified
yfliao
committed
about 18 hours ago
Model save
af15e4e
verified
yfliao
committed
about 22 hours ago
Model save
3e91657
verified
yfliao
committed
1 day ago
Model save
2f3998e
verified
yfliao
committed
1 day ago