yfliao/Qwen-2.5-7B-Simple-RL
Tags: Text Generation, Transformers, Safetensors, qwen2, Generated from Trainer, trl, grpo, conversational, text-generation-inference, Inference Endpoints, arxiv:2402.03300
Commit History (branch: main)
2be9aae (verified) Model save, yfliao committed about 5 hours ago
6672d45 (verified) Model save, yfliao committed about 15 hours ago
af15e4e (verified) Model save, yfliao committed about 20 hours ago
3e91657 (verified) Model save, yfliao committed 1 day ago
2f3998e (verified) Model save, yfliao committed 1 day ago
fd68c88 (verified) initial commit, yfliao committed 11 days ago