VIRL-VL-Init

This model serves as an initial checkpoint for reproducing the results in the paper "SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training".

Related links

Website: https://tianzhechu.com/SFTvsRL/

GitHub: https://github.com/LeslieTrue/SFTvsRL

arXiv: https://arxiv.org/abs/2501.17161v1

HF: https://huggingface.co/papers/2501.17161
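
A minimal sketch of fetching this checkpoint locally before running the reproduction code. It assumes the `huggingface_hub` package is installed and that the repository ID is `tianzhechu/VIRL-VL-Init`, as shown on the model page. The helper name `download_checkpoint` and the default target directory are hypothetical, chosen only for illustration. How the training scripts expect the checkpoint to be laid out is not covered here, so check the GitHub repository for that.

```python
REPO_ID = "tianzhechu/VIRL-VL-Init"  # repository ID as shown on the model page


def download_checkpoint(local_dir: str = "./VIRL-VL-Init") -> str:
    """Download all files of the model repository and return the local path.

    `local_dir` is an illustrative default, not a path required by the paper's code.
    """
    # Imported lazily so this module can be imported without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    path = download_checkpoint()
    print(f"Checkpoint files are in: {path}")
```

The download is large (the weights are stored in BF16 Safetensors), so make sure the target disk has enough free space.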

Model size: 10.7B params

Tensor type: BF16

Format: Safetensors
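
As a rough worked example, the listed model size and tensor type give an estimate of the storage needed for the weights alone. This is a back-of-the-envelope sketch: it assumes the rounded figure of 10.7B parameters and 2 bytes per BF16 value, and it ignores activations, optimizer state, and any other runtime overhead. The function name `weight_footprint_bytes` is hypothetical.

```python
NUM_PARAMS = 10.7e9   # rounded parameter count from the model page
BYTES_PER_BF16 = 2    # bfloat16 stores each value in 16 bits


def weight_footprint_bytes(num_params: float, bytes_per_param: int) -> float:
    """Return the approximate size of the raw weights in bytes."""
    return num_params * bytes_per_param


total = weight_footprint_bytes(NUM_PARAMS, BYTES_PER_BF16)
print(f"{total / 1e9:.1f} GB")     # prints "21.4 GB"  (decimal gigabytes)
print(f"{total / 2**30:.1f} GiB")  # prints "19.9 GiB" (binary gibibytes)
```

Fine-tuning needs considerably more memory than this figure, since gradients and optimizer state are stored in addition to the weights.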