arxiv:2501.10799
Yuandong Tian
tydsh
AI & ML interests
Reinforcement Learning, Optimization, Representation Learning
Recent Activity
authored a paper about 14 hours ago
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
authored a paper about 2 months ago
Training Large Language Models to Reason in a Continuous Latent Space
authored a paper 7 months ago
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Organizations: None yet
Papers: 19
Models: None public yet
Datasets: None public yet