Yongxin Guo

Yongxin-Guo

https://gyxxyg.github.io/yongxinguo/

gyxxyg

AI & ML interests

None yet

Organizations

Yongxin-Guo's activity

New activity in Yongxin-Guo/TRACE 5 days ago

Upload llava-mt.json

#2 opened 5 days ago by jyliu

upvoted a paper 10 days ago

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Paper • 2501.17703 • Published 12 days ago • 51

upvoted a paper 11 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 13 days ago • 101

upvoted a paper 25 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 27 days ago • 273

liked a model about 1 month ago

deepseek-ai/DeepSeek-V3

Text Generation • Updated 17 days ago • 1.22M • 3.32k

upvoted 3 papers about 1 month ago

New activity in Yongxin-Guo/TRACE about 2 months ago

Missing ${SPLIT}.caption_coco_format.json in dense_video_caption/ActivityNet_Captions

#1 opened about 2 months ago by fghdy

updated a dataset about 2 months ago

Yongxin-Guo/TRACE

Preview • Updated 5 days ago • 140 • 3

upvoted 10 papers about 2 months ago

Parallelized Autoregressive Visual Generation

Paper • 2412.15119 • Published Dec 19, 2024 • 51

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published Dec 17, 2024 • 91

Autoregressive Video Generation without Vector Quantization

Paper • 2412.14169 • Published Dec 18, 2024 • 14

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 126

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 49

Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published Dec 19, 2024 • 73

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 345

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 91

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published Dec 12, 2024 • 90

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Paper • 2412.10302 • Published Dec 13, 2024 • 15