Pengxiang Li

pengxiang

pixeli

AI & ML interests

Video generation, Image editing, AD

Organizations

None yet

pengxiang's activity

upvoted a paper 3 days ago

UI-TARS: Pioneering Automated GUI Interaction with Native Agents

Paper • 2501.12326 • Published 4 days ago • 45

upvoted a paper 7 days ago

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 9 days ago • 65

upvoted a paper 11 days ago

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training

Paper • 2501.06842 • Published 14 days ago • 15

upvoted a paper 15 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 18 days ago • 248

upvoted 2 papers 17 days ago

InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection

Paper • 2501.04575 • Published 17 days ago • 23

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published 17 days ago • 89

upvoted a paper 23 days ago

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published 29 days ago • 81

upvoted 2 papers about 1 month ago

Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN

Paper • 2412.13795 • Published Dec 18, 2024 • 19

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 75

upvoted a paper about 2 months ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 79

upvoted 3 papers 2 months ago

DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting

Paper • 2411.17223 • Published Nov 26, 2024 • 5

MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control

Paper • 2411.13807 • Published Nov 21, 2024 • 11

EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation

Paper • 2411.08380 • Published Nov 13, 2024 • 25

upvoted a collection 2 months ago

🍃 MINT-1T

Collection

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 58

upvoted 2 papers 4 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 136

Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Paper • 2409.04109 • Published Sep 6, 2024 • 44

upvoted a paper 5 months ago

Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 123

upvoted 3 papers 6 months ago