kmorrow1's Collections
My Collection
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep
Thinking
Paper
•
2501.04519
•
Published
•
244
Learning an evolved mixture model for task-free continual learning
Paper
•
2207.05080
•
Published
•
1
EVOLvE: Evaluating and Optimizing LLMs For Exploration
Paper
•
2410.06238
•
Published
•
1
Smaller Language Models Are Better Instruction Evolvers
Paper
•
2412.11231
•
Published
•
27
VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page
Understanding and Grounding?
Paper
•
2404.05955
•
Published
An Evolved Universal Transformer Memory
Paper
•
2410.13166
•
Published
•
3
AgentGym: Evolving Large Language Model-based Agents across Diverse
Environments
Paper
•
2406.04151
•
Published
•
19
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table
Understanding
Paper
•
2401.04398
•
Published
•
23
EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific
Evaluations
Paper
•
2410.22821
•
Published
•
1
Learning Evolving Tools for Large Language Models
Paper
•
2410.06617
•
Published
•
2
PortLLM: Personalizing Evolving Large Language Models with Training-Free
and Portable Model Patches
Paper
•
2410.10870
•
Published
•
1
Generating and Evolving Reward Functions for Highway Driving with Large
Language Models
Paper
•
2406.10540
•
Published
•
1
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and
Evolution
Paper
•
2410.16256
•
Published
•
60
MUSCLE: A Model Update Strategy for Compatible LLM Evolution
Paper
•
2407.09435
•
Published
•
22
GAVEL: Generating Games Via Evolution and Language Models
Paper
•
2407.09388
•
Published
•
16
Reward Steering with Evolutionary Heuristics for Decoding-time Alignment
Paper
•
2406.15193
•
Published
•
14
Evolutionary Optimization of Model Merging Recipes
Paper
•
2403.13187
•
Published
•
51
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta
Chain-of-Thought
Paper
•
2501.04682
•
Published
•
89
BoostStep: Boosting mathematical capability of Large Language Models via
improved single-step reasoning
Paper
•
2501.03226
•
Published
•
37
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation
Models
Paper
•
2501.00316
•
Published
•
22
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Paper
•
2501.05366
•
Published
•
78
URSA: Understanding and Verifying Chain-of-thought Reasoning in
Multimodal Mathematics
Paper
•
2501.04686
•
Published
•
50
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning
and Reflection
Paper
•
2501.04575
•
Published
•
23
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper
•
2412.20993
•
Published
•
35
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via
Collective Monte Carlo Tree Search
Paper
•
2412.18319
•
Published
•
37
Token-Budget-Aware LLM Reasoning
Paper
•
2412.18547
•
Published
•
45
B-STaR: Monitoring and Balancing Exploration and Exploitation in
Self-Taught Reasoners
Paper
•
2412.17256
•
Published
•
45
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Paper
•
2411.17465
•
Published
•
79
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Paper
•
2412.04454
•
Published
•
59
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web
Tutorials
Paper
•
2412.09605
•
Published
•
28
OmniManip: Towards General Robotic Manipulation via Object-Centric
Interaction Primitives as Spatial Constraints
Paper
•
2501.03841
•
Published
•
49
Agents for self-driving laboratories applied to quantum computing
Paper
•
2412.07978
•
Published
•
1
Towards Scientific Discovery with Generative AI: Progress,
Opportunities, and Challenges
Paper
•
2412.11427
•
Published
•
1
AEGIS: An Agent-based Framework for General Bug Reproduction from Issue
Descriptions
Paper
•
2411.18015
•
Published
•
1
LLM4SR: A Survey on Large Language Models for Scientific Research
Paper
•
2501.04306
•
Published
•
33
Using Generative AI and Multi-Agents to Provide Automatic Feedback
Paper
•
2411.07407
•
Published
•
1
Designing Reliable Experiments with Generative Agent-Based Modeling: A
Comprehensive Guide Using Concordia by Google DeepMind
Paper
•
2411.07038
•
Published
•
1
Agent Laboratory: Using LLM Agents as Research Assistants
Paper
•
2501.04227
•
Published
•
80
A Multi-AI Agent System for Autonomous Optimization of Agentic AI
Solutions via Iterative Refinement and LLM-Driven Feedback Loops
Paper
•
2412.17149
•
Published
•
1
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Paper
•
2501.05707
•
Published
•
19
Enabling Scalable Oversight via Self-Evolving Critic
Paper
•
2501.05727
•
Published
•
69
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
Paper
•
2501.01895
•
Published
•
49
Understanding Self-Predictive Learning for Reinforcement Learning
Paper
•
2212.03319
•
Published
Grokfast: Accelerated Grokking by Amplifying Slow Gradients
Paper
•
2405.20233
•
Published
•
6
Paper
•
2402.09470
•
Published
•
11
Vid2Robot: End-to-end Video-conditioned Policy Learning with
Cross-Attention Transformers
Paper
•
2403.12943
•
Published
•
15
TinyFusion: Diffusion Transformers Learned Shallow
Paper
•
2412.01199
•
Published
•
14
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs
Paper
•
2501.06186
•
Published
•
59
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning
Paper
•
2406.09170
•
Published
•
26
Demystifying Domain-adaptive Post-training for Financial LLMs
Paper
•
2501.04961
•
Published
•
11
Enhancing Human-Like Responses in Large Language Models
Paper
•
2501.05032
•
Published
•
49
The Lessons of Developing Process Reward Models in Mathematical
Reasoning
Paper
•
2501.07301
•
Published
•
85
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper
•
2501.08313
•
Published
•
268
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them
Paper
•
2501.08292
•
Published
•
16
PokerBench: Training Large Language Models to become Professional Poker
Players
Paper
•
2501.08328
•
Published
•
13
Tarsier2: Advancing Large Vision-Language Models from Detailed Video
Description to Comprehensive Video Understanding
Paper
•
2501.07888
•
Published
•
13
Potential and Perils of Large Language Models as Judges of Unstructured
Textual Data
Paper
•
2501.08167
•
Published
•
6
Tensor Product Attention Is All You Need
Paper
•
2501.06425
•
Published
•
75
Transformer^2: Self-adaptive LLMs
Paper
•
2501.06252
•
Published
•
50
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Paper
•
2501.06842
•
Published
•
15
WebWalker: Benchmarking LLMs in Web Traversal
Paper
•
2501.07572
•
Published
•
19
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical
Reasoning
Paper
•
2501.06458
•
Published
•
29
Evaluating Sample Utility for Data Selection by Mimicking Model Weights
Paper
•
2501.06708
•
Published
•
5
ChemAgent: Self-updating Library in Large Language Models Improves
Chemical Reasoning
Paper
•
2501.06590
•
Published
•
8
OmniThink: Expanding Knowledge Boundaries in Machine Writing through
Thinking
Paper
•
2501.09751
•
Published
•
45
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation
Paper
•
2501.08617
•
Published
•
10
Learnings from Scaling Visual Tokenizers for Reconstruction and
Generation
Paper
•
2501.09755
•
Published
•
33
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with
Large Language Models
Paper
•
2501.09686
•
Published
•
35
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Paper
•
2501.09747
•
Published
•
22
Agent-R: Training Language Model Agents to Reflect via Iterative
Self-Training
Paper
•
2501.11425
•
Published
•
68
Demons in the Detail: On Implementing Load Balancing Loss for Training
Specialized Mixture-of-Expert Models
Paper
•
2501.11873
•
Published
•
57
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Paper
•
2501.12380
•
Published
•
71
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
Paper
•
2501.12326
•
Published
•
41
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward
Model
Paper
•
2501.12368
•
Published
•
31
Reasoning Language Models: A Blueprint
Paper
•
2501.11223
•
Published
•
19
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
Paper
•
2501.11733
•
Published
•
22
Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in
Realistic Environments
Paper
•
2501.10893
•
Published
•
20
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and
Refinement
Paper
•
2501.12273
•
Published
•
14
GPS as a Control Signal for Image Generation
Paper
•
2501.12390
•
Published
•
12
The Geometry of Tokens in Internal Representations of Large Language
Models
Paper
•
2501.10573
•
Published
•
6
Panoramic Interests: Stylistic-Content Aware Personalized Headline
Generation
Paper
•
2501.11900
•
Published
•
3
GameFactory: Creating New Games with Generative Interactive Videos
Paper
•
2501.08325
•
Published
•
57
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
Paper
•
2501.09781
•
Published
•
19
SEAL: Entangled White-box Watermarks on Low-Rank Adaptation
Paper
•
2501.09284
•
Published
•
7
Evolving Deeper LLM Thinking
Paper
•
2501.09891
•
Published
•
93
PaSa: An LLM Agent for Comprehensive Academic Paper Search
Paper
•
2501.10120
•
Published
•
36
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs)
More Self-Confident Even When They Are Wrong
Paper
•
2501.09775
•
Published
•
26
HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial
Network for High-Fidelity Speech Super-Resolution
Paper
•
2501.10045
•
Published
•
8
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling
under Long-Context Scenario
Paper
•
2501.10132
•
Published
•
7
Do generative video models learn physical principles from watching
videos?
Paper
•
2501.09038
•
Published
•
29
Exploring the Inquiry-Diagnosis Relationship with Advanced Patient
Simulators
Paper
•
2501.09484
•
Published
•
19
The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating
Large Language Models
Paper
•
2501.09653
•
Published
•
12
Towards Best Practices for Open Datasets for LLM Training
Paper
•
2501.08365
•
Published
•
47
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents
Paper
•
2501.08828
•
Published
•
28
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot
Paper
•
2501.09012
•
Published
•
10
Ouroboros-Diffusion: Exploring Consistent Content Generation in
Tuning-free Long Video Diffusion
Paper
•
2501.09019
•
Published
•
12
RepVideo: Rethinking Cross-Layer Representation for Video Generation
Paper
•
2501.08994
•
Published
•
15
XMusic: Towards a Generalized and Controllable Symbolic Music Generation
Framework
Paper
•
2501.08809
•
Published
•
10
Trusted Machine Learning Models Unlock Private Inference for Problems
Currently Infeasible with Cryptography
Paper
•
2501.08970
•
Published
•
6
Parameter-Inverted Image Pyramid Networks for Visual Perception and
Multimodal Understanding
Paper
•
2501.07783
•
Published
•
7
Beyond Sight: Finetuning Generalist Robot Policies with Heterogeneous
Sensors via Language Grounding
Paper
•
2501.04693
•
Published
•
2
MINIMA: Modality Invariant Image Matching
Paper
•
2412.19412
•
Published
•
3
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token
Marks
Paper
•
2501.08326
•
Published
•
31
A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction
Following
Paper
•
2501.08187
•
Published
•
24
Enhancing Automated Interpretability with Output-Centric Feature
Descriptions
Paper
•
2501.08319
•
Published
•
10
OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for
LLM Training
Paper
•
2501.08197
•
Published
•
7
AfriHate: A Multilingual Collection of Hate Speech and Abusive Language
Datasets for African Languages
Paper
•
2501.08284
•
Published
•
6
MatchAnything: Universal Cross-Modality Image Matching with Large-Scale
Pre-Training
Paper
•
2501.07556
•
Published
•
5