Collections
Collections including paper arxiv:2502.05664
-
PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models
Paper • 2502.01584 • Published • 9
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging
Paper • 2502.05664 • Published • 22
Craw4LLM: Efficient Web Crawling for LLM Pretraining
Paper • 2502.13347 • Published • 27
-
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise
Paper • 2501.08331 • Published • 20
MangaNinja: Line Art Colorization with Precise Reference Following
Paper • 2501.08332 • Published • 57
GameFactory: Creating New Games with Generative Interactive Videos
Paper • 2501.08325 • Published • 64
DiffuEraser: A Diffusion Model for Video Inpainting
Paper • 2501.10018 • Published • 14
-
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper • 2411.02337 • Published • 35
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Paper • 2411.04996 • Published • 51
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 66
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Paper • 2410.08815 • Published • 48
-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 27
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 40
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 53
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12
-
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models
Paper • 2404.03543 • Published • 16
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Paper • 2406.11931 • Published • 63
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
Paper • 2407.18901 • Published • 33
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
Paper • 2408.07060 • Published • 42
-
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Paper • 2412.14161 • Published • 51
Training Software Engineering Agents and Verifiers with SWE-Gym
Paper • 2412.21139 • Published • 22
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Paper • 2412.19723 • Published • 82
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
Paper • 2408.00764 • Published • 1
-
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Paper • 2402.14658 • Published • 82
meta-llama/CodeLlama-7b-Instruct-hf
Text Generation • Updated • 30.6k • 43
Qwen/Qwen2.5-Coder-32B-Instruct
Text Generation • Updated • 137k • 1.67k
huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
Text Generation • Updated • 488 • 27
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 6
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 21
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 13
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 69