Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2502.05664

CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published 20 days ago • 22

Benchmarks and challenges

PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

Paper • 2502.01584 • Published 25 days ago • 9
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published 20 days ago • 22
Craw4LLM: Efficient Web Crawling for LLM Pretraining

Paper • 2502.13347 • Published 10 days ago • 27

Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise

Paper • 2501.08331 • Published Jan 14 • 20
MangaNinja: Line Art Colorization with Precise Reference Following

Paper • 2501.08332 • Published Jan 14 • 57
GameFactory: Creating New Games with Generative Interactive Videos

Paper • 2501.08325 • Published Jan 14 • 64
DiffuEraser: A Diffusion Model for Video Inpainting

Paper • 2501.10018 • Published Jan 17 • 14

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 35
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 51
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 66
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published Oct 11, 2024 • 48

about 13 hours ago

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 27
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 40
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 53
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 12

Code Generation

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

Paper • 2404.03543 • Published Apr 4, 2024 • 16
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 63
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

Paper • 2407.18901 • Published Jul 26, 2024 • 33
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

Paper • 2408.07060 • Published Aug 13, 2024 • 42

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published Dec 18, 2024 • 51
Training Software Engineering Agents and Verifiers with SWE-Gym

Paper • 2412.21139 • Published Dec 30, 2024 • 22
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 82
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

Paper • 2408.00764 • Published Aug 1, 2024 • 1

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Paper • 2402.14658 • Published Feb 22, 2024 • 82
meta-llama/CodeLlama-7b-Instruct-hf

Text Generation • Updated Mar 14, 2024 • 30.6k • 43
Qwen/Qwen2.5-Coder-32B-Instruct

Text Generation • Updated Jan 12 • 137k • • 1.67k
huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated

Text Generation • Updated Nov 25, 2024 • 488 • 27

Large Language Model (LLM) and NLP related papers.

about 14 hours ago

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 6
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 21
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 13
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 69

AgentTuning: Enabling Generalized Agent Abilities for LLMs

Paper • 2310.12823 • Published Oct 19, 2023 • 35
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published 20 days ago • 22

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs