Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2410.18057

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Paper • 2405.08748 • Published May 14, 2024 • 24
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Paper • 2405.10300 • Published May 16, 2024 • 29
Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published May 16, 2024 • 131
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Paper • 2405.11143 • Published May 20, 2024 • 38

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

Paper • 2404.15653 • Published Apr 24, 2024 • 28
MoDE: CLIP Data Experts via Clustering

Paper • 2404.16030 • Published Apr 24, 2024 • 14
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20, 2024 • 50
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21, 2024 • 32

Papers - Fine-tuning - DPO

Refer to additional papers: https://link.springer.com/article/10.1007/s10994-014-5458-8 and https://link.springer.com/article/10.1007/BF00992696

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 53
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization

Paper • 2402.09320 • Published Feb 14, 2024 • 6
sDPO: Don't Use Your Data All at Once

Paper • 2403.19270 • Published Mar 28, 2024 • 41
Dueling RL: Reinforcement Learning with Trajectory Preferences

Paper • 2111.04850 • Published Nov 8, 2021 • 2

Papers - Fine-tuning - LoRA

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Paper • 2310.20587 • Published Oct 31, 2023 • 18
MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data

Paper • 2304.08247 • Published Apr 14, 2023 • 2
S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 32
WavLLM: Towards Robust and Adaptive Speech Large Language Model

Paper • 2404.00656 • Published Mar 31, 2024 • 11

Papers - Fine-tuning

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Paper • 2310.20587 • Published Oct 31, 2023 • 18
SELF: Language-Driven Self-Evolution for Large Language Model

Paper • 2310.00533 • Published Oct 1, 2023 • 2
QLoRA: Efficient Finetuning of Quantized LLMs

Paper • 2305.14314 • Published May 23, 2023 • 50
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models

Paper • 2309.14717 • Published Sep 26, 2023 • 44

CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation

Paper • 2401.01275 • Published Jan 2, 2024 • 1
Introducing v0.5 of the AI Safety Benchmark from MLCommons

Paper • 2404.12241 • Published Apr 18, 2024 • 11
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published May 2, 2024 • 121
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges

Paper • 2406.12624 • Published Jun 18, 2024 • 37

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 147
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20, 2024 • 13
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 56
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 47

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs