Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2502.14502

Rethinking Data Selection at Scale: Random Selection is Almost All You Need

Paper • 2410.09335 • Published Oct 12, 2024 • 17
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning

Paper • 2410.06456 • Published Oct 9, 2024 • 36
Emergent properties with repeated examples

Paper • 2410.07041 • Published Oct 9, 2024 • 8
Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9, 2024 • 70

about 11 hours ago

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 58
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 52
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 42
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 57

Cognitive Architecture

Towards Modular LLMs by Building and Reusing a Library of LoRAs

Paper • 2405.11157 • Published May 18, 2024 • 29
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

Paper • 2406.12034 • Published Jun 17, 2024 • 15
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

Paper • 2407.04051 • Published Jul 4, 2024 • 36
OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3, 2024 • 78

about 10 hours ago

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 27
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 40
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 53
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 12

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 83

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs