SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding Paper • 2408.15545 • Published Aug 28 • 32
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 111
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data? Paper • 2407.16607 • Published Jul 23 • 21
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries Paper • 2406.12824 • Published Jun 18 • 20
HARE: HumAn pRiors, a key to small language model Efficiency Paper • 2406.11410 • Published Jun 17 • 38
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack Paper • 2406.10149 • Published Jun 14 • 48
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation Paper • 2406.08392 • Published Jun 12 • 18
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning Paper • 2406.09170 • Published Jun 13 • 24
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated Jul 17 • 156
Aya 23: Open Weight Releases to Further Multilingual Progress Paper • 2405.15032 • Published May 23 • 26
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 138
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws Paper • 2401.00448 • Published Dec 31, 2023 • 27
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning Paper • 2312.15685 • Published Dec 25, 2023 • 17
LLaMA Beyond English: An Empirical Study on Language Capability Transfer Paper • 2401.01055 • Published Jan 2 • 53
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline Paper • 2311.13073 • Published Nov 22, 2023 • 56
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion Paper • 2310.03502 • Published Oct 5, 2023 • 77