INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge • Paper • 2411.19799 • Published Nov 29, 2024
Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning • Paper • 2410.10801 • Published Oct 14, 2024
To Code, or Not To Code? Exploring Impact of Code in Pre-training • Paper • 2408.10914 • Published Aug 20, 2024
LLM See, LLM Do: Guiding Data Generation to Target Non-Differentiable Objectives • Paper • 2407.01490 • Published Jul 1, 2024
The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm • Paper • 2406.18682 • Published Jun 26, 2024
M-RewardBench: Evaluating Reward Models in Multilingual Settings • Paper • 2410.15522 • Published Oct 20, 2024
Aya 23: Open Weight Releases to Further Multilingual Progress • Paper • 2405.15032 • Published May 23, 2024
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs • Paper • 2402.14740 • Published Feb 22, 2024
InPars: Data Augmentation for Information Retrieval using Large Language Models • Paper • 2202.05144 • Published Feb 10, 2022
InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval • Paper • 2301.01820 • Published Jan 4, 2023
No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval • Paper • 2206.02873 • Published Jun 6, 2022
mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset • Paper • 2108.13897 • Published Aug 31, 2021
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale • Paper • 2309.04564 • Published Sep 8, 2023
Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation • Paper • 2310.14424 • Published Oct 22, 2023
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning • Paper • 2402.06619 • Published Feb 9, 2024
Elo Uncovered: Robustness and Best Practices in Language Model Evaluation • Paper • 2311.17295 • Published Nov 29, 2023
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model • Paper • 2402.07827 • Published Feb 12, 2024