LLM as a Judge Collection • Curated resources that support the use of LLMs to serve as automatic evaluators of other LLM outputs. • 18 items • Updated 10 days ago • 21
DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks Paper • 2309.17167 • Published Sep 29, 2023 • 1
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 116
Article ⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together By burtenshaw • Apr 29 • 29
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 Paper • 2312.16171 • Published Dec 26, 2023 • 34
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 138
DRAGON Models Collection • Production-grade RAG-optimized 6-7B parameter models - "Delivering RAG on ..." the leading foundation base models • 23 items • Updated 13 days ago • 44