Collections

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 142
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 87
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?

Paper • 2305.07759 • Published May 12, 2023 • 33
Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28, 2024 • 94

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Paper • 2401.16380 • Published Jan 29, 2024 • 47
Best Practices and Lessons Learned on Synthetic Data for Language Models

Paper • 2404.07503 • Published Apr 11, 2024 • 29
WizardLM: Empowering Large Language Models to Follow Complex Instructions

Paper • 2304.12244 • Published Apr 24, 2023 • 13
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Paper • 2402.13064 • Published Feb 20, 2024 • 46

StarCoder: may the source be with you!

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

SantaCoder: don't reach for the stars!

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Attention Is All You Need

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

RoBERTa: A Robustly Optimized BERT Pretraining Approach

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Understanding LLMs: A Comprehensive Overview from Training to Inference

Learning To Teach Large Language Models Logical Reasoning

ChipNeMo: Domain-Adapted LLMs for Chip Design

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Creative Robot Tool Use with Large Language Models

CodeCoT and Beyond: Learning to Program and Test like a Developer

Lemur: Harmonizing Natural Language and Code for Language Agents

CodePlan: Repository-level Coding using LLMs and Planning

Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs

Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models

Evaluating the Robustness to Instructions of Large Language Models

ZeroGen: Efficient Zero-shot Learning via Dataset Generation

Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models

Table-GPT: Table-tuned GPT for Diverse Table Tasks

A Zero-Shot Language Agent for Computer Control with Structured Reflection

The Consensus Game: Language Model Generation via Equilibrium Search

PaLI-3 Vision Language Models: Smaller, Faster, Stronger
