Edward Beeching

edbeeching

https://edbeeching.github.io/

AI & ML interests

None yet

Recent Activity

published a Space 3 days ago

open-r1/open-r1-eval-leaderboard

updated a Space 3 days ago

open-r1/open-r1-eval-leaderboard

published a model 4 days ago

edbeeching/DeepSeek-R1-Distill-Qwen-1.5B-GRPO

Articles

Open-R1: Update #1

How NuminaMath Won the 1st AIMO Progress Prize

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Vision Language Models Explained

Constitutional AI with Open LLMs

Preference Tuning LLMs with Direct Preference Optimization Methods

Can foundation models label data like humans?

Creating a Coding Assistant with StarCoder

StackLLaMA: A hands-on guide to train LLaMA with RLHF

Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU

Train your first Decision Transformer

Introducing Decision Transformers on Hugging Face 🤗

edbeeching's activity

upvoted an article 7 months ago

Article

How NuminaMath Won the 1st AIMO Progress Prize

Jul 11, 2024

• 112

upvoted a paper about 1 year ago

A General Theoretical Paradigm to Understand Learning from Human Preferences

Paper • 2310.12036 • Published Oct 18, 2023 • 13

upvoted a collection about 1 year ago

Reward models on the hub

UNMAINTAINED: See RewardBench... A place to collect reward models, an often not released artifact of RLHF. • 18 items • Updated Apr 13, 2024 • 25

upvoted a paper about 1 year ago

Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2

Paper • 2311.10702 • Published Nov 17, 2023 • 19

upvoted 2 papers over 1 year ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 123

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Paper • 2306.01116 • Published Jun 1, 2023 • 33