🚀🧙🏼‍♂️ Introducing OpenHermesPreferences: the largest open AI feedback dataset for RLHF & DPO
> Using LLMs to improve other LLMs, at scale!
Built in collaboration with the Hugging Face H4 team, it's a dataset of 1M preferences built on top of @teknium's amazing dataset.
Dataset:
argilla/OpenHermesPreferences
The dataset is another example of open collaboration:
> The H4 team created responses with Mixtral using llm-swarm
> Argilla created responses with NousResearch Hermes-2-Yi-34B using distilabel
> The H4 team ranked these responses, plus the original response, with PairRM from AllenAI, University of Southern California, and Zhejiang University (@yuchenlin, @DongfuTingle, and colleagues)
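For a feel of what the ranking step does, here's a minimal sketch in plain Python. It is not the actual pipeline: `toy_prefers` is a hypothetical stand-in for a learned pairwise ranker such as PairRM, the policy names and output keys are illustrative, and taking the top- and bottom-ranked candidates as the chosen/rejected pair is an assumption about how the pairs are formed.

```python
from itertools import combinations
from typing import Callable, Dict, List


def rank_candidates(
    prompt: str,
    candidates: Dict[str, str],
    prefers: Callable[[str, str, str], bool],
) -> List[str]:
    """Rank candidate names by their number of pairwise wins, best first."""
    wins = {name: 0 for name in candidates}
    for a, b in combinations(candidates, 2):
        # prefers(...) returns True when the first response beats the second.
        winner = a if prefers(prompt, candidates[a], candidates[b]) else b
        wins[winner] += 1
    return sorted(candidates, key=lambda name: wins[name], reverse=True)


def toy_prefers(prompt: str, first: str, second: str) -> bool:
    # Hypothetical stand-in: prefers the longer response.
    # A real pipeline would call a pairwise reward model here.
    return len(first) >= len(second)


def to_preference_pair(
    prompt: str,
    candidates: Dict[str, str],
    prefers: Callable[[str, str, str], bool],
) -> Dict[str, str]:
    """Turn ranked candidates into one chosen/rejected record (assumed scheme)."""
    ranking = rank_candidates(prompt, candidates, prefers)
    best, worst = ranking[0], ranking[-1]
    return {
        "prompt": prompt,
        "chosen": candidates[best],
        "rejected": candidates[worst],
        "chosen_policy": best,
        "rejected_policy": worst,
    }


example = to_preference_pair(
    "What is 2 + 2?",
    {
        "original": "4",
        "mixtral": "2 + 2 equals 4.",
        "hermes": "The answer is 4.",
    },
    toy_prefers,
)
print(example["chosen_policy"], example["rejected_policy"])
```

Counting pairwise wins is just one simple way to turn pairwise judgments into a ranking; a real ranker may aggregate its comparisons differently.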
We hope this dataset will help the community's research efforts towards understanding the role of AI feedback for LLM alignment.
We're particularly excited about the ability to filter specific subsets to improve LLM skills like math or reasoning.
Here's how easy it is to filter by subset:
from datasets import load_dataset

ds = load_dataset("HuggingFaceH4/OpenHermesPreferences", split="train")
# Get the categories of the source dataset
# ['airoboros2.2', 'CamelAI', 'caseus_custom', ...]
sources = ds.unique("source")
# Filter for a subset
ds_filtered = ds.filter(lambda x: x["source"] in ["metamath", "EvolInstruct_70k"], num_proc=6)
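If you want to try the filtering logic without downloading anything, here's a tiny sketch on hand-made records. The rows are invented for illustration; only the `source` field name and the source labels come from the snippet above. The helper `keep_row` and the `KEEP` set are hypothetical names.

```python
from collections import Counter

# Invented toy rows that mimic the "source" column of the dataset.
rows = [
    {"source": "metamath", "prompt": "Solve x + 1 = 3."},
    {"source": "airoboros2.2", "prompt": "Write a limerick."},
    {"source": "EvolInstruct_70k", "prompt": "Refactor this function."},
    {"source": "CamelAI", "prompt": "Explain osmosis."},
    {"source": "metamath", "prompt": "What is 7 * 8?"},
]

# A set makes the membership test explicit and cheap.
KEEP = {"metamath", "EvolInstruct_70k"}


def keep_row(row: dict) -> bool:
    """Return True for rows whose source is in the subset we want."""
    return row["source"] in KEEP


filtered = [row for row in rows if keep_row(row)]

# Count how many rows survive per source.
counts = Counter(row["source"] for row in filtered)
print(counts)
```

A named predicate like `keep_row` could also be passed to `ds.filter(keep_row, num_proc=6)` in place of the lambda, which keeps the subset definition in one place.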
As usual, all the scripts to reproduce this work are available and open to the community!
argilla/OpenHermesPreferences
Such a fun collab between @vwxyzjn, @plaguss, @kashif, @philschmid & @lewtun!
Open Source AI FTW!