SakanaAI/DiscoPOP-zephyr-7b-gemma



chrlu committed on Jun 13, 2024

Commit 87ce9c7 · verified · 1 Parent(s): fb3f2bc

Update README.md

Files changed (1)

README.md +1 -1


@@ -15,7 +15,7 @@ model-index:
 This model is a fine-tuned version of [HuggingFaceH4/zephyr-7b-gemma-sft-v0.1](https://huggingface.co/HuggingFaceH4/zephyr-7b-gemma-sft-v0.1) on the argilla/dpo-mix-7k dataset.
-This model is from the paper ["Discovering Preference Optimization Algorithms with and for Large Language Models"](https://huggingface.co/SakanaAI/DiscoPOP-zephyr-7b-gemma/discussions)
+This model is from the paper ["Discovering Preference Optimization Algorithms with and for Large Language Models"](https://arxiv.org/abs/2406.08414)
 Read the [blog post on it here!](https://sakana.ai/)