Update README.md
README.md
CHANGED
@@ -39,6 +39,10 @@ I wasn't entirely too sure, since if nephilim v3 is anything to go by, it was pr
If you're looking for a Mistral Nemo 12B model instead, I HIGHLY recommend Mistral Nemo Gutenberg v2 by nbeerbower. It's head and shoulders above the many other Mistral Nemo finetunes I've tried (the first version of Mistral Nemo Gutenberg, romulus simpo, and magnum mini 1.1 being close second favorites).
## Why Gutenberg?
We use Gutenberg 9B, which is finetuned over SPPO, because of how good the Gutenberg DPO dataset is. It's not a German dataset like it sounds; it's a dataset based on Project Gutenberg, a public domain collection of classic fiction such as Moby Dick. The dataset uses LLM-generated responses as the rejected (negative) examples, to train models to sound less like AI, or your typical LLM, and more like actual humans (based on the creative works from Project Gutenberg). This is quality writing, hence quality data, not just random RP logs or synthetic data. When trained for 3 epochs, this dataset has been shown to increase Llama 3 average scores on the old Open LLM Leaderboard from 72 to 73 (nobody really got higher than 73 before the board revamp), and it has already proven itself again on Open LLM Leaderboard 2, raising the average score from 21.47 to 22.61 over SPPO. That's a huge improvement.

That said, I didn't like Gutenberg 9B more than the original SPPO in real use. It felt a tiny bit overfit, so we tried this merge. I did not expect much, because neph v3 turned out worse than either of its parents, but this surprisingly came out great.
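
For anyone unfamiliar with how a DPO dataset like this is laid out, here is a rough sketch of a single preference pair: the human-written passage is the preferred answer and the LLM-generated one is the rejected answer. The class, the `make_pair` helper, the field names, and the placeholder strings are all assumptions for illustration, not rows or code from the actual dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One DPO-style training example (field names are an assumption)."""
    prompt: str    # writing instruction shown to the model
    chosen: str    # human-written passage, treated as the preferred answer
    rejected: str  # LLM-generated passage, treated as the dispreferred answer


def make_pair(prompt: str, human_text: str, llm_text: str) -> PreferencePair:
    """Pair a human passage (chosen) with an LLM attempt (rejected)."""
    if not human_text.strip() or not llm_text.strip():
        raise ValueError("both passages must be non-empty")
    return PreferencePair(prompt=prompt, chosen=human_text, rejected=llm_text)


# Placeholder strings only, not real rows from the dataset.
pair = make_pair(
    prompt="Write the opening of a chapter about a sea voyage.",
    human_text="<passage from a public domain novel>",
    llm_text="<passage generated by an LLM for the same prompt>",
)
```

Training on pairs shaped like this pushes the model toward the human passage and away from the LLM one, which is the "sound less like AI" effect described above.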
## Why is it 10b??
See https://github.com/arcee-ai/mergekit/issues/390