cgato/Nemo-12b-Humanize-KTO-v0.1

Safetensors

mistral

cgato committed on Jan 19

Commit c98c332 (verified) · 1 parent: c47c3aa

Update README.md

Files changed (1)

README.md +4 -2

@@ -2,9 +2,9 @@
 license: cc-by-nc-4.0
 ---
 ## There's some kind of GGUF tokenization issue where it's not properly handling the <|im_end|> EOS token and instead breaks it into multiple smaller ones, which degrades performance for GGUF users. Need to submit an issue ticket. Unsure if this impacts other quantization formats.
-## If possible, please use the safetensors directly for best performance.
-This is the first public release of the Humanize model I've been working on. This version of the model is focused on roleplay and tends to give longer responses than the Chat version.
 ## Goals
 The goal of this model is to make something that feels different from other models out there. I wanted to accomplish this in two ways: first, by using as little synthetic data as possible, and second, by hand-judging KTO samples to avoid common AI phrases and repetition. It has seen very little synthetic data and without RL is unusable. The RL dataset was built by pulling responses from the model itself and then having a human (me) review each response.
 The KTO dataset for this model consisted of 7122 human-judged responses. Of the 7122 responses, 2398 were accepted and 4724 were rejected. To balance this, the KTO finetune was run using a 2.5x weight for desirable responses.
@@ -30,6 +30,8 @@ Note that if you tell it that it's an AI and ask it for its name, it will call i
 * GGUF currently does not tokenize the end token correctly, resulting in lower performance for GGUF users.
 * Male characters are not thoroughly represented in the RL dataset.
 ## Examples Roleplay
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64dd7cda3d6b954bf7cdd922/Ms2sV2xJUDkJKxobx7hQG.png)

 license: cc-by-nc-4.0
 ---
 ## There's some kind of GGUF tokenization issue where it's not properly handling the <|im_end|> EOS token and instead breaks it into multiple smaller ones, which degrades performance for GGUF users. Need to submit an issue ticket. Unsure if this impacts other quantization formats.
+## If possible, please use the safetensors directly for best performance. I've not yet tested whether ExLlamaV2 tokenizes correctly.
+This is the first public release of the Humanize model I've been working on. That way folks have a stable repo to download from and can expect a consistent level of performance when downloading.
 ## Goals
 The goal of this model is to make something that feels different from other models out there. I wanted to accomplish this in two ways: first, by using as little synthetic data as possible, and second, by hand-judging KTO samples to avoid common AI phrases and repetition. It has seen very little synthetic data and without RL is unusable. The RL dataset was built by pulling responses from the model itself and then having a human (me) review each response.
 The KTO dataset for this model consisted of 7122 human-judged responses. Of the 7122 responses, 2398 were accepted and 4724 were rejected. To balance this, the KTO finetune was run using a 2.5x weight for desirable responses.
 * GGUF currently does not tokenize the end token correctly, resulting in lower performance for GGUF users.
 * Male characters are not thoroughly represented in the RL dataset.
+## All samples were generated in bfloat16 using Transformers with FlashAttention-2 (FA2).
 ## Examples Roleplay
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64dd7cda3d6b954bf7cdd922/Ms2sV2xJUDkJKxobx7hQG.png)
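
The 2.5x desirable weight mentioned in the Goals section can be sanity-checked with a little arithmetic. The sketch below only uses the counts stated in the README. The undesirable weight of 1.0 is an assumption, since the README gives only the desirable weight, and the helper name `effective_ratio` is hypothetical. If the finetune used TRL's `KTOTrainer` (not confirmed by the README), these values would correspond to its `desirable_weight` and `undesirable_weight` settings.

```python
# Counts reported in the README.
n_desirable = 2398    # accepted responses
n_undesirable = 4724  # rejected responses

desirable_weight = 2.5    # stated in the README
undesirable_weight = 1.0  # assumption: the README does not state this value


def effective_ratio(n_pos, n_neg, w_pos, w_neg=1.0):
    """Weighted desirable mass divided by weighted undesirable mass."""
    return (w_pos * n_pos) / (w_neg * n_neg)


# Without reweighting, rejected samples outnumber accepted ones about 2 to 1.
raw_ratio = effective_ratio(n_desirable, n_undesirable, 1.0)

# With the 2.5x weight, accepted samples carry slightly more total weight.
weighted_ratio = effective_ratio(
    n_desirable, n_undesirable, desirable_weight, undesirable_weight
)

print(f"total samples:  {n_desirable + n_undesirable}")
print(f"raw ratio:      {raw_ratio:.3f}")
print(f"weighted ratio: {weighted_ratio:.3f}")
```

Under these assumptions the weighted ratio comes out a little above 1, so the accepted and rejected sets contribute roughly equal total weight despite the 2:1 imbalance in raw counts.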