Uploaded better trained version.

- README.md +3 -1
- adapter_model.safetensors +1 -1
- images/banner.avif +0 -0
README.md
CHANGED
@@ -10,9 +10,11 @@ pipeline_tag: text-generation
 library_name: peft
 ---
 # GPT4chan 24B QLoRA
+![](images/banner.avif)
+
 This model is [mistralai/Mistral-Small-24B-Base-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501) QLoRA fine-tuned on [v2ray/4chan](https://huggingface.co/datasets/v2ray/4chan) using [QLoRA](https://github.com/LagPixelLOL/qlora).
 
-Trained using 8x H100 with global batch size 64, using 2e-4 learning rate, for
+Trained using 8x H100 with global batch size 64, using 2e-4 learning rate, for 4000 steps, which is approximately 5 epochs.
 ## Prompt Format
 ```
 board<|start_header_id|>id<|end_header_id|>content<|start_header_id|>id<|end_header_id|>content...<|start_header_id|>id<|end_header_id|>
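For illustration, here is a minimal sketch of assembling a prompt in the format documented above. Only the board-first layout and the `<|start_header_id|>`/`<|end_header_id|>` delimiters come from the README; the board name, poster IDs, and post contents below are made-up placeholders.

```python
# Build a prompt in the documented thread format:
# board<|start_header_id|>id<|end_header_id|>content...<|start_header_id|>id<|end_header_id|>
# The trailing open header is the poster ID the model should continue as.

def build_prompt(board: str, posts: list[tuple[str, str]], next_id: str) -> str:
    """Concatenate (poster_id, content) pairs into the template,
    ending with an open header for the next poster ID."""
    prompt = board
    for poster_id, content in posts:
        prompt += f"<|start_header_id|>{poster_id}<|end_header_id|>{content}"
    prompt += f"<|start_header_id|>{next_id}<|end_header_id|>"
    return prompt

# Placeholder board and IDs, purely illustrative:
print(build_prompt("g", [("a1b2c3", "What do you think of 24B models?")], "d4e5f6"))
```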
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:763fa36edb7a6aa96ad98855ae7b89837559a0e7aad7b996512a1e72e8e91f9a
 size 3458585544
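Since the repository ships a PEFT adapter (the safetensors file above is LoRA weights, not a full checkpoint), it must be applied on top of the base model. A minimal loading sketch, assuming the standard transformers and peft APIs; the adapter repo id below is a placeholder, substitute this repository's actual path.

```python
# Sketch: apply the QLoRA adapter to the base model with peft.
# The base model id is the one named in the README diff above;
# "user/gpt4chan-24b-qlora" is a placeholder, not the real repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-Small-24B-Base-2501"
adapter_id = "user/gpt4chan-24b-qlora"  # placeholder: use the actual repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # loads adapter_model.safetensors
model.eval()
```

From here, tokenizing a prompt built as in the earlier sketch and calling `model.generate` produces a continuation of the thread.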
images/banner.avif
ADDED