v2ray committed · Commit 0c36163 · verified · 1 parent: b37ddec

Uploaded better-trained version.

Files changed (3):
  1. README.md +3 -1
  2. adapter_model.safetensors +1 -1
  3. images/banner.avif +0 -0
README.md CHANGED
@@ -10,9 +10,11 @@ pipeline_tag: text-generation
 library_name: peft
 ---
 # GPT4chan 24B QLoRA
+![GPT4chan Banner](https://huggingface.co/v2ray/GPT4chan-24B-QLoRA/resolve/main/images/banner.avif)
+
 This model is [mistralai/Mistral-Small-24B-Base-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501) QLoRA fine-tuned on [v2ray/4chan](https://huggingface.co/datasets/v2ray/4chan) using [QLoRA](https://github.com/LagPixelLOL/qlora).
 
-Trained using 8x H100 with global batch size 64, using 2e-4 learning rate, for 800 steps, which is approximately 1 epoch.
+Trained using 8x H100 with global batch size 64, using 2e-4 learning rate, for 4000 steps, which is approximately 5 epochs.
 ## Prompt Format
 ```
 board<|start_header_id|>id<|end_header_id|>content<|start_header_id|>id<|end_header_id|>content...<|start_header_id|>id<|end_header_id|>
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cb3650f8ea27501b12953d746b2507b4a76e986f907766d0e565a7a39ba2ebf2
+oid sha256:763fa36edb7a6aa96ad98855ae7b89837559a0e7aad7b996512a1e72e8e91f9a
 size 3458585544
images/banner.avif ADDED
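
For readers who want to try the prompt format documented in the README diff above, here is a minimal inference sketch. It is not taken from this repo: the adapter repo id (`v2ray/GPT4chan-24B-QLoRA`), the bfloat16/device-map loading choices, the treatment of `id` as a plain poster-identifier string, and whether the `<|start_header_id|>`/`<|end_header_id|>` markers are registered as special tokens in the adapter's tokenizer are all assumptions.

```python
# Minimal sketch, assuming the base model from the README and a hypothetical
# adapter repo id; adjust both to the actual repos before running.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "mistralai/Mistral-Small-24B-Base-2501"   # base model named in the README
ADAPTER_ID = "v2ray/GPT4chan-24B-QLoRA"             # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER_ID)  # attach the QLoRA adapter

def build_prompt(board: str, posts: list[tuple[str, str]], next_id: str) -> str:
    """Assemble `board<|start_header_id|>id<|end_header_id|>content...` and leave
    an open header so the model continues with the next post's content."""
    prompt = board
    for poster_id, content in posts:
        prompt += f"<|start_header_id|>{poster_id}<|end_header_id|>{content}"
    return prompt + f"<|start_header_id|>{next_id}<|end_header_id|>"

prompt = build_prompt("g", [("Anonymous", "Thoughts on running LLMs locally?")], "Anonymous")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False))
```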