Update README.md
Llama-3.1-8B-ArliAI-RPMax-v1.1 is a variant of the Meta-Llama-3.1-8B model.

v1.1 is just a small fix that no longer trains and saves the embeddings layer, since v1.0 unnecessarily trained the lm_head by accident.

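In PEFT terms, the difference is roughly the one sketched below. This is a minimal illustration, not the published training code: the use of Hugging Face PEFT and the choice of target modules are assumptions, but `modules_to_save` is the mechanism that would cause full copies of the embedding and lm_head weights to be trained and saved alongside the LoRA adapter.

```python
from peft import LoraConfig

# Minimal sketch of the v1.0 -> v1.1 fix. The actual RPMax training code
# is not public, so PEFT itself and the target module list are assumptions.
LLAMA_LINEAR_MODULES = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]

# v1.0 behavior: modules_to_save trains and saves full copies of the
# token embeddings and lm_head on top of the LoRA adapter weights.
config_v1_0 = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=LLAMA_LINEAR_MODULES,
    modules_to_save=["embed_tokens", "lm_head"],  # the accidental part
)

# v1.1 behavior: leave modules_to_save unset so only the LoRA adapter
# weights are trained and saved.
config_v1_1 = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=LLAMA_LINEAR_MODULES,
)
```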
You can access the model at https://arliai.com and ask questions at https://www.reddit.com/r/ArliAI/
### Training Details
* **Sequence Length**: 8192
* **Training Duration**: Approximately 1 day on 2x3090Ti
* **Epochs**: 1 epoch of training to minimize repetition sickness
* **LoRA**: rank 64, alpha 128, resulting in ~2% trainable weights
* **Learning Rate**: 0.00001
* **Gradient Accumulation**: A very low 32 for better learning (see the sketch below)

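As a rough illustration of how these numbers fit together, here is a hypothetical transformers + PEFT setup. The dataset, per-device batch size, and target modules are not stated in this card and are assumptions, though targeting all linear projections is consistent with the ~2% trainable figure.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# Sketch only: mirrors the Training Details list above; anything not in
# that list (target modules, batch size, output_dir) is an assumption.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",
    torch_dtype=torch.bfloat16,
)

lora = LoraConfig(
    r=64,            # rank 64
    lora_alpha=128,  # alpha 128
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # reports roughly 2% trainable weights

args = TrainingArguments(
    output_dir="rpmax-v1.1-lora",    # hypothetical
    num_train_epochs=1,              # 1 epoch against repetition sickness
    learning_rate=1e-5,              # 0.00001
    gradient_accumulation_steps=32,  # the low accumulation noted above
    per_device_train_batch_size=1,   # assumed; not stated in the card
    bf16=True,
)
# The 8192 sequence length would be applied when tokenizing/packing the data.
```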
## Quantization