---
license: other
datasets:
- Magpie-Align/Magpie-Qwen2.5-Pro-300K-Filtered
base_model:
- Qwen/Qwen2.5-7B-Instruct
library_name: peft
tags:
- generated_from_trainer
---
# cybertron-v4-qw7B-MGS

Introducing **cybertron-v4**, based on Qwen2.5 7B and trained with SFT over Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1.

## Training procedure

1 epoch, as usual.

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
### Training hyperparameters

The following hyperparameters were used during training:
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 128
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- num_epochs: 1
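As a quick sanity check on the multi-GPU numbers above (a sketch only: the card states the total batch size and device count, so only the product of per-device batch size and gradient accumulation steps can be inferred, not each factor):

```python
# Values taken from the hyperparameter list above.
num_devices = 8
total_train_batch_size = 128

# Under data-parallel training:
#   total batch = per-device batch * grad accumulation steps * num devices,
# so per-device batch * accumulation steps must equal:
per_device_times_accum = total_train_batch_size // num_devices
print(per_device_times_accum)  # → 16
```

Any split whose product is 16 (e.g. per-device batch 4 with 4 accumulation steps) is consistent with the card.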
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7405 | 0.0007 | 1 | 0.5760 |
| 0.6146 | 0.0502 | 71 | 0.5045 |
| 0.5908 | 0.1003 | 142 | 0.4930 |
| 0.5669 | 0.1505 | 213 | 0.4854 |
| 0.5575 | 0.2007 | 284 | 0.4811 |
| 0.535 | 0.2508 | 355 | 0.4765 |
| 0.5161 | 0.3010 | 426 | 0.4736 |
| 0.5268 | 0.3511 | 497 | 0.4726 |
| 0.5119 | 0.4013 | 568 | 0.4701 |
| 0.5329 | 0.4515 | 639 | 0.4687 |
| 0.5167 | 0.5016 | 710 | 0.4673 |
| 0.5105 | 0.5518 | 781 | 0.4660 |
| 0.5203 | 0.6020 | 852 | 0.4653 |
| 0.5035 | 0.6521 | 923 | 0.4646 |
| 0.4903 | 0.7023 | 994 | 0.4641 |
| 0.5031 | 0.7525 | 1065 | 0.4628 |
| 0.5147 | 0.8026 | 1136 | 0.4629 |
| 0.5037 | 0.8528 | 1207 | 0.4620 |
| 0.5029 | 0.9029 | 1278 | 0.4620 |
| 0.492 | 0.9531 | 1349 | 0.4621 |
### Framework versions

- PEFT 0.13.2
- Transformers 4.45.2
- Pytorch 2.3.0+cu121
- Datasets 3.0.1
- Tokenizers 0.20.1
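Since the card lists `library_name: peft`, the weights are presumably a PEFT adapter on top of the base model rather than a full checkpoint. A minimal loading sketch with the libraries versioned above, assuming the adapter is published as `fblgit/cybertron-v4-qw7B-MGS` (the repo id is inferred from the model name and not stated in this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-7B-Instruct"         # base model from the card metadata
adapter_id = "fblgit/cybertron-v4-qw7B-MGS"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the adapter weights

# Qwen2.5-Instruct models use a chat template:
messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

`PeftModel.merge_and_unload()` can be called afterwards to fold the adapter into the base weights for faster inference.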